Improving Multi-GPU Strong Scaling Through Optimization of Fine-Grained Transfers