The key question is what actually takes the time in cloning, and whether that part can be multi-threaded.
If it's the network traffic that takes the most time, where is the bottleneck?
Is it in the server software assembling what will be sent? Is it in the receiving software processing it? If so, multiple threads could help.
Is it in network bandwidth? If so, opening multiple connections won't help much.
TCP favours a few connections passing a lot of data over many connections passing a little. The one place where multiple connections can help is when you have packet loss that is not caused by congestion: a lost packet makes TCP cut the throughput of that connection, so spreading the transfer over several connections limits the damage from any single loss. (If the loss is due to congestion, this is TCP working as designed, throttling back to match the available bandwidth.) This can be a significant effect if you have a very high bandwidth, high latency connection (think multiple Gb/s on international connections), but for lower bandwidth connections it's much less of a factor. You can look at projects like bbcp
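
To put rough numbers on the loss effect, here is a minimal sketch using the well-known Mathis et al. approximation for steady-state TCP throughput, rate <= (MSS / RTT) * (C / sqrt(p)) with C about sqrt(3/2). It assumes random, non-congestion loss and Reno-style congestion control, so treat the output as an order-of-magnitude estimate only; newer congestion control algorithms behave differently. The function names, the 1460-byte MSS, the loss rate and the link capacity are all illustrative assumptions, not measurements.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate):
    """Approximate per-connection TCP throughput ceiling in bits/s.

    Mathis et al. approximation: (MSS / RTT) * (C / sqrt(p)), C = sqrt(3/2).
    Assumes random loss and Reno-style congestion control.
    """
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    c = math.sqrt(3.0 / 2.0)
    return (mss_bytes * 8 / rtt_s) * c / math.sqrt(loss_rate)


def aggregate_throughput_bps(n_connections, per_connection_bps, link_capacity_bps):
    """Crude estimate for n parallel connections: each one gets the same
    ceiling, and the total can never exceed the link capacity.
    This ignores interaction between the connections (an assumption)."""
    return min(n_connections * per_connection_bps, link_capacity_bps)


if __name__ == "__main__":
    mss = 1460            # bytes, typical Ethernet MSS (assumed)
    loss = 1e-4           # 0.01% non-congestion loss (assumed)
    link = 1_000_000_000  # 1 Gb/s link (assumed)
    for label, rtt in (("1 ms RTT (LAN)", 0.001), ("150 ms RTT (international)", 0.150)):
        one = mathis_throughput_bps(mss, rtt, loss)
        for n in (1, 8):
            total = aggregate_throughput_bps(n, one, link)
            print(f"{label}, {n} connection(s): ~{total / 1e6:.1f} Mb/s")
```

With the same loss rate, the per-connection ceiling scales inversely with round-trip time, which is why the long, fat link is the case where extra connections make a visible difference and the short link is not.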
I think it's an interesting question to look at, but before you start changing the architecture of the current code, I would suggest doing a bit more analysis of the problem to see if the bottleneck is really where you think it is.
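
As one cheap first measurement of that kind, you could compare the rate a clone actually achieved against what the link can carry. The sketch below is only an illustration: `link_utilisation` is a hypothetical helper, and the byte count, elapsed time and link capacity are numbers you would have to measure or look up yourself.

```python
def link_utilisation(bytes_transferred, elapsed_s, link_capacity_bps):
    """Fraction of the link capacity the transfer actually used.

    Close to 1.0 suggests the network is the limit, so more threads or
    connections are unlikely to help. Well below 1.0 suggests the time
    is going somewhere else (server, client, or loss on the path).
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    if link_capacity_bps <= 0:
        raise ValueError("link_capacity_bps must be positive")
    achieved_bps = bytes_transferred * 8 / elapsed_s
    return achieved_bps / link_capacity_bps


if __name__ == "__main__":
    # Made-up example: 500 MB received in 50 s over a 100 Mb/s link.
    used = link_utilisation(500_000_000, 50, 100_000_000)
    print(f"link utilisation: {used:.0%}")
```

This will not tell you which end is slow, only whether the wire is already full, but that is enough to rule the bandwidth case in or out before touching the code.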