When Netty transfers data that does not come from a FileRegion, the data is sent as a ByteBuf. If the data is large, this causes a significant performance problem because of the memory copy that happens underneath in sun.nio.ch.IOUtil.write: CPU usage reaches 100% while network throughput stays very low. This can be checked by comparing the NIO and Netty settings of spark.shuffle.blockTransferService in Spark 1.4. The NIO implementation achieves much better network bandwidth than Netty.
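To make the copy easier to see in isolation, here is a minimal JDK-only sketch. It assumes the copy in question is the one sun.nio.ch.IOUtil.write performs for non-direct (heap) buffers, which it first copies into a temporary direct buffer before the native write. The class and method names (HeapVsDirectWrite, writeFully, timeWriteNanos) are hypothetical, and it writes to a temporary file rather than a socket, so it only illustrates the code path and is not the Spark/Netty reproduction itself.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of Spark or Netty.
class HeapVsDirectWrite {

    // Writes everything remaining in buf to the channel; returns bytes written.
    public static long writeFully(FileChannel ch, ByteBuffer buf) throws IOException {
        long written = 0;
        while (buf.hasRemaining()) {
            written += ch.write(buf);
        }
        return written;
    }

    // Times a single write of `size` bytes from a heap or a direct buffer.
    // Heap buffers go through an extra copy into a temporary direct buffer
    // inside the JDK; direct buffers are handed to the native write as-is.
    public static long timeWriteNanos(boolean direct, int size) throws IOException {
        ByteBuffer buf = direct ? ByteBuffer.allocateDirect(size)
                                : ByteBuffer.allocate(size);
        Path tmp = Files.createTempFile("write-test", ".bin");
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
            long start = System.nanoTime();
            long n = writeFully(ch, buf);
            long elapsed = System.nanoTime() - start;
            if (n != size) {
                throw new IOException("short write: " + n + " of " + size);
            }
            return elapsed;
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        int size = 64 * 1024 * 1024; // 64 MiB, an arbitrary "large" payload
        System.out.println("heap   buffer: " + timeWriteNanos(false, size) + " ns");
        System.out.println("direct buffer: " + timeWriteNanos(true, size) + " ns");
    }
}
```

The timings depend on the machine and on the file system, so no particular ratio should be expected from this sketch. The point is only that the heap-buffer path does additional copying work per write that the direct-buffer path avoids.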
How to reproduce:
The root cause is described here.