Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-12007

Network library's RPC layer requires a lot of copying

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 1.6.0
    • 1.6.0
    • Spark Core
    • None

    Description

      The network library's RPC layer has an external API based on byte arrays, instead of ByteBuffer; that requires a lot of copying since the internals of the library use ByteBuffers (or rather Netty's ByteBuf), and lots of external clients also use ByteBuffer.

      The extra copies could be avoided if the API used ByteBuffer instead.

      To show an extreme case, look at an RPC send via NettyRpcEnv:

      • message is encoded using JavaSerializer, resulting in a ByteBuffer
      • the ByteBuffer is copied into a byte array of the right size, since its internal array may be larger than the actual data it holds
      • the network library's encoder copies the byte array into a ByteBuf
      • finally the data is written to the socket

      The intermediate 2 copies could be avoided if the API allowed the original ByteBuffer to be sent instead.

      Attachments

        Activity

          People

            vanzin Marcelo Masiero Vanzin
            vanzin Marcelo Masiero Vanzin
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: