Spark / SPARK-3115 Improve task broadcast latency for small tasks / SPARK-3132

Avoid serialization for Array[Byte] in TorrentBroadcast

Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels: None

    Description

      If the input data is a byte array, we should allow TorrentBroadcast to skip serializing and compressing the input.

      To do this, we should add a new parameter (shortCircuitByteArray) to TorrentBroadcast, and then skip serialization when the input is a byte array and shortCircuitByteArray is true.

      We should then also do compression in task serialization itself instead of doing it in TorrentBroadcast.
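
The proposal above can be illustrated with a minimal, language-neutral sketch in Python. It is not Spark's implementation (TorrentBroadcast is Scala, and this issue was closed as Not A Problem). The function names `to_blocks`, `from_blocks`, and `serialize_task`, the tiny block size, and the use of `pickle` and `zlib` as stand-ins for Spark's serializer and compression codec are all assumptions made for illustration; only the `shortCircuitByteArray` flag comes from the description.

```python
import pickle
import zlib

BLOCK_SIZE = 4  # hypothetical, deliberately tiny block size for illustration


def to_blocks(value, short_circuit_byte_array=False, block_size=BLOCK_SIZE):
    """Split a broadcast value into fixed-size blocks.

    If the flag is set and the value is already a byte array, it is
    chunked as-is: no serialization and no compression. Otherwise the
    value is serialized and compressed first (stand-ins: pickle, zlib).
    Returns (blocks, short_circuited).
    """
    if short_circuit_byte_array and isinstance(value, (bytes, bytearray)):
        payload = bytes(value)
        short_circuited = True
    else:
        payload = zlib.compress(pickle.dumps(value))
        short_circuited = False
    blocks = [payload[i:i + block_size]
              for i in range(0, len(payload), block_size)]
    return blocks, short_circuited


def from_blocks(blocks, short_circuited):
    """Reassemble the blocks, undoing only the work that was done."""
    payload = b"".join(blocks)
    if short_circuited:
        return payload
    return pickle.loads(zlib.decompress(payload))


def serialize_task(task):
    """Hypothetical task serializer that compresses its own output,
    so the broadcast layer does not need to compress it again."""
    return zlib.compress(pickle.dumps(task))


# Usage: the task is serialized and compressed once, then broadcast raw.
task_bytes = serialize_task({"stage": 1, "partition": 0})
blocks, short_circuited = to_blocks(task_bytes, short_circuit_byte_array=True)
restored = from_blocks(blocks, short_circuited)
```

In this sketch the receiver has to know whether the short circuit was taken, which is why `to_blocks` returns the flag alongside the blocks. How that information would travel in a real implementation is left open here.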

            People

              Assignee: Unassigned
              Reporter: Reynold Xin (rxin)
              Votes: 0
              Watchers: 4

              Dates

                Created:
                Updated:
                Resolved: