SPARK-24296: Support replicating blocks larger than 2 GB
Parent: SPARK-6235 Address various 2G limits


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 2.4.0
    • Component/s: Block Manager, Spark Core
    • Labels: None

    Description

      Block replication currently sends the entire block data in one frame. This causes a failure on the receiving end for blocks larger than 2 GB.

      We should change block replication to send the block data as a stream when the block is large (building on the network changes in SPARK-6237). It can use the conf spark.maxRemoteBlockSizeFetchToMem to decide when to replicate as a stream, the same as we do for fetching shuffle blocks and fetching remote RDD blocks.
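
      Below is a minimal sketch (not part of the issue) of a job that exercises this path. The threshold value, app name, and object name are illustrative assumptions; spark.maxRemoteBlockSizeFetchToMem and the SparkConf/StorageLevel APIs are standard Spark. A 2x-replicated storage level makes the block manager push each cached partition to a peer executor, which is where the streaming replication proposed here would apply.

      {code:scala}
      // Illustrative only: a job that triggers block replication of a cached partition.
      import org.apache.spark.{SparkConf, SparkContext}
      import org.apache.spark.storage.StorageLevel

      object ReplicateLargeBlockExample {
        def main(args: Array[String]): Unit = {
          val conf = new SparkConf()
            .setAppName("replicate-large-block")
            // Remote blocks above this threshold are streamed (fetched to disk)
            // rather than materialized in a single in-memory frame; the value
            // here is an example, not a recommendation.
            .set("spark.maxRemoteBlockSizeFetchToMem", "200m")

          val sc = new SparkContext(conf)

          // A replication factor of 2 forces the block manager to replicate
          // each cached partition to a second executor.
          val rdd = sc.parallelize(1 to 1000000, numSlices = 1)
            .persist(StorageLevel.MEMORY_AND_DISK_2)

          println(rdd.count())
          sc.stop()
        }
      }
      {code}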


            People

              Assignee: Imran Rashid (irashid)
              Reporter: Imran Rashid (irashid)
              Votes: 0
              Watchers: 10
