Spark / SPARK-6235 Address various 2G limits / SPARK-24296

Support replicating blocks larger than 2 GB


    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 2.4.0
    • Component/s: Block Manager, Spark Core
    • Labels: None

      Description

      Block replication currently sends the entire block data in one frame. This causes a failure on the receiving end for blocks larger than 2 GB.

      We should change block replication to send the block data as a stream when the block is large (building on the network changes in SPARK-6237). This can use the conf spark.maxRemoteBlockSizeFetchToMem to decide when to replicate as a stream, the same as we do for fetching shuffle blocks and fetching remote RDD blocks.
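      The decision described above can be sketched as a simple size check against the configured threshold. This is an illustrative sketch only; the method and class names below (shouldStream, ReplicationModeSketch) are hypothetical and do not reflect Spark's actual internal API:

      ```java
      // Hypothetical sketch: choose stream-based replication for large blocks,
      // mirroring how spark.maxRemoteBlockSizeFetchToMem gates shuffle-block
      // fetches. Names here are illustrative, not Spark internals.
      public class ReplicationModeSketch {

          // Blocks at or above the threshold are replicated as a stream of
          // chunks; smaller blocks can still be sent in a single frame.
          static boolean shouldStream(long blockSizeBytes, long maxRemoteBlockSizeFetchToMem) {
              return blockSizeBytes >= maxRemoteBlockSizeFetchToMem;
          }

          public static void main(String[] args) {
              long threshold = 200L * 1024 * 1024; // e.g. 200m, well under the 2 GB frame limit

              // A 3 GB block exceeds the threshold, so it would be streamed.
              System.out.println(shouldStream(3L * 1024 * 1024 * 1024, threshold)); // prints true

              // A 1 KB block stays on the single-frame path.
              System.out.println(shouldStream(1024, threshold)); // prints false
          }
      }
      ```

      Keying the decision off the existing conf (rather than a new one) keeps the streaming cutoff consistent across replication, shuffle fetches, and remote RDD block fetches.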

    People

    • Assignee: irashid Imran Rashid
    • Reporter: irashid Imran Rashid
    Dates

    • Created:
    • Updated:
    • Resolved:
