SPARK-1476: 2GB limit in Spark for blocks


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels: None
    • Environment: all

    Description

      The underlying abstraction for blocks in Spark is a ByteBuffer, which limits the size of a block to 2GB.
      This has implications not just for managed blocks in use, but also for shuffle blocks (memory-mapped blocks are limited to 2GB even though the API accepts a long size), serialization/deserialization via byte-array-backed output streams (SPARK-1391), etc.

      This is a severe limitation when using Spark on non-trivial datasets.
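
      As a minimal sketch (not Spark code; the file path is a placeholder), the snippet below shows where the ceiling comes from: java.nio.ByteBuffer.allocate only accepts an Int capacity, and FileChannel.map takes a long size yet rejects anything above Integer.MAX_VALUE because the result must be a MappedByteBuffer.

      import java.io.RandomAccessFile
      import java.nio.channels.FileChannel

      object TwoGbLimitDemo {
        def main(args: Array[String]): Unit = {
          // ByteBuffer.allocate(capacity: Int) takes an Int, so a single
          // buffer-backed block can never exceed Integer.MAX_VALUE (~2GB) bytes.

          // Placeholder path: any readable file triggers the same check.
          val channel = new RandomAccessFile("/tmp/some_block_file", "r").getChannel
          val size: Long = Integer.MAX_VALUE.toLong + 1 // just over 2GB

          try {
            // map() declares size as a long, but the mapping is backed by a
            // MappedByteBuffer, so sizes above Integer.MAX_VALUE are rejected.
            channel.map(FileChannel.MapMode.READ_ONLY, 0, size)
          } catch {
            case e: IllegalArgumentException =>
              println(s"mapping above 2GB rejected: ${e.getMessage}")
          } finally {
            channel.close()
          }
        }
      }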

      Attachments

        1. 2g_fix_proposal.pdf (76 kB, uploaded by Mridul Muralidharan)


            People

              Assignee: Unassigned
              Reporter: Mridul Muralidharan (mridulm80)
              Votes: 16
              Watchers: 56
