Details
- Type: Improvement
- Status: Closed
- Priority: Critical
- Resolution: Duplicate
Description
The underlying abstraction for blocks in Spark is a ByteBuffer, which limits the size of a block to 2GB.
This has implications not only for managed blocks in use, but also for shuffle blocks (memory-mapped blocks are limited to 2GB even though the API accepts a long size), serialization/deserialization via byte-array-backed output streams (SPARK-1391), and so on.
This is a severe limitation when Spark is used on non-trivial datasets.
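The ceiling described above comes from JVM-level APIs rather than from Spark itself. Below is a minimal Scala sketch (not Spark code; the object name and the /tmp path are illustrative assumptions) showing the three places the 2GB limit surfaces: Int-sized ByteBuffer capacities, FileChannel.map rejecting regions larger than Integer.MAX_VALUE even though its size parameter is a long, and byte-array-backed output streams used for ser-deser.
{code:scala}
import java.io.{ByteArrayOutputStream, RandomAccessFile}
import java.nio.channels.FileChannel.MapMode

// Illustrative sketch only: the object name and the /tmp path are not part of Spark.
object TwoGBLimitSketch {
  def main(args: Array[String]): Unit = {
    val threeGB: Long = 3L * 1024 * 1024 * 1024

    // 1. Heap buffers: ByteBuffer.allocate(capacity: Int) cannot express a >2GB
    //    block; the size overflows when narrowed to Int.
    println(s"3GB narrowed to an Int capacity: ${threeGB.toInt}") // prints a negative number

    // 2. Memory-mapped blocks: FileChannel.map takes a Long size but still returns
    //    a MappedByteBuffer, so it throws IllegalArgumentException past Integer.MAX_VALUE.
    val file = new RandomAccessFile("/tmp/example-block", "rw")
    try {
      file.getChannel.map(MapMode.READ_WRITE, 0, threeGB)
    } catch {
      case e: IllegalArgumentException =>
        println(s"map() rejected a $threeGB-byte region: ${e.getMessage}")
    } finally {
      file.close()
    }

    // 3. Serialization to a byte-array-backed stream (cf. SPARK-1391): the stream
    //    grows a single byte[], whose length is an Int, so output is capped near 2GB.
    val out = new ByteArrayOutputStream()
    out.write(Array.fill(16)(0.toByte)) // fine for small payloads; a >2GB payload cannot fit
    println(s"ByteArrayOutputStream is limited to roughly ${Integer.MAX_VALUE} bytes")
  }
}
{code}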
Attachments
Issue Links
- breaks
  - SPARK-1353 IllegalArgumentException when writing to disk (Resolved)
- depends upon
  - SPARK-1391 BlockManager cannot transfer blocks larger than 2G in size (Closed)
- is duplicated by
  - SPARK-1353 IllegalArgumentException when writing to disk (Resolved)
- relates to
  - SPARK-6190 create LargeByteBuffer abstraction for eliminating 2GB limit on blocks (Resolved)
  - SPARK-3151 DiskStore attempts to map any size BlockId without checking MappedByteBuffer limit (Resolved)
- requires
  - SPARK-5928 Remote Shuffle Blocks cannot be more than 2 GB (Resolved)
  - SPARK-1391 BlockManager cannot transfer blocks larger than 2G in size (Closed)