[SPARK-1476] 2GB limit in Spark for blocks

Details

Type: Improvement
Status: Closed
Priority: Critical
Resolution: Duplicate
Affects Version/s: None
Fix Version/s: None
Component/s: Spark Core
Labels: None
Environment: all

Description

The underlying abstraction for blocks in Spark is a ByteBuffer, which limits the size of a block to 2GB.
This has implications not just for managed blocks in use, but also for shuffle blocks (memory-mapped blocks are limited to 2GB, even though the API allows for a long), ser/deser via byte-array-backed output streams (~~SPARK-1391~~), etc.

This is a severe limitation when Spark is used on non-trivial datasets.
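
To make the limit concrete, here is a minimal sketch in plain Java (not Spark code). It relies only on the fact that `java.nio.ByteBuffer` capacities are `int` values, so a single buffer holds at most `Integer.MAX_VALUE` bytes; `FileChannel.map` likewise accepts a `long` size but rejects anything above `Integer.MAX_VALUE`. The class `BlockSizeLimit` and its helpers `fitsInSingleBuffer` and `chunksNeeded` are hypothetical names used for illustration only, and splitting a block into chunks is just one possible workaround, not necessarily the approach in the attached proposal.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration class; not part of Spark.
final class BlockSizeLimit {
    // ByteBuffer capacities are ints, so one buffer holds at most this many bytes (~2GB).
    static final long MAX_SINGLE_BUFFER_BYTES = Integer.MAX_VALUE;

    // True if a block of the given size could be held in a single ByteBuffer.
    static boolean fitsInSingleBuffer(long blockSize) {
        return blockSize >= 0 && blockSize <= MAX_SINGLE_BUFFER_BYTES;
    }

    // Assumed workaround: number of fixed-size chunks needed to cover a block
    // (ceiling division, done in long arithmetic).
    static long chunksNeeded(long blockSize, int chunkSize) {
        if (blockSize < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("blockSize must be >= 0 and chunkSize > 0");
        }
        return (blockSize + chunkSize - 1) / chunkSize;
    }

    public static void main(String[] args) {
        // allocate(int) only accepts an int capacity; a long size cannot be passed.
        ByteBuffer small = ByteBuffer.allocate(16);
        System.out.println("capacity of small buffer: " + small.capacity());

        long threeGiB = 3L * 1024 * 1024 * 1024;
        System.out.println("3 GiB fits in one buffer: " + fitsInSingleBuffer(threeGiB));
        System.out.println("1 GiB chunks needed: " + chunksNeeded(threeGiB, 1 << 30));
    }
}
```

Any block whose size exceeds `Integer.MAX_VALUE` bytes fails the `fitsInSingleBuffer` check, which is the situation this issue describes for managed blocks, memory-mapped shuffle blocks, and byte-array-backed streams.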

Attachments

2g_fix_proposal.pdf (76 kB, 24/Jun/14 21:10, Mridul Muralidharan)

Issue Links

breaks

- SPARK-1353 IllegalArgumentException when writing to disk (Resolved)

depends upon

- SPARK-1391 BlockManager cannot transfer blocks larger than 2G in size (Closed)

is duplicated by

- SPARK-1353 IllegalArgumentException when writing to disk (Resolved)

relates to

- SPARK-6190 create LargeByteBuffer abstraction for eliminating 2GB limit on blocks (Resolved)
- SPARK-3151 DiskStore attempts to map any size BlockId without checking MappedByteBuffer limit (Resolved)

requires

- SPARK-5928 Remote Shuffle Blocks cannot be more than 2 GB (Resolved)
- SPARK-1391 BlockManager cannot transfer blocks larger than 2G in size (Closed)

People

Assignee: Unassigned
Reporter: Mridul Muralidharan
Votes: 16
Watchers: 56

Dates

Created: 12/Apr/14 06:29
Updated: 18/Dec/19 02:38
Resolved: 26/Jun/15 17:15