Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
Map-side combine in the groupByKey case does not reduce the amount of data shuffled, because grouping keeps every value. Instead, it forces many more objects into the old generation, which leads to worse GC behavior.
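The point about shuffle volume can be illustrated with a minimal sketch in plain Python. This is not Spark code: the helper names `map_side_combine` and `shuffled_values` and the sample partitions are hypothetical, chosen only to count how many values would cross the shuffle when the map-side combiner groups values versus when it sums them.

```python
def map_side_combine(partition, create, merge):
    """Combine values per key within one partition before the shuffle."""
    combined = {}
    for key, value in partition:
        if key in combined:
            combined[key] = merge(combined[key], value)
        else:
            combined[key] = create(value)
    return combined


def shuffled_values(combined_partitions):
    """Count the individual values that would cross the shuffle."""
    total = 0
    for combined in combined_partitions:
        for payload in combined.values():
            # A grouped payload is a list of values; a reduced payload is one value.
            total += len(payload) if isinstance(payload, list) else 1
    return total


# Two hypothetical map partitions of (key, value) records.
partitions = [
    [("a", 1), ("b", 2), ("a", 3), ("a", 4)],
    [("a", 5), ("b", 6), ("b", 7), ("c", 8)],
]

# Grouping: the combiner is a list, so every input value is kept.
grouped = [
    map_side_combine(p, lambda v: [v], lambda acc, v: acc + [v])
    for p in partitions
]

# Summing: the combiner collapses to a single number per key per partition.
summed = [
    map_side_combine(p, lambda v: v, lambda acc, v: acc + v)
    for p in partitions
]

input_values = sum(len(p) for p in partitions)
grouped_values = shuffled_values(grouped)
summed_values = shuffled_values(summed)

print(input_values, grouped_values, summed_values)
```

In this sketch the grouped case ships as many values as the input contains, while the summed case ships one value per key per partition. The grouped case also builds a per-key list on the map side, which is the kind of extra, longer-lived allocation the description associates with old-generation pressure.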
Issue Links
- is related to
  - SPARK-774 cogroup should also disable map side combine by default (Resolved)
  - SPARK-31948 expose mapSideCombine in aggByKey/reduceByKey/foldByKey (Resolved)