Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
Intermediate data addressable by key supports not only restartable streams, but also partitioning after the map output is written. By sampling the map output, a job can implement a total order and tune the number of reduces around skew. It is possible to implement something similar for non-memcmp types, but it is significantly more complex.
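As a rough illustration of the idea, and not of any existing Hadoop API, the sketch below picks partition boundaries from a sample of already-written map output keys and then routes each key by binary search. It assumes keys are raw byte strings ordered the way memcmp orders them, which is how Python compares `bytes`. The names `choose_split_points` and `partition_for` are hypothetical, and how the sample is drawn is left out.

```python
import bisect


def choose_split_points(sampled_keys, num_reduces):
    """Pick at most num_reduces - 1 boundaries from sampled map output keys.

    Hypothetical helper: keys are bytes, compared lexicographically as
    unsigned bytes (the same order memcmp gives).
    """
    if num_reduces < 1:
        raise ValueError("num_reduces must be at least 1")
    ordered = sorted(sampled_keys)
    if not ordered:
        return []
    splits = []
    for i in range(1, num_reduces):
        candidate = ordered[i * len(ordered) // num_reduces]
        # A heavily repeated key can occupy several quantiles. Keeping it
        # only once collapses duplicate boundaries, so a skewed sample
        # yields fewer partitions than requested.
        if not splits or candidate > splits[-1]:
            splits.append(candidate)
    return splits


def partition_for(key, splits):
    """Return the partition index for key; partitions are ordered by key range."""
    return bisect.bisect_right(splits, key)


if __name__ == "__main__":
    sample = [b"apple", b"kiwi", b"fig", b"pear", b"banana", b"lime"]
    splits = choose_split_points(sample, 3)
    for key in sorted(sample):
        print(key, partition_for(key, splits))

    # A skewed sample: one key dominates, so the boundaries collapse.
    skewed = [b"a"] * 8 + [b"b", b"c"]
    print(len(choose_split_points(skewed, 4)) + 1, "effective reduces")
```

Because every key in partition i sorts at or before every key in partition i + 1, concatenating the reduce outputs in partition order gives a total order. The number of reduces actually used is `len(splits) + 1`, which is one simple way the sample can shape the reduce count around skew. A type without a memcmp-compatible encoding would need its own comparator in both the sort and the lookup.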
Issue Links
- is part of MAPREDUCE-4584 Umbrella: Preemption and restart of MapReduce tasks (Open)