Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Description
1. Consider increasing the number of threads in the spill executor. The buffer size is configured with TEZ_RUNTIME_UNORDERED_OUTPUT_MAX_PER_BUFFER_SIZE_BYTES. With smaller buffer sizes, spills are likely to be more frequent, and the spill executor currently runs in single-threaded mode.
2. Profiling showed time spent incrementing counters and notifying progress. This may not be common in regular Tez jobs, but it can happen in processes like LLAP (Hive-based). I will attach the profiler snapshot showing this. It would be good to update/notify less frequently.
3. Optimize mergeAll().
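
A rough illustration of items 1 and 2, not the actual Tez change: the sketch below runs simulated spills on a fixed-size thread pool instead of a single thread, and batches counter/progress updates so the shared state is touched once per interval rather than once per record. All names here (SpillSketch, BatchedReporter, run, the interval of 1000) are hypothetical and chosen only for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class SpillSketch {

    /** Hypothetical helper: accumulates locally, publishes every `interval` records. */
    static final class BatchedReporter {
        private final int interval;
        private final AtomicLong sharedRecords;   // stand-in for a shared counter
        private final AtomicLong notifications;   // stand-in for progress notifications
        private long pending;

        BatchedReporter(int interval, AtomicLong sharedRecords, AtomicLong notifications) {
            this.interval = interval;
            this.sharedRecords = sharedRecords;
            this.notifications = notifications;
        }

        /** Called once per record; only touches shared state every `interval` calls. */
        void recordWritten() {
            if (++pending >= interval) {
                flush();
            }
        }

        /** Publishes whatever is pending; call once more when the buffer is done. */
        void flush() {
            if (pending == 0) {
                return;
            }
            sharedRecords.addAndGet(pending);
            notifications.incrementAndGet();
            pending = 0;
        }
    }

    /**
     * Simulates `buffers` spills of `recordsPerBuffer` records each on a pool of
     * `threads` workers. Returns {total records counted, notifications sent}.
     */
    static long[] run(int threads, int buffers, int recordsPerBuffer, int interval) {
        AtomicLong records = new AtomicLong();
        AtomicLong notifications = new AtomicLong();
        // Assumption: a fixed pool replaces the single-threaded spill executor.
        ExecutorService spillExecutor = Executors.newFixedThreadPool(threads);
        for (int b = 0; b < buffers; b++) {
            spillExecutor.submit(() -> {
                BatchedReporter reporter = new BatchedReporter(interval, records, notifications);
                for (int i = 0; i < recordsPerBuffer; i++) {
                    reporter.recordWritten();   // the real spill work would happen here
                }
                reporter.flush();
            });
        }
        spillExecutor.shutdown();
        try {
            if (!spillExecutor.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("spills did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] {records.get(), notifications.get()};
    }

    public static void main(String[] args) {
        long[] result = run(2, 4, 2500, 1000);
        System.out.println("records=" + result[0] + " notifications=" + result[1]);
    }
}
```

With an interval of 1000 and 2,500 records per buffer, each simulated spill publishes three updates instead of 2,500. The trade-off is that counters and progress lag by up to one interval, so the interval would need tuning against how fresh the reported values must be.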