Details
Type: Improvement
Status: Resolved
Priority: Normal
Resolution: Fixed
Description
To improve Hadoop job startup time, we can multithread parts of the input format. Specifically, the "sub splits" are currently fetched from many nodes one after another; these fetches are independent of each other and can run in parallel.
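
As a rough illustration of the idea (not the patch attached to this ticket), the sketch below submits one task per range to a fixed-size thread pool and then collects the results in submission order. The class name `ParallelSubSplits`, the method `fetchAll`, the `fetcher` callback and the `threads` parameter are all hypothetical; a real input format would presumably call out to each node inside the callback.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper, not part of Hadoop or of the actual patch.
class ParallelSubSplits {

    // Fetches the sub splits for every range concurrently and returns them
    // flattened, in the same order as the input ranges.
    static <R, S> List<S> fetchAll(List<R> ranges,
                                   Function<R, List<S>> fetcher,
                                   int threads) {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per range; the fetches overlap up to `threads` at a time.
            List<Future<List<S>>> futures = new ArrayList<>();
            for (R range : ranges) {
                Callable<List<S>> task = () -> fetcher.apply(range);
                futures.add(executor.submit(task));
            }
            // Collect results in submission order so the output is deterministic.
            List<S> splits = new ArrayList<>();
            for (Future<List<S>> future : futures) {
                splits.addAll(future.get());
            }
            return splits;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            // Surface the failure of any single fetch to the caller.
            throw new RuntimeException(e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-in fetcher: assumes each range yields two sub splits.
        List<String> ranges = Arrays.asList("range-a", "range-b", "range-c");
        List<String> splits = fetchAll(
                ranges, r -> Arrays.asList(r + "/0", r + "/1"), 2);
        System.out.println(splits);
    }
}
```

Collecting the futures in submission order keeps the resulting split list stable from run to run, and a bounded pool avoids opening a connection to every node at once; both are design assumptions of this sketch rather than requirements stated in the ticket.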