Details
- Improvement
- Status: Closed
- Minor
- Resolution: Won't Fix
- 0.17.0
- None
- None
- None
- all
Description
Currently, there's no easy way for the JobInProgress to know how large the job's input data is.
This patch corrects the problem by storing the size of each input split's data in the RawSplit. The split sizes are then totaled and made available via JobInProgress.getInputSize().
This is needed, among other reasons, so that the JobInProgress knows how much data the job is running on, which will help in building smarter schedulers.
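The approach described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the attached patch: SplitInfo and JobInputTracker are hypothetical stand-ins for RawSplit and JobInProgress, and the only name taken from the description is getInputSize(). It assumes each split records its data length when it is created and that the job totals those lengths once, at construction time.

```java
import java.util.ArrayList;
import java.util.List;

public class InputSizeSketch {

    /** Hypothetical stand-in for RawSplit: a split that records its data length. */
    static final class SplitInfo {
        private final String location;
        private final long dataLength;

        SplitInfo(String location, long dataLength) {
            if (dataLength < 0) {
                throw new IllegalArgumentException("dataLength must be non-negative");
            }
            this.location = location;
            this.dataLength = dataLength;
        }

        String getLocation() {
            return location;
        }

        long getDataLength() {
            return dataLength;
        }
    }

    /** Hypothetical stand-in for JobInProgress: totals the split sizes once. */
    static final class JobInputTracker {
        private final long inputSize;

        JobInputTracker(List<SplitInfo> splits) {
            long total = 0L;
            for (SplitInfo split : splits) {
                // addExact throws ArithmeticException instead of silently overflowing
                total = Math.addExact(total, split.getDataLength());
            }
            this.inputSize = total;
        }

        /** Total input size in bytes, summed over all splits. */
        long getInputSize() {
            return inputSize;
        }
    }

    public static void main(String[] args) {
        List<SplitInfo> splits = new ArrayList<>();
        splits.add(new SplitInfo("host-a", 64L * 1024 * 1024));
        splits.add(new SplitInfo("host-b", 64L * 1024 * 1024));
        splits.add(new SplitInfo("host-c", 10L * 1024 * 1024));

        JobInputTracker job = new JobInputTracker(splits);
        System.out.println("Total input bytes: " + job.getInputSize());
    }
}
```

Computing the total once keeps getInputSize() cheap for a scheduler that polls it often; this assumes the set of splits does not change after the job is initialized.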
Attachments
Issue Links
- is part of HADOOP-657 Free temporary space should be modelled better (Closed)