Hadoop Common / HADOOP-3441

Pass the size of the MapReduce input to JobInProgress

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: 0.17.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Environment: all

      Description

      Currently, there's no easy way for the JobInProgress to know how large the job's input data is.

      This patch addresses the problem by recording the size of each input split's data in the RawSplit. The split sizes are then summed and made available via JobInProgress.getInputSize().

      Among other uses, this lets the JobInProgress know how much data the job is being run on, which would help in building smarter schedulers.
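
      The idea of totaling per-split sizes can be illustrated with a minimal, self-contained sketch. It is not the contents of addDataSize.patch and does not use the real Hadoop classes: RawSplitSketch, JobInProgressSketch, and the dataLength field are hypothetical stand-ins, assumed here only to show each split carrying its data size and the job summing those sizes once.

```java
import java.util.Arrays;
import java.util.List;

public class InputSizeSketch {

    /** Hypothetical stand-in for a split descriptor that records its data size. */
    static final class RawSplitSketch {
        private final long dataLength; // bytes of input covered by this split (assumed field)

        RawSplitSketch(long dataLength) {
            if (dataLength < 0) {
                throw new IllegalArgumentException("negative split size");
            }
            this.dataLength = dataLength;
        }

        long getDataLength() {
            return dataLength;
        }
    }

    /** Hypothetical stand-in for the job-tracking object; not the real JobInProgress. */
    static final class JobInProgressSketch {
        private final long inputSize;

        JobInProgressSketch(List<RawSplitSketch> splits) {
            // Sum the per-split sizes once, when the job's splits are known.
            long total = 0L;
            for (RawSplitSketch split : splits) {
                total += split.getDataLength();
            }
            this.inputSize = total;
        }

        /** Total input size in bytes across all splits. */
        long getInputSize() {
            return inputSize;
        }
    }

    public static void main(String[] args) {
        List<RawSplitSketch> splits = Arrays.asList(
                new RawSplitSketch(100L),
                new RawSplitSketch(200L),
                new RawSplitSketch(50L));
        JobInProgressSketch job = new JobInProgressSketch(splits);
        System.out.println(job.getInputSize()); // prints 350
    }
}
```

      A scheduler could then read the total from the job object rather than re-reading the splits, which is the kind of use the description has in mind.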

        Attachments

        1. addDataSize.patch
          3 kB
          Ariel Shemaiah Rabkin

              People

              • Assignee: Ariel Shemaiah Rabkin (asrabkin)
              • Reporter: Ariel Shemaiah Rabkin (asrabkin)
