Hadoop Common / HADOOP-3441

Pass the size of the MapReduce input to JobInProgress

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Won't Fix
    • 0.17.0
    • None
    • None
    • None
    • all

    Description

      Currently, there's no easy way for the JobInProgress to know how large the job's input data is.

      This patch corrects the problem by storing the size of each input split's data in the RawSplit. The split sizes are then summed and made available via JobInProgress.getInputSize().

      Among other uses, this lets the JobInProgress know how much data the job is being run on, which will help in building smarter schedulers.
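
      The idea described above can be sketched roughly as follows. This is a minimal illustration only, not the contents of addDataSize.patch: SketchSplit and SketchJob are hypothetical stand-ins for RawSplit and JobInProgress, and the dataLength field and its accessor are assumed names. Only getInputSize() is named in the description.

```java
import java.util.List;

// Hypothetical stand-in for a split record that carries the size of its data.
// This is NOT Hadoop's RawSplit; the field and accessor names are assumptions.
class SketchSplit {
    private final long dataLength; // size of this split's input, in bytes

    SketchSplit(long dataLength) {
        if (dataLength < 0) {
            throw new IllegalArgumentException("dataLength must be non-negative");
        }
        this.dataLength = dataLength;
    }

    long getDataLength() {
        return dataLength;
    }
}

// Hypothetical stand-in for the job-side bookkeeping (not Hadoop's JobInProgress).
class SketchJob {
    private final long inputSize; // total bytes across all splits

    SketchJob(List<SketchSplit> splits) {
        // Sum the per-split sizes once, when the job's splits are first read.
        long total = 0L;
        for (SketchSplit split : splits) {
            total += split.getDataLength();
        }
        this.inputSize = total;
    }

    // Mirrors the accessor named in the description.
    long getInputSize() {
        return inputSize;
    }
}
```

      A scheduler could then call getInputSize() on the job object to compare jobs by input volume. Computing the total once at construction keeps the accessor cheap when it is called repeatedly.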

      Attachments

        1. addDataSize.patch
          3 kB
          Ariel Shemaiah Rabkin

            People

              Assignee: asrabkin Ariel Shemaiah Rabkin
              Reporter: asrabkin Ariel Shemaiah Rabkin
              Votes: 0
              Watchers: 1