  1. Hadoop Map/Reduce
  2. MAPREDUCE-5873

Shuffle bandwidth computation includes time spent waiting for maps

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 2.6.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Currently, ShuffleScheduler reports a bandwidth figure in the ReduceTask JVM status. Its definition is confusing, however, because the elapsed time it uses includes periods when nothing is being copied: the fetchers sit idle while waiting for the next wave of map outputs to become available.
      The current bandwidth is defined as (bytes copied so far) / (total time in the copy phase so far).
      It would be more useful to:
      1) measure the bandwidth of each individual copy call;
      2) display the aggregate bandwidth computed only over the time during which at least one fetcher is in a copy call.
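
      The difference between the current definition and the two proposed measurements can be sketched as follows. This is a minimal illustration, not the code from the attached patches: the class ShuffleBandwidthSketch and all of its method names are hypothetical, and timestamps are passed in explicitly (in milliseconds) so that the arithmetic is easy to follow.

```java
// Hypothetical sketch; not taken from ShuffleScheduler or the attached patches.
final class ShuffleBandwidthSketch {
    private final long phaseStartMs; // when the copy phase began
    private int activeCopies = 0;    // fetchers currently inside a copy call
    private long activeSinceMs = 0;  // start of the current busy period
    private long activeTimeMs = 0;   // total time of completed busy periods
    private long bytesCopied = 0;    // bytes from completed copy calls

    ShuffleBandwidthSketch(long phaseStartMs) {
        this.phaseStartMs = phaseStartMs;
    }

    // A fetcher enters a copy call.
    synchronized void copyStarted(long nowMs) {
        if (activeCopies == 0) {
            activeSinceMs = nowMs; // a new busy period begins
        }
        activeCopies++;
    }

    // A fetcher leaves a copy call; returns that call's own bandwidth (proposal 1).
    synchronized double copyFinished(long startMs, long nowMs, long bytes) {
        activeCopies--;
        if (activeCopies == 0) {
            activeTimeMs += nowMs - activeSinceMs; // the busy period ends
        }
        bytesCopied += bytes;
        return bytesPerSecond(bytes, nowMs - startMs);
    }

    // Current definition: bytes copied / total time in the copy phase.
    synchronized double wallClockBandwidth(long nowMs) {
        return bytesPerSecond(bytesCopied, nowMs - phaseStartMs);
    }

    // Proposal 2: bytes copied / time during which at least one fetcher was copying.
    synchronized double activeBandwidth(long nowMs) {
        long busyMs = activeTimeMs + (activeCopies > 0 ? nowMs - activeSinceMs : 0);
        return bytesPerSecond(bytesCopied, busyMs);
    }

    private static double bytesPerSecond(long bytes, long millis) {
        return millis <= 0 ? 0.0 : bytes * 1000.0 / millis;
    }

    public static void main(String[] args) {
        ShuffleBandwidthSketch s = new ShuffleBandwidthSketch(0L);
        s.copyStarted(0L);
        s.copyFinished(0L, 1_000L, 1_000_000L);    // first wave: 1 MB in 1 s
        // 1 s .. 9 s: no map outputs available, nothing is copied
        s.copyStarted(9_000L);
        s.copyStarted(9_500L);                     // second fetcher overlaps the first
        s.copyFinished(9_000L, 10_000L, 500_000L);
        s.copyFinished(9_500L, 10_000L, 500_000L);
        System.out.println("wall-clock B/s: " + s.wallClockBandwidth(10_000L));
        System.out.println("active B/s:     " + s.activeBandwidth(10_000L));
    }
}
```

      In this scenario 2,000,000 bytes are copied during 2 seconds of actual copying within a 10-second copy phase. The current definition therefore reports 200,000 bytes/s, while the active-time definition reports 1,000,000 bytes/s, because the 8 seconds spent waiting for maps are excluded.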

      Attachments

        1. MAPREDUCE-5873.v1.patch
          13 kB
          Siqi Li
        2. MAPREDUCE-5873.v2.patch
          15 kB
          Siqi Li
        3. MAPREDUCE-5873.v3.patch
          15 kB
          Siqi Li
        4. MAPREDUCE-5873.v4.patch
          18 kB
          Siqi Li
        5. MAPREDUCE-5873.v5.patch
          18 kB
          Siqi Li
        6. MAPREDUCE-5873.v6.patch
          18 kB
          Siqi Li
        7. MAPREDUCE-5873.v9.patch
          18 kB
          Siqi Li

          People

            Assignee: l201514 Siqi Li
            Reporter: l201514 Siqi Li
            Votes: 0
            Watchers: 11
