  1. Hadoop Map/Reduce
  2. MAPREDUCE-5873

Shuffle bandwidth computation includes time spent waiting for maps

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 2.6.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Currently, the ShuffleScheduler in the ReduceTask JVM displays the shuffle bandwidth in its status. The definition is confusing, however, because it includes time during which nothing is being copied: the pauses while the reducer waits for the next wave of map outputs to become available.
      The current bandwidth is defined as (bytes copied so far) / (total time in the copy phase so far).
      It would be more useful to:
      1) measure the bandwidth of a single copy call;
      2) display the aggregate bandwidth as long as at least one fetcher is in a copy call.
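
The two proposals above can be illustrated with a minimal sketch. It is not the code from the attached patches: the class name `ActiveCopyBandwidth` and all of its methods are hypothetical, and timestamps are passed in explicitly (as from `System.nanoTime()`) so the arithmetic is easy to follow. The assumption is that idle time is excluded by accumulating only the intervals during which at least one fetcher is inside a copy call.

```java
/** Hypothetical helper for illustration only; not the class used in the patches. */
class ActiveCopyBandwidth {
    private long bytesCopied = 0;    // total bytes fetched so far
    private long busyNanos = 0;      // closed intervals with >= 1 fetcher copying
    private int activeFetchers = 0;  // fetchers currently inside a copy call
    private long busySince = 0;      // start of the current busy interval

    /** A fetcher enters a copy call at time nowNanos. */
    public synchronized void copyStarted(long nowNanos) {
        if (activeFetchers == 0) {
            busySince = nowNanos;    // idle -> busy transition
        }
        activeFetchers++;
    }

    /** A fetcher leaves a copy call after transferring the given bytes. */
    public synchronized void copyFinished(long nowNanos, long bytes) {
        if (activeFetchers == 0) {
            return;                  // unmatched call; ignored in this sketch
        }
        bytesCopied += bytes;
        activeFetchers--;
        if (activeFetchers == 0) {
            busyNanos += nowNanos - busySince;  // busy -> idle transition
        }
    }

    /** Aggregate bandwidth in bytes per second, counting busy time only. */
    public synchronized double bytesPerSecond(long nowNanos) {
        long busy = busyNanos;
        if (activeFetchers > 0) {
            busy += nowNanos - busySince;       // include the open interval
        }
        return busy <= 0 ? 0.0 : bytesCopied * 1e9 / busy;
    }

    /** Bandwidth of a single copy call in bytes per second. */
    public static double singleCopyBytesPerSecond(long bytes, long elapsedNanos) {
        return elapsedNanos <= 0 ? 0.0 : bytes * 1e9 / elapsedNanos;
    }
}
```

For example, if two copy calls each move 100 bytes in one second but are separated by nine seconds of waiting for maps, this sketch divides 200 bytes by 2 seconds of busy time, whereas the current definition divides the same 200 bytes by the full 11 seconds of the copy phase.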

        Attachments

        1. MAPREDUCE-5873.v1.patch
          13 kB
          Siqi Li
        2. MAPREDUCE-5873.v2.patch
          15 kB
          Siqi Li
        3. MAPREDUCE-5873.v3.patch
          15 kB
          Siqi Li
        4. MAPREDUCE-5873.v4.patch
          18 kB
          Siqi Li
        5. MAPREDUCE-5873.v5.patch
          18 kB
          Siqi Li
        6. MAPREDUCE-5873.v6.patch
          18 kB
          Siqi Li
        7. MAPREDUCE-5873.v9.patch
          18 kB
          Siqi Li

            People

            • Assignee:
              l201514 Siqi Li
              Reporter:
              l201514 Siqi Li
            • Votes:
              0
              Watchers:
              11
