Hadoop Map/Reduce
MAPREDUCE-4963

StatisticsCollector improperly keeps track of "Last Day" and "Last Hour" statistics for new TaskTrackers

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1.1
    • Fix Version/s: 1.2.0
    • Component/s: mrv1
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The StatisticsCollector keeps track of updates to the "Total Tasks Last Day", "Succeeded Tasks Last Day", "Total Tasks Last Hour", and "Succeeded Tasks Last Hour" counts per TaskTracker, which are displayed on the JobTracker web UI. It uses buckets to manage when to shift task counts from "Last Hour" to "Last Day" and out of "Last Day". After the JT has been running for a while, the connected TTs will have the max number of buckets and will keep shifting them at each update. If a new TT connects (or an old one rejoins), it won't have the max number of buckets, but the code that drops the buckets uses the same counter for all sets of buckets. This means that new TTs will prematurely drop their buckets, and the stats will be incorrect.

      Example:

      1. Max buckets is 5
      2. TaskTracker A has these values in its buckets [4, 2, 0, 3, 10] (i.e. 19)
      3. A new TaskTracker, B, connects; it has nothing in its buckets: [ ] (i.e. 0)
      4. TaskTracker B runs 3 tasks and TaskTracker A runs 5
      5. An update occurs
      6. TaskTracker A has [2, 0, 3, 10, 5] (i.e. 20)
      7. TaskTracker B should have [3] but it will drop that bucket after adding it during the update and instead have [ ] again (i.e. 0)
      8. TaskTracker B will keep doing that forever and always show 0 in the web UI
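
The walkthrough above can be reproduced with a small model of the bucket shifting. This is a minimal sketch, not the actual StatisticsCollector code: MAX_BUCKETS, update, and the list-of-ints buckets are hypothetical simplifications for illustration.

```python
# Minimal sketch of the buggy shift logic (hypothetical names, not the
# real StatisticsCollector code): the drop decision consults a counter
# shared by every TaskTracker instead of this tracker's own bucket count.
MAX_BUCKETS = 5

def update(buckets, new_count, shared_updates):
    buckets.append(new_count)
    # Buggy: uses the global update count, ignoring len(buckets).
    if shared_updates > MAX_BUCKETS:
        buckets.pop(0)

# TaskTracker A has been connected long enough to fill all its buckets.
a = [4, 2, 0, 3, 10]   # total 19
# TaskTracker B just connected and has no history.
b = []

shared_updates = 6     # the JobTracker has already done many updates

update(a, 5, shared_updates)   # A -> [2, 0, 3, 10, 5], total 20 (correct)
update(b, 3, shared_updates)   # B -> [], total 0 (its only bucket dropped)
print(sum(a), sum(b))          # prints: 20 0
```

B's freshly added bucket is dropped on the same update that added it, so its totals stay at 0 no matter how many tasks it runs.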

      We can fix this by not using the same counter for all sets of buckets.
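
Under the same simplified model, a sketch of that fix (hypothetical names, not the committed patch): drop the oldest bucket only once this tracker's own set of buckets is full.

```python
MAX_BUCKETS = 5

def update_fixed(buckets, new_count):
    buckets.append(new_count)
    # Fixed: each set of buckets is checked against its own length, so a
    # late-joining tracker keeps its history until its buckets fill up.
    if len(buckets) > MAX_BUCKETS:
        buckets.pop(0)

a = [4, 2, 0, 3, 10]
b = []
update_fixed(a, 5)     # A -> [2, 0, 3, 10, 5], total 20
update_fixed(b, 3)     # B -> [3], kept until B fills its own buckets
print(sum(a), sum(b))  # prints: 20 3
```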


          Activity

          Robert Kanter added a comment -

          The patch fixes the problem by keeping a separate counter for each set of buckets and checking the length of the buckets. I also added a test that does something similar to the above example.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12566603/MAPREDUCE-4963.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3277//console

          This message is automatically generated.

          Alejandro Abdelnur added a comment -

          +1

          Alejandro Abdelnur added a comment -

          Thanks Robert. Committed to branch-1.

          Matt Foley added a comment -

          Closed upon release of Hadoop 1.2.0.


            People

            • Assignee:
              Robert Kanter
            • Reporter:
              Robert Kanter
            • Votes:
              0
            • Watchers:
              8
