Hadoop Common / HADOOP-5738

Split waiting tasks field in JobTracker metrics to individual tasks

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: metrics
    • Labels: None
    • Hadoop Flags: Incompatible change, Reviewed

      Description

      Currently, the JobTracker metrics report waiting tasks as a single field. It would be better to split waiting tasks into waiting maps and waiting reduces.

      1. 5738-y20.patch
        6 kB
        Robert Chansler
      2. HADOOP-5738-1.patch
        5 kB
        Sreekanth Ramakrishnan
      3. HADOOP-5738-2.1.patch
        5 kB
        Chris Douglas
      4. HADOOP-5738-2.patch
        5 kB
        Sreekanth Ramakrishnan
      5. HADOOP-5738-3.patch
        5 kB
        Sreekanth Ramakrishnan

        Activity

        Robert Chansler added a comment -

        Attached an example for v20, not to be committed.

        Hudson added a comment -

        Integrated in Hadoop-trunk #827 (see http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/827/). Split "waiting_tasks" JobTracker metric into waiting maps and waiting reduces. Contributed by Sreekanth Ramakrishnan.

        Chris Douglas added a comment -

        I committed this. Thanks, Sreekanth

        Chris Douglas added a comment -

        Sorry, my comments were unclear. I mean only that v2 should be committed to 0.21 (with the typo fix) and anyone interested in the 0.20 patch (with or without waiting_tasks) can apply this patch.

        Attaching v2 with the typo correction.

        Sreekanth Ramakrishnan added a comment -

        Attaching a patch that brings back the "waiting_tasks" field. Also renamed the method to addWaitingMaps.

        Hong Tang added a comment -

        @Sreekanth, sorry for my distracting comments. I guess Chris is right and we should keep "waiting_tasks" for now, and possibly remove it in future major releases.

        Chris Douglas added a comment -

        +1 on the changes, though the new JobInstrumentation method should be named addWaitingMap*s* for symmetry with the other methods.

        "Also, to ensure backward compatibility later, maybe we should port this patch to 0.20 too?"

        Per the taxonomy in HADOOP-5073, the instrumentation metrics are called out as a public-stable API, and are supposed to remain unchanged except in major versions (for this patch pre-1.0, this means 0.21). If this could be applied to 0.20 at all, Sreekanth's original fix retaining the waiting_tasks metric would be the correct one by that reasoning.

        Since it applies cleanly to 0.20, I'd lean towards putting it in 0.21 and letting interested parties apply it themselves.

        Sreekanth Ramakrishnan added a comment -

        Attaching the patch removing the "waiting_tasks" column. The patch applies to branch 20 also, but creates .orig files.

        Hong Tang added a comment -

        Also, to ensure backward compatibility later, maybe we should port this patch to 0.20 too?

        Hong Tang added a comment -

        May I suggest that we drop waiting_tasks? It was only committed to 0.20 in March.

        Sreekanth Ramakrishnan added a comment -

        The waiting_tasks field was kept for compatibility reasons. Should we just remove it, or is there a way to deprecate a metrics field?

        Chris Douglas added a comment -

        Is there a particular reason for keeping waiting_tasks?

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12406343/HADOOP-5738-1.patch
        against trunk revision 769339.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/256/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/256/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/256/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/256/console

        This message is automatically generated.

        Hong Tang added a comment -

        Yes, it would be useful to have pending map tasks and reduce tasks reported separately.

        Sreekanth Ramakrishnan added a comment -

        Attaching a patch to address this issue.

        • The patch retains the current "waiting_tasks" field.
        • Adds two new fields, "waiting_maps" and "waiting_reduces". The waiting_tasks value is the sum of these two fields.
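
The relationship between the three fields can be illustrated with a minimal Java sketch. This is not the code from the attached patch: the class name `WaitingTaskCounters`, the `dec*` and `addWaitingReduces` methods, and `snapshot()` are hypothetical names chosen for illustration (only `addWaitingMaps` is named in this issue), and the real instrumentation may well use different signatures.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the JobTracker instrumentation; NOT the Hadoop class.
class WaitingTaskCounters {
    private int waitingMaps;
    private int waitingReduces;

    // Maps and reduces are tracked separately instead of in one combined counter.
    synchronized void addWaitingMaps(int tasks) { waitingMaps += tasks; }
    synchronized void decWaitingMaps(int tasks) { waitingMaps -= tasks; }
    synchronized void addWaitingReduces(int tasks) { waitingReduces += tasks; }
    synchronized void decWaitingReduces(int tasks) { waitingReduces -= tasks; }

    // Builds one metrics record. "waiting_tasks" is kept for existing consumers
    // and is derived as the sum of the two new fields.
    synchronized Map<String, Integer> snapshot() {
        Map<String, Integer> record = new LinkedHashMap<>();
        record.put("waiting_maps", waitingMaps);
        record.put("waiting_reduces", waitingReduces);
        record.put("waiting_tasks", waitingMaps + waitingReduces);
        return record;
    }

    public static void main(String[] args) {
        WaitingTaskCounters counters = new WaitingTaskCounters();
        counters.addWaitingMaps(10);    // a job with 10 maps is queued
        counters.addWaitingReduces(4);  // and 4 reduces
        counters.decWaitingMaps(3);     // 3 maps get scheduled
        // prints {waiting_maps=7, waiting_reduces=4, waiting_tasks=11}
        System.out.println(counters.snapshot());
    }
}
```

Deriving the combined value at reporting time, rather than storing a third counter, means the legacy field cannot drift out of sync with the two new ones.
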
        Sreekanth Ramakrishnan added a comment -

        With the waiting task counts split, together with HADOOP-5733, the utilization of task trackers can be easily computed.


          People

          • Assignee: Sreekanth Ramakrishnan
          • Reporter: Sreekanth Ramakrishnan
          • Votes: 0
          • Watchers: 2
