Hadoop Common
HADOOP-5582

Hadoop Vaidya throws number format exception due to changes in the job history counters string format (escaped compact representation).

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.0
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Fixed error parsing job history counters after change of counter format.

      Description

      The Hadoop Vaidya (contrib/vaidya) tool throws a NumberFormatException while parsing job history log files, because the format of the counters string changed in 0.20.
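
      To illustrate the failure mode, here is a minimal sketch. It is not Vaidya's actual parsing code. It assumes the older compact form is a comma-separated list of name:value tuples and that the 0.20 escaped compact form looks roughly like {(group)(display)[(name)(display)(value)]}. The class OldFormatParseSketch and its method are hypothetical names used only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for an old-style counter parser; not taken from contrib/vaidya.
class OldFormatParseSketch {
    /** Parses "name:value,name:value" tuples (assumed shape of the old compact form). */
    static Map<String, Long> parseCompact(String counters) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (String tuple : counters.split(",")) {
            int sep = tuple.lastIndexOf(':');
            // If there is no ':' (as in the escaped compact form), sep is -1 and
            // the whole tuple is handed to parseLong, which throws NumberFormatException.
            long value = Long.parseLong(tuple.substring(sep + 1));
            String name = sep < 0 ? "" : tuple.substring(0, sep);
            result.put(name, value);
        }
        return result;
    }
}
```

      Under these assumptions, a string in the old form parses cleanly, while a string in the escaped compact form contains no ':' separator and ends up in Long.parseLong as a whole, which is one plausible way to arrive at the reported exception.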

      1. vaidya-0.21.0-5582-5764.patch
        26 kB
        Suhas Gogate
      2. vaidya_patch_21.patch
        22 kB
        Suhas Gogate

          Activity

          Robert Chansler added a comment -

          Editorial pass over all release notes prior to publication of 0.21. Bug.

          Devaraj Das added a comment -

          I just committed this. Thanks, Suhas!

          Amar Kamat added a comment -

          Looks fine. +1.

          Suhas Gogate added a comment -

          Amar, I hope this answers your questions. Let me know if you need any further clarification. If not, can you please commit the fix? Thanks & Regards, Suhas

          Suhas Gogate added a comment -

          1. Converting to the compact string and then reusing the existing code is definitely a smaller change, but it may not be the right way to go. What if the compact format gets deprecated tomorrow? Also, as you mentioned, we don't need to do an extra conversion just to reuse existing code.

          2. Vaidya is an offline tool and uses the default job history parser to read and parse the job history file. However, it presents all the counters as key/value pairs, in the format (interface) that the Vaidya tool exposes to its rule writers. This makes the rules agnostic to any changes in Hadoop counters and their representation.
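
          As an illustration of the key/value idea in point 2, here is a minimal sketch. It is not the code in the patch. It assumes the escaped compact string has the shape {(group)(groupDisplay)[(name)(display)(value)]...}, and it ignores backslash escaping, which a real decoder such as Counters.fromEscapedCompactString() would have to handle. The class name CounterKeyValueSketch and the "group.counter" key layout are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; a simplified reader that does not handle escaped characters.
class CounterKeyValueSketch {
    // One group: {(name)(displayName)[counter][counter]...}
    private static final Pattern GROUP =
        Pattern.compile("\\{\\(([^)]*)\\)\\(([^)]*)\\)((?:\\[[^\\]]*\\])*)\\}");
    // One counter: [(name)(displayName)(value)]
    private static final Pattern COUNTER =
        Pattern.compile("\\[\\(([^)]*)\\)\\(([^)]*)\\)\\((\\d+)\\)\\]");

    /** Flattens the string into "groupDisplayName.counterDisplayName" -> value. */
    static Map<String, Long> toKeyValues(String escapedCompact) {
        Map<String, Long> result = new LinkedHashMap<>();
        Matcher g = GROUP.matcher(escapedCompact);
        while (g.find()) {
            // group(3) holds the run of [..] counter entries for this group
            Matcher c = COUNTER.matcher(g.group(3));
            while (c.find()) {
                result.put(g.group(2) + "." + c.group(2), Long.parseLong(c.group(3)));
            }
        }
        return result;
    }
}
```

          Rules written against such a map only see names and values, so a change in how Hadoop serializes counters would stay confined to the decoding step.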

          Amar Kamat added a comment -

          From what I understand, you could do something like

          Counters c = Counters.fromEscapedCompactString(counterString);
          String compactString = c.makeCompactString();
          // do what you do currently, but with compactString and not counterString
          Matcher m = _pattern.matcher(compactString);
          while (m.find()) {
            String ctuple = m.group(0);
            //String ctuple = c1tuple.substring(0, c1tuple.length()-1);
          ....
          ...
          .

          This might not be efficient, but it will reduce code change and testing time. Basically, you convert the new format to the old format and pass that to the existing code.


          A few questions:

          1. It seems there is a parser written here for job history. Why can't you use the existing job history parser?
          2. It seems counters are maintained as key/value pairs. Why can't a global counter be used, with all new counters simply added to it?

          I have not seen the whole code. These questions are based on the impression that I am getting looking at the patch.
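
          A minimal sketch of the adapter idea described above, under stated assumptions: the counters are taken to be already decoded into a name-to-value map (standing in for a Counters object), the old compact form is taken to be comma-separated name:value tuples, and the legacy loop is represented by a simple regex walk. The class CompactAdapterSketch and both method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of "convert new format to old format, keep the old loop".
class CompactAdapterSketch {
    // Stand-in for the existing tuple pattern: "name:value"
    private static final Pattern TUPLE = Pattern.compile("([^,:]+):(\\d+)");

    /** Renders decoded counters in the assumed old comma-separated compact form. */
    static String toCompact(Map<String, Long> decoded) {
        StringJoiner joiner = new StringJoiner(",");
        for (Map.Entry<String, Long> e : decoded.entrySet()) {
            joiner.add(e.getKey() + ":" + e.getValue());
        }
        return joiner.toString();
    }

    /** Stand-in for the unchanged legacy loop that walks the compact string. */
    static Map<String, Long> legacyParse(String compact) {
        Map<String, Long> result = new LinkedHashMap<>();
        Matcher m = TUPLE.matcher(compact);
        while (m.find()) {
            result.put(m.group(1), Long.parseLong(m.group(2)));
        }
        return result;
    }
}
```

          The trade-off is visible even in this small form: the round trip only works while counter names contain no ',' or ':' characters, which is part of why decoding once and working on the decoded values directly may be the more robust route.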

          Amar Kamat added a comment -

          Checked the code change related to Counters. It looks fine to me and seems like the right direction (i.e., the counter string obtained from the job history should be decoded using Counters).
          +1.

          Suhas Gogate added a comment -

          Can someone please review and commit this patch? I need to add some more diagnostic rules that require this patch.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12407120/vaidya-0.21.0-5582-5764.patch
          against trunk revision 770685.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/283/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/283/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/283/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/283/console

          This message is automatically generated.

          Suhas Gogate added a comment -

          Submitting a new patch. See comment above.

          Suhas Gogate added a comment -

          Submitting a new patch that also incorporates the fix for HADOOP-5764, which depends on the previous patch submitted for this issue. Once this patch is committed, HADOOP-5764 will also be resolved.

          Suhas Gogate added a comment -

          Adding a new patch that also incorporates the fix for HADOOP-5764, which depends on this one.

          Devaraj Das added a comment -

          Amar, can you review this please?

          Suhas Gogate added a comment -

          Can someone please commit this patch?

          Suhas Gogate added a comment -

          contrib/vaidya does not have a formal automated test suite available for it yet.

          Suhas Gogate added a comment -

          1. This is a standalone contrib tool and does not have a formal automated test suite built for it yet.

          2. Failed tests are unrelated.

          So this patch should go in.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12404195/vaidya_patch_21.patch
          against trunk revision 760376.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no tests are needed for this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/82/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/82/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/82/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/82/console

          This message is automatically generated.

          Suhas Gogate added a comment -

          This patch is applicable for both hadoop 0.20.0 and trunk (0.21.0).

          Suhas Gogate added a comment -

          This is a patch for both hadoop-0.20.0 and trunk (0.21.0)

          Suhas Gogate added a comment -

          For the contrib/vaidya tool to use the new org.apache.hadoop.mapreduce.Counters class, that class needs a fromEscapedCompactString() method to parse the job history counters string.


            People

            • Assignee:
              Suhas Gogate
              Reporter:
              Suhas Gogate
            • Votes:
              0
              Watchers:
              0
