Hadoop Map/Reduce
  1. Hadoop Map/Reduce
  2. MAPREDUCE-2000

Rumen is not able to extract counters for Job history logs from Hadoop 0.20

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.22.0
    • Component/s: tools/rumen
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Rumen tries to match the end of a value string through indexOf("\""). It does not take into account the case when an escaped '"' in the value string. This leads to the incorrect parsing the remaining key=value properties in the same line.

      1. mr-2000-20100806.patch
        4 kB
        Hong Tang
      2. mr-2000-20100809.patch
        7 kB
        Hong Tang
      3. mr-2000-20100810.patch
        7 kB
        Hong Tang
      4. mr-2000-20100810-yhadoop-20.1xx.patch
        7 kB
        Hong Tang

        Activity

        Hide
        Hong Tang added a comment -

        An example input line that triggers this bug:

        MapAttempt TASK_TYPE="MAP" TASKID="task_201005120512_181225_m_000266" TASK_ATTEMPT_ID="attempt_201005120512_181225_m_000266_0" 
        TASK_STATUS="SUCCESS" FINISH_TIME="1275354731626" STATE_STRING="qry:\"RFC 3584\" Value: [Simple][timestamp:1275276122][type:c][/Simple]" COUNTERS="
        {(FileSystemCounters)(FileSystemCounters)[(FILE_BYTES_READ)(FILE_BYTES_READ)(1609)][(HDFS_BYTES_READ)(HDFS_BYTES_READ)(67412713)]
        [(FILE_BYTES_WRITTEN)(FILE_BYTES_WRITTEN)(5648633)]}{(org\.apache\.hadoop\.mapred\.Task$Counter)(Map-Reduce Framework)[(COMBINE_OUTPUT_RECORDS)
        (Combine output records)(0)][(MAP_INPUT_RECORDS)(Map input records)(92297)][(SPILLED_RECORDS)(Spilled Records)(74370)][(MAP_OUTPUT_BYTES)(Map output 
        bytes)(18737847)][(MAP_INPUT_BYTES)(Map input bytes)(67211804)][(COMBINE_INPUT_RECORDS)(Combine input records)(0)][(MAP_OUTPUT_RECORDS)(Map output 
        records)(74370)]}" .
        

        It outputs the following error messages:

        10/08/06 23:14:40 WARN rumen.HistoryEventEmitter: HistoryEventEmitters: null counter detected:
        10/08/06 23:14:40 WARN rumen.HistoryEventEmitter: HistoryEventEmitters: null counter detected:
        ...
        java.lang.StringIndexOutOfBoundsException: String index out of range: -1
                at java.lang.String.substring(String.java:1938)
                at org.apache.hadoop.tools.rumen.ParsedLine.<init>(ParsedLine.java:100)
                at org.apache.hadoop.tools.rumen.Hadoop20JHParser.nextEvent(Hadoop20JHParser.java:131)
                at org.apache.hadoop.tools.rumen.TraceBuilder.processJobHistory(TraceBuilder.java:287)
                at org.apache.hadoop.tools.rumen.TraceBuilder.run(TraceBuilder.java:242)
                at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
                at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
                at org.apache.hadoop.tools.rumen.TraceBuilder.main(TraceBuilder.java:120)
        
        Show
        Hong Tang added a comment - An example input line that triggers this bug: MapAttempt TASK_TYPE="MAP" TASKID="task_201005120512_181225_m_000266" TASK_ATTEMPT_ID="attempt_201005120512_181225_m_000266_0" TASK_STATUS="SUCCESS" FINISH_TIME="1275354731626" STATE_STRING="qry:\"RFC 3584\" Value: [Simple][timestamp:1275276122][type:c][/Simple]" COUNTERS=" {(FileSystemCounters)(FileSystemCounters)[(FILE_BYTES_READ)(FILE_BYTES_READ)(1609)][(HDFS_BYTES_READ)(HDFS_BYTES_READ)(67412713)] [(FILE_BYTES_WRITTEN)(FILE_BYTES_WRITTEN)(5648633)]}{(org\.apache\.hadoop\.mapred\.Task$Counter)(Map-Reduce Framework)[(COMBINE_OUTPUT_RECORDS) (Combine output records)(0)][(MAP_INPUT_RECORDS)(Map input records)(92297)][(SPILLED_RECORDS)(Spilled Records)(74370)][(MAP_OUTPUT_BYTES)(Map output bytes)(18737847)][(MAP_INPUT_BYTES)(Map input bytes)(67211804)][(COMBINE_INPUT_RECORDS)(Combine input records)(0)][(MAP_OUTPUT_RECORDS)(Map output records)(74370)]}" . It outputs the following error messages: 10/08/06 23:14:40 WARN rumen.HistoryEventEmitter: HistoryEventEmitters: null counter detected: 10/08/06 23:14:40 WARN rumen.HistoryEventEmitter: HistoryEventEmitters: null counter detected: ... java.lang.StringIndexOutOfBoundsException: String index out of range: -1 at java.lang.String.substring(String.java:1938) at org.apache.hadoop.tools.rumen.ParsedLine.<init>(ParsedLine.java:100) at org.apache.hadoop.tools.rumen.Hadoop20JHParser.nextEvent(Hadoop20JHParser.java:131) at org.apache.hadoop.tools.rumen.TraceBuilder.processJobHistory(TraceBuilder.java:287) at org.apache.hadoop.tools.rumen.TraceBuilder.run(TraceBuilder.java:242) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79) at org.apache.hadoop.tools.rumen.TraceBuilder.main(TraceBuilder.java:120)
        Hide
        Hong Tang added a comment -

        Patch that fixed the bug. The added unit test fails without the patch.

        Show
        Hong Tang added a comment - Patch that fixed the bug. The added unit test fails without the patch.
        Hide
        Amar Kamat added a comment -

        Using the regex makes it much cleaner. +1 for using regex.

        Few comments.

        1. Can you please add some comments as to what the regex is supposed to do? Comments for each of the capturing groups w.r.t what are they planning to compare/match themselves against would be good enough.
        2. Can we reuse the regex declared in o.a.h.mapred.JobHistory? Seems similar to me.
        3. In the testcase, you could define your values in unescaped format and use StringUtils to escape it. This is how the framework does it. So here is how the testcase might look like
        line-type=// something
        key=k1
        value=val1 // special char content in unescaped format
        line=line-type + space + key + equals + quotes + StringUtils.escape(value) + quotes + line-delim
        ParsedLine pl = new ParsedLine(line, version)
        
        // assert
        newValue = pl.get(key)
        unEscapeValue = StringUtils.unescape(newValue)
        assertEquals(value, unEscapedValue)
        

        See a sample testcase here.

        Show
        Amar Kamat added a comment - Using the regex makes it much cleaner. +1 for using regex. Few comments. Can you please add some comments as to what the regex is supposed to do? Comments for each of the capturing groups w.r.t what are they planning to compare/match themselves against would be good enough. Can we reuse the regex declared in o.a.h.mapred.JobHistory? Seems similar to me. In the testcase, you could define your values in unescaped format and use StringUtils to escape it. This is how the framework does it. So here is how the testcase might look like line-type= // something key=k1 value=val1 // special char content in unescaped format line=line-type + space + key + equals + quotes + StringUtils.escape(value) + quotes + line-delim ParsedLine pl = new ParsedLine(line, version) // assert newValue = pl.get(key) unEscapeValue = StringUtils.unescape(newValue) assertEquals(value, unEscapedValue) See a sample testcase here .
        Hide
        Hong Tang added a comment -

        I uploaded a new patch that addresses Amar's comments.

        1. Can you please add some comments as to what the regex is supposed to do? Comments for each of the capturing groups w.r.t what are they planning to compare/match themselves against would be good enough.

        Comments added.

        2. Can we reuse the regex declared in o.a.h.mapred.JobHistory? Seems similar to me.

        Yes, they are based on the regex from yhadoop 20. However, this is not available in trunk. So I have to copy it over.

        3. In the testcase, you could define your values in unescaped format and use StringUtils to escape it.

        Used StringUtils's escapeString and unescapeString directly. I also added a few more unit tests that should be covering all cases as included in the testcase you wrote.

        Show
        Hong Tang added a comment - I uploaded a new patch that addresses Amar's comments. 1. Can you please add some comments as to what the regex is supposed to do? Comments for each of the capturing groups w.r.t what are they planning to compare/match themselves against would be good enough. Comments added. 2. Can we reuse the regex declared in o.a.h.mapred.JobHistory? Seems similar to me. Yes, they are based on the regex from yhadoop 20. However, this is not available in trunk. So I have to copy it over. 3. In the testcase, you could define your values in unescaped format and use StringUtils to escape it. Used StringUtils's escapeString and unescapeString directly. I also added a few more unit tests that should be covering all cases as included in the testcase you wrote.
        Hide
        Hong Tang added a comment -

        test-patch passed on my local machine.

             [exec] +1 overall.  
             [exec] 
             [exec]     +1 @author.  The patch does not contain any @author tags.
             [exec] 
             [exec]     +1 tests included.  The patch appears to include 2 new or modified tests.
             [exec] 
             [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
             [exec] 
             [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
             [exec] 
             [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
             [exec] 
             [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
        
        Show
        Hong Tang added a comment - test-patch passed on my local machine. [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 2 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
        Hide
        Ranjit Mathew added a comment -

        I like new version of the patch, especially the way you explain the regex.

        Just a few minor comments:

        1. There's redundant "return" statement introduced by your patch.

        2. I would personally prefer a new-line before the "/**" lines to improve readability a bit.

        3. Do you think it makes sense to include the actual input-line above that triggered the bug in the unit test-case?

        Show
        Ranjit Mathew added a comment - I like new version of the patch, especially the way you explain the regex. Just a few minor comments: 1. There's redundant "return" statement introduced by your patch. 2. I would personally prefer a new-line before the "/**" lines to improve readability a bit. 3. Do you think it makes sense to include the actual input-line above that triggered the bug in the unit test-case?
        Hide
        Hong Tang added a comment -

        1. There's redundant "return" statement introduced by your patch.

        Will do.

        2. I would personally prefer a new-line before the "/**" lines to improve readability a bit.

        This is probably a matter of different style. I prefer to use blank lines to separate logically different groups.

        3. Do you think it makes sense to include the actual input-line above that triggered the bug in the unit test-case?

        We should verify the patch to make sure it fixes the real bug. But I think that unit tests should test minimally what it is supposed to test. In other words, I think including actual input that causes the bug is a bad idea. It would simply obscure what the unit test is supposed to cover.

        Show
        Hong Tang added a comment - 1. There's redundant "return" statement introduced by your patch. Will do. 2. I would personally prefer a new-line before the "/**" lines to improve readability a bit. This is probably a matter of different style. I prefer to use blank lines to separate logically different groups. 3. Do you think it makes sense to include the actual input-line above that triggered the bug in the unit test-case? We should verify the patch to make sure it fixes the real bug. But I think that unit tests should test minimally what it is supposed to test. In other words, I think including actual input that causes the bug is a bad idea. It would simply obscure what the unit test is supposed to cover.
        Hide
        Hong Tang added a comment -

        New patch that addresses Ranjit's comments.

        Show
        Hong Tang added a comment - New patch that addresses Ranjit's comments.
        Hide
        Ranjit Mathew added a comment -

        +1 from me. Thanks for taking this up.

        Show
        Ranjit Mathew added a comment - +1 from me. Thanks for taking this up.
        Hide
        Chris Douglas added a comment -

        +1

        I committed this. Thanks, Hong!

        Show
        Chris Douglas added a comment - +1 I committed this. Thanks, Hong!
        Hide
        Hong Tang added a comment -

        Patch for yahoo hadoop 20.1xx branch. Not to be committed.

        Show
        Hong Tang added a comment - Patch for yahoo hadoop 20.1xx branch. Not to be committed.
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #523 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/523/)

        Show
        Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk-Commit #523 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/523/ )

          People

          • Assignee:
            Hong Tang
            Reporter:
            Hong Tang
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development