Hadoop Map/Reduce / MAPREDUCE-5309

2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.4-alpha
    • Fix Version/s: 2.5.0
    • Component/s: jobhistoryserver, mrv2
    • Labels: None
    • Target Version/s:
    • Hadoop Flags: Reviewed

      Description

      When the 2.0.4 JobHistoryParser tries to parse a job history file generated by Hadoop 2.0.3, it throws the following error:

      java.lang.ClassCastException: org.apache.avro.generic.GenericData$Array cannot be cast to org.apache.hadoop.mapreduce.jobhistory.JhCounters
      at org.apache.hadoop.mapreduce.jobhistory.TaskAttemptUnsuccessfulCompletion.put(TaskAttemptUnsuccessfulCompletion.java:58)
      at org.apache.avro.generic.GenericData.setField(GenericData.java:463)
      at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
      at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
      at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
      at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
      at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
      at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
      at org.apache.hadoop.mapreduce.jobhistory.EventReader.getNextEvent(EventReader.java:93)
      at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:111)
      at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:156)
      at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:142)
      at com.twitter.somepackage.Test20JobHistoryParsing.testFileAvro(Test20JobHistoryParsing.java:23)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
      at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
      at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
      at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
      at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
      at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
      at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
      at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
      at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
      at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
      at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
      at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
      at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
      at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
      at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
      at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
      at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
      at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)

      Test code and the job history file are attached.

      Test code:

      package com.twitter.somepackage;

      import java.io.IOException;
      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.fs.FileSystem;
      import org.apache.hadoop.fs.Path;
      import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser;
      import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.JobInfo;
      import org.apache.hadoop.yarn.YarnException;
      import org.junit.Test;

      public class Test20JobHistoryParsing {

        @Test
        public void testFileAvro() throws IOException {
          // Job history file written by Hadoop 2.0.3 (attached to this issue).
          Path local_path2 = new Path("/tmp/job_2_0_3-KILLED.jhist");
          JobHistoryParser parser2 =
              new JobHistoryParser(FileSystem.getLocal(new Configuration()), local_path2);
          try {
            JobInfo ji2 = parser2.parse();
            System.out.println(" job info: " + ji2.getJobname() + " "
                + ji2.getFinishedMaps() + " " + ji2.getTotalMaps() + " " + ji2.getJobId());
          } catch (IOException e) {
            throw new YarnException("Could not load history file " + local_path2.getName(), e);
          }
        }
      }

      This seems to stem from the fix in https://issues.apache.org/jira/browse/MAPREDUCE-4693
      that added counters to the historyserver for failed tasks.

      This breaks backward compatibility with JobHistoryServer.

      Attachments

      1. job_2_0_3-KILLED.jhist (187 kB, Vrushali C)
      2. Test20JobHistoryParsing.java (1.0 kB, Vrushali C)
      3. MAPREDUCE-5309.patch (21 kB, Rushabh S Shah)
      4. MAPREDUCE-5309-v2.patch (73 kB, Rushabh S Shah)
      5. MAPREDUCE-5309-v3.patch (74 kB, Rushabh S Shah)
      6. MAPREDUCE-5309-v4.patch (74 kB, Rushabh S Shah)
      7. MAPREDUCE-5309-v5.patch (74 kB, Rushabh S Shah)

          Activity

          Viraj Bhat added a comment -

          This is an issue even when parsing Job History Logs generated in Hadoop 0.23.9.10
          Viraj

          Rushabh S Shah added a comment -

          Changed the order of counters field in Events.avpr

          Rushabh S Shah added a comment -

          This fixes the history files that were generated before 2.4.0 but breaks the history files generated since 2.4.0.

          Viraj Bhat added a comment -

          Rushabh, thanks for your help in fixing this Jira.
          Viraj

          Rushabh S Shah added a comment -

          Hey Viraj,
          This is not a stable fix. This will not parse the history files that are generated since 2.0.4.

          Rushabh S Shah added a comment -

          This patch will parse all the logs.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12645258/MAPREDUCE-5309-v2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 4 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs:

          org.apache.hadoop.mapreduce.v2.hs.webapp.dao.TestJobInfo
          org.apache.hadoop.mapreduce.v2.hs.TestJobHistoryEntities

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4606//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4606//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4606//console

          This message is automatically generated.

          Viraj Bhat added a comment -

          We are interested in Hadoop 2.4.x and Hadoop 0.23.x logs. Will this patch handle both? I have not tested it yet.
          Viraj

          Rushabh S Shah added a comment -

          Broke a couple of test cases.

          Rushabh S Shah added a comment -

          Attaching a new patch correcting the previous test failures.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12645318/MAPREDUCE-5309-v3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 4 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4607//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4607//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4607//console

          This message is automatically generated.

          Jason Lowe added a comment -

          Thanks for the patch, Rushabh! Some comments:

          • Findbugs warning needs to be addressed.
          • To match existing parsing behavior for missing fields, strings should default to null rather than an empty string. The difference is subtle and could cause subtle bugs, so we should match existing behavior here. I think that means we need to have them be of type ["null", "string"] and "default": null.
          • If the task info or attempt info is null, we should log a warning, since something is wrong with the history file.
          • In the test cases we don't need to catch the IOException and translate it to YarnException; we can just let the IOException bubble up and fail the test directly.
          • The test of the three separate jhist files should ideally be three separate unit tests. There's basically no common setup, and that way when a test fails the test name immediately conveys which version of the history is failing to parse.
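
          The null-versus-empty-string point in the review above can be illustrated with a minimal plain-Java sketch. It does not use Avro or any Hadoop class; the `describe` helper and its placeholder text are hypothetical, standing in for any caller that already treats null as "field was not recorded".

```java
// Illustrative only: not code from the patch, and not the Avro API.
final class MissingFieldDefaultSketch {

    // Hypothetical caller-side presence check: only null counts as "field absent".
    static String describe(String fieldValue) {
        return fieldValue == null ? "<not recorded>" : "'" + fieldValue + "'";
    }

    public static void main(String[] args) {
        // A missing field that defaults to null is reported as absent.
        System.out.println(describe(null));
        // A missing field that defaults to "" looks like a real (empty) value instead.
        System.out.println(describe(""));
    }
}
```

          Under this assumption, switching the default from null to an empty string would silently change what such callers report, which is presumably why the review asks to keep the existing behavior.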
          Rushabh S Shah added a comment -

          Jason,
          Thanks for reviewing my patch.
          Submitting a new patch incorporating all of the comments.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12645318/MAPREDUCE-5309-v3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 4 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4608//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4608//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4608//console

          This message is automatically generated.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12645344/MAPREDUCE-5309-v4.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 4 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4610//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4610//console

          This message is automatically generated.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12645344/MAPREDUCE-5309-v4.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 4 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4611//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4611//console

          This message is automatically generated.

          Vinod Kumar Vavilapalli added a comment -

          Haven't looked very carefully, but scanned through the patch and it seems reasonable. Can you post a summary of what the patch does for posterity? Tx.

          Rushabh S Shah added a comment -

          Current patch has a typo in one of the log statements.

          Rushabh S Shah added a comment -

          Initially, EventReader#reader was initialized like this:
          this.reader = new SpecificDatumReader(schema, schema);
          This assumed that the reader schema and the writer schema are the same.
          But when the schema was upgraded from 2.0.3 to 2.0.4, new fields were added that were not present in 2.0.3. When the parser tried to parse 2.0.3 logs (which don't have the new fields), it failed with errors.
          So we need to distinguish between the new schema (the reader schema) and the schema of the input jhist files (the writer schema); Avro then does the rest of the mapping by field name.
          The recently added fields need default values, so that when we parse jhist files written with the old schema, those fields are assigned their defaults.
          Vinod Kumar Vavilapalli: I hope this helps.
          Viraj Bhat: Yes, this patch will parse both 0.23.x and 2.4.x logs.
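
          The explanation above can be sketched in plain Java as a toy model of the two reading strategies. This is not the Avro API and not the code from the patch: the class, both methods, and the three field names are simplified, illustrative stand-ins. It assumes an old file that wrote two fields and a newer reader schema that adds a `counters` field between them.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of schema resolution; illustrative only, not Avro.
final class SchemaResolutionSketch {

    // Models decoding with the reader schema alone: values are paired with
    // reader fields purely by position, so an added field shifts everything.
    static Map<String, Object> readPositionally(List<Object> values, List<String> readerFields) {
        Map<String, Object> record = new LinkedHashMap<>();
        int n = Math.min(values.size(), readerFields.size());
        for (int i = 0; i < n; i++) {
            record.put(readerFields.get(i), values.get(i));
        }
        return record;
    }

    // Models decoding with both schemas: each reader field is looked up by
    // name in the writer's field list; fields the writer never wrote get
    // the reader's default.
    static Map<String, Object> resolveByName(List<String> writerFields, List<Object> values,
                                             List<String> readerFields, Map<String, Object> defaults) {
        Map<String, Object> record = new LinkedHashMap<>();
        for (String field : readerFields) {
            int writtenAt = writerFields.indexOf(field);
            record.put(field, writtenAt >= 0 ? values.get(writtenAt) : defaults.get(field));
        }
        return record;
    }

    public static void main(String[] args) {
        // Hypothetical old-format event: no "counters" field was written.
        List<String> writerFields = Arrays.asList("taskid", "clockSplits");
        List<Object> values = Arrays.<Object>asList("task_1", Arrays.asList(10, 20));

        // Hypothetical newer reader schema with "counters" added in the middle.
        List<String> readerFields = Arrays.asList("taskid", "counters", "clockSplits");
        Map<String, Object> defaults = new LinkedHashMap<>();
        defaults.put("counters", null);

        // The array lands in the "counters" slot.
        System.out.println(readPositionally(values, readerFields));
        // "counters" gets its default and the array stays in "clockSplits".
        System.out.println(resolveByName(writerFields, values, readerFields, defaults));
    }
}
```

          In this model the positional read puts an array where counters are expected, which appears consistent with the ClassCastException in the description; the by-name read leaves the new field at its default.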

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12645629/MAPREDUCE-5309-v5.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 4 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4612//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4612//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12645629/MAPREDUCE-5309-v5.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 4 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs. +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4612//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4612//console This message is automatically generated.
          Hide
          Jason Lowe added a comment -

          +1 lgtm. Committing this.

          Jason Lowe added a comment -

          Thanks, Rushabh! I committed this to trunk and branch-2.

          Rushabh S Shah added a comment -

          Thanks Jason for reviewing and committing the patch.

          Hudson added a comment -

          SUCCESS: Integrated in Hadoop-trunk-Commit #5607 (See https://builds.apache.org/job/Hadoop-trunk-Commit/5607/)
          MAPREDUCE-5309. 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server. Contributed by Rushabh S Shah (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1596295)

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/avro/Events.avpr
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/EventReader.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryParser.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/TestJobHistoryParsing.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/resources/job_0.23.9-FAILED.jhist
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/resources/job_2.0.3-alpha-FAILED.jhist
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/resources/job_2.4.0-FAILED.jhist
          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #563 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/563/)
          MAPREDUCE-5309. 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server. Contributed by Rushabh S Shah (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1596295)

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/avro/Events.avpr
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/EventReader.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryParser.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/TestJobHistoryParsing.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/resources/job_0.23.9-FAILED.jhist
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/resources/job_2.0.3-alpha-FAILED.jhist
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/resources/job_2.4.0-FAILED.jhist
          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #1781 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1781/)
          MAPREDUCE-5309. 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server. Contributed by Rushabh S Shah (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1596295)

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/avro/Events.avpr
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/EventReader.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryParser.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/TestJobHistoryParsing.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/resources/job_0.23.9-FAILED.jhist
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/resources/job_2.0.3-alpha-FAILED.jhist
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/resources/job_2.4.0-FAILED.jhist
          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #1755 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1755/)
          MAPREDUCE-5309. 2.0.4 JobHistoryParser can't parse certain failed job history files generated by 2.0.3 history server. Contributed by Rushabh S Shah (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1596295)

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/avro/Events.avpr
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/EventReader.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryParser.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/TestJobHistoryParsing.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/resources/job_0.23.9-FAILED.jhist
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/resources/job_2.0.3-alpha-FAILED.jhist
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/resources/job_2.4.0-FAILED.jhist

            People

            • Assignee:
              Rushabh S Shah
              Reporter:
              Vrushali C
            • Votes:
              0
              Watchers:
              11

              Dates

              • Created:
                Updated:
                Resolved:

                Development