Hadoop Map/Reduce
  1. Hadoop Map/Reduce
  2. MAPREDUCE-980

Modify JobHistory to use Avro for serialization instead of raw JSON

    Details

    • Type: New Feature New Feature
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None

      Description

      MAPREDUCE-157 modifies JobHistory to log events using Json Format. This can be modified to use Avro instead.

      1. MAPREDUCE-980.patch
        114 kB
        Doug Cutting
      2. MAPREDUCE-980.patch
        114 kB
        Doug Cutting
      3. MAPREDUCE-980.patch
        114 kB
        Doug Cutting
      4. MAPREDUCE-980.patch
        111 kB
        Doug Cutting
      5. MAPREDUCE-980.patch
        111 kB
        Doug Cutting
      6. MAPREDUCE-980.patch
        111 kB
        Doug Cutting
      7. MAPREDUCE-980.patch
        116 kB
        Doug Cutting
      8. MAPREDUCE-980.patch
        118 kB
        Doug Cutting
      9. MAPREDUCE-980.patch
        118 kB
        Doug Cutting
      10. MAPREDUCE-980.patch
        118 kB
        Doug Cutting

        Issue Links

          Activity

          Hide
          Jothi Padmanabhan added a comment -

          As per this comment , I created this new Jira to follow the port to use Avro

          Show
          Jothi Padmanabhan added a comment - As per this comment , I created this new Jira to follow the port to use Avro
          Hide
          Doug Cutting added a comment -

          Here's a first pass at this. It must be applied after the MAPREDUCE-157 patch. It compiles but has not yet been tested.

          Show
          Doug Cutting added a comment - Here's a first pass at this. It must be applied after the MAPREDUCE-157 patch. It compiles but has not yet been tested.
          Hide
          Doug Cutting added a comment -

          Here's a version that passes tests.

          I will mark this as "Patch Available" as soon as MAPREDUCE-157 is committed so that Hudson can have a look at it.

          Show
          Doug Cutting added a comment - Here's a version that passes tests. I will mark this as "Patch Available" as soon as MAPREDUCE-157 is committed so that Hudson can have a look at it.
          Hide
          Doug Cutting added a comment -

          Restore some dropped javadoc and muffed visibility.

          Show
          Doug Cutting added a comment - Restore some dropped javadoc and muffed visibility.
          Hide
          Sharad Agarwal added a comment -

          It would have been ideal if we could directly use Avro generated classes without wrapping them. Wrapper classes are not very maintainable because we need to modify at two places- schema definition and wrapper class. Any reason why we can't directly use generated classes ? From what I can think of - this is done because we want all event classes to have a base interface, constructor and field getters. Having constructor and getters should be straight forward in code generator. For base class/interface, I think Avro can generate code from a template. SpecificRecordBase methods can directly go into the generated class (to work around multiple inheritance in java). Users can define the base class, interfaces or additional methods in the template which can be used to generate Avro specific class. I understand that this may not be doable at this point but something worth considering at some point to make Avro code generation feature more compelling.

          Show
          Sharad Agarwal added a comment - It would have been ideal if we could directly use Avro generated classes without wrapping them. Wrapper classes are not very maintainable because we need to modify at two places- schema definition and wrapper class. Any reason why we can't directly use generated classes ? From what I can think of - this is done because we want all event classes to have a base interface, constructor and field getters. Having constructor and getters should be straight forward in code generator. For base class/interface, I think Avro can generate code from a template. SpecificRecordBase methods can directly go into the generated class (to work around multiple inheritance in java). Users can define the base class, interfaces or additional methods in the template which can be used to generate Avro specific class. I understand that this may not be doable at this point but something worth considering at some point to make Avro code generation feature more compelling.
          Hide
          Doug Cutting added a comment -

          Proper 'svn diff' version of patch now that MAPREDUCE-157 is committed.

          Show
          Doug Cutting added a comment - Proper 'svn diff' version of patch now that MAPREDUCE-157 is committed.
          Hide
          Doug Cutting added a comment -

          > Any reason why we can't directly use generated classes ?

          You've already cited the biggest reason: the generated classes don't provide constructors or accessors. Long-term, we could enhance Avro to generate these, but I'm not sure we'd want to directly use the generated classes even then.

          The wrappers provide considerable utility, including:

          • Javadoc comments. We could generate these perhaps from documentation in the schema.
          • Visibility: The wrappers only provide public getters, not setters. We could perhaps add that to the schema and/or generator.
          • Type conversion: In both the version included in MAPREDUCE-157 and this version there's a fair amount of field-specific type conversion. For example, we don't directly serialize JobID instances, but rather use JobID's toString() and forName() methods to convert these to and from strings for serialization. Similarly for counters, task ids, etc. Ideally all of these would be naturally serializeable using Avro, but, until they are, the wrappers make it easy to incorporate things like these.
          • Compatibility: If we update the schema then Avro will handle reading old data, but, without the wrappers, we'd be unable to provide a back-compatible API for accessing the old data. So if we remove a field from the schema, with the wrappers we're able to deprecate the accessor and implement it in terms of new/remaining fields so that applications don't have to be upgraded.

          So I'm not entirely convinced that using wrappers for stuff like this is a bad pattern long term.

          Show
          Doug Cutting added a comment - > Any reason why we can't directly use generated classes ? You've already cited the biggest reason: the generated classes don't provide constructors or accessors. Long-term, we could enhance Avro to generate these, but I'm not sure we'd want to directly use the generated classes even then. The wrappers provide considerable utility, including: Javadoc comments. We could generate these perhaps from documentation in the schema. Visibility: The wrappers only provide public getters, not setters. We could perhaps add that to the schema and/or generator. Type conversion: In both the version included in MAPREDUCE-157 and this version there's a fair amount of field-specific type conversion. For example, we don't directly serialize JobID instances, but rather use JobID's toString() and forName() methods to convert these to and from strings for serialization. Similarly for counters, task ids, etc. Ideally all of these would be naturally serializeable using Avro, but, until they are, the wrappers make it easy to incorporate things like these. Compatibility: If we update the schema then Avro will handle reading old data, but, without the wrappers, we'd be unable to provide a back-compatible API for accessing the old data. So if we remove a field from the schema, with the wrappers we're able to deprecate the accessor and implement it in terms of new/remaining fields so that applications don't have to be upgraded. So I'm not entirely convinced that using wrappers for stuff like this is a bad pattern long term.
          Hide
          Philip Zeyliger added a comment -

          My experience with generated objects (from a couple of years using protocol buffers) is that one ends up wrapping them often (preferably with composition).

          The generated class is responsible for serialization and deserialization, and the wrapper class is responsible for added logic. It's hard to make the generator do something reasonable for logic (or even inheritance) cross-language. Having a wrapper also allows you to have two ways to use something, in two different contexts, where you might want different surrounding logic. (So, if you had an Avro schema for an Event, the code that generates the Event might use one wrapper, and the code that consumes it might use the raw object, or have a different object.)

          Show
          Philip Zeyliger added a comment - My experience with generated objects (from a couple of years using protocol buffers) is that one ends up wrapping them often (preferably with composition). The generated class is responsible for serialization and deserialization, and the wrapper class is responsible for added logic. It's hard to make the generator do something reasonable for logic (or even inheritance) cross-language. Having a wrapper also allows you to have two ways to use something, in two different contexts, where you might want different surrounding logic. (So, if you had an Avro schema for an Event, the code that generates the Event might use one wrapper, and the code that consumes it might use the raw object, or have a different object.)
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12419913/MAPREDUCE-980.patch
          against trunk revision 816454.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/101/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/101/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/101/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/101/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12419913/MAPREDUCE-980.patch against trunk revision 816454. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/101/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/101/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/101/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/101/console This message is automatically generated.
          Hide
          Doug Cutting added a comment -

          Fixed a possible null pointer exception.

          Show
          Doug Cutting added a comment - Fixed a possible null pointer exception.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12420023/MAPREDUCE-980.patch
          against trunk revision 816647.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The patch appears to cause tar ant target to fail.

          -1 findbugs. The patch appears to cause Findbugs to fail.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/52/testReport/
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/52/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/52/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12420023/MAPREDUCE-980.patch against trunk revision 816647. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The patch appears to cause tar ant target to fail. -1 findbugs. The patch appears to cause Findbugs to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/52/testReport/ Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/52/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/52/console This message is automatically generated.
          Hide
          Doug Cutting added a comment -

          Found a bunch more Ivy references to Avro 1.0 that need to be updated to 1.1.

          Show
          Doug Cutting added a comment - Found a bunch more Ivy references to Avro 1.0 that need to be updated to 1.1.
          Hide
          Doug Cutting added a comment -

          Fixed all ivy.xml files to now refer to Avro 1.0 rather than 1.1. Avro, Jackson and Paranamer versions are now specified in library.properties, so that this should not occur again.

          'ant test-patch' reports:

               [exec] +1 overall.  
               [exec] 
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec] 
               [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
               [exec]                         Please justify why no new tests are needed for this patch.
               [exec]                         Also please list what manual steps were performed to verify this patch.
               [exec] 
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
               [exec] 
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec] 
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
               [exec] 
               [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
          

          No new tests are required, as MAPREDUCE-157 supplied sufficient tests.

          Show
          Doug Cutting added a comment - Fixed all ivy.xml files to now refer to Avro 1.0 rather than 1.1. Avro, Jackson and Paranamer versions are now specified in library.properties, so that this should not occur again. 'ant test-patch' reports: [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] -1 tests included. The patch doesn't appear to include any new or modified tests. [exec] Please justify why no new tests are needed for this patch. [exec] Also please list what manual steps were performed to verify this patch. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. No new tests are required, as MAPREDUCE-157 supplied sufficient tests.
          Hide
          Jothi Padmanabhan added a comment -

          MAPREDUCE-277 will conflict with this patch – I added two more methods to the JobSubmittedEvent in that patch. Depending on which goes first, the other will have to merge.

          Show
          Jothi Padmanabhan added a comment - MAPREDUCE-277 will conflict with this patch – I added two more methods to the JobSubmittedEvent in that patch. Depending on which goes first, the other will have to merge.
          Hide
          Jothi Padmanabhan added a comment -

          I added two more methods to the JobSubmittedEvent

          Sorry, I meant two more arguments to the JobSubmittedEvent constructor.

          Show
          Jothi Padmanabhan added a comment - I added two more methods to the JobSubmittedEvent Sorry, I meant two more arguments to the JobSubmittedEvent constructor.
          Hide
          Doug Cutting added a comment -

          > MAPREDUCE-277 will conflict with this patch

          I'm happy to do the merge regardless of which is committed first.

          Since MAPREDUCE-277 is a blocker, my preference would be to commit this issue first, then that one, since it is not subject to today's deadline.

          I still need a committer to +1 this.

          Show
          Doug Cutting added a comment - > MAPREDUCE-277 will conflict with this patch I'm happy to do the merge regardless of which is committed first. Since MAPREDUCE-277 is a blocker, my preference would be to commit this issue first, then that one, since it is not subject to today's deadline. I still need a committer to +1 this.
          Hide
          Jothi Padmanabhan added a comment -

          my preference would be to commit this issue first, then that one, since it is not subject to today's deadline.

          Sorry, just saw this. In the meanwhile, MAPREDUCE-277 got committed.

          Show
          Jothi Padmanabhan added a comment - my preference would be to commit this issue first, then that one, since it is not subject to today's deadline. Sorry, just saw this. In the meanwhile, MAPREDUCE-277 got committed.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12420034/MAPREDUCE-980.patch
          against trunk revision 816664.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/53/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/53/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/53/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/53/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12420034/MAPREDUCE-980.patch against trunk revision 816664. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/53/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/53/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/53/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/53/console This message is automatically generated.
          Hide
          Doug Cutting added a comment -

          Cancelling to resolve conflicts created by HADOOP-277.

          Show
          Doug Cutting added a comment - Cancelling to resolve conflicts created by HADOOP-277 .
          Hide
          Doug Cutting added a comment -

          Merged with MAPREDUCE-277.

          Show
          Doug Cutting added a comment - Merged with MAPREDUCE-277 .
          Hide
          Doug Cutting added a comment -

          'ant test-patch' on current patch:

               [exec] +1 overall.  
               [exec] 
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec] 
               [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
               [exec]                         Please justify why no new tests are needed for this patch.
               [exec]                         Also please list what manual steps were performed to verify this patch.
               [exec] 
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
               [exec] 
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec] 
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
               [exec] 
               [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
          
          Show
          Doug Cutting added a comment - 'ant test-patch' on current patch: [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] -1 tests included. The patch doesn't appear to include any new or modified tests. [exec] Please justify why no new tests are needed for this patch. [exec] Also please list what manual steps were performed to verify this patch. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
          Hide
          Jothi Padmanabhan added a comment -

          One a quick glance, minor nit – Don't we need Javadocs for the newly introduced public constructors in Counter and CounterGroup?

          Show
          Jothi Padmanabhan added a comment - One a quick glance, minor nit – Don't we need Javadocs for the newly introduced public constructors in Counter and CounterGroup?
          Hide
          Doug Cutting added a comment -

          > Don't we need Javadocs for the newly introduced public constructors in Counter and CounterGroup?

          Added.

          Show
          Doug Cutting added a comment - > Don't we need Javadocs for the newly introduced public constructors in Counter and CounterGroup? Added.
          Hide
          Doug Cutting added a comment -

          Improved javadoc a bit.

          Show
          Doug Cutting added a comment - Improved javadoc a bit.
          Hide
          Jothi Padmanabhan added a comment -

          Just another clarification – since the patch is using avro1.1, should we change the encoder to json instead of binary so that tools that scrape the logs instead of using EventReaders be supported?

          Show
          Jothi Padmanabhan added a comment - Just another clarification – since the patch is using avro1.1, should we change the encoder to json instead of binary so that tools that scrape the logs instead of using EventReaders be supported?
          Hide
          Doug Cutting added a comment -

          > should we change the encoder to json [ ... ]

          That was my initial instinct too, but Eric and Owen both indicated to me that they preferred that we use binary.

          Eric's comment is:

          https://issues.apache.org/jira/browse/MAPREDUCE-157?focusedCommentId=12745279&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12745279

          Owen has indicated this in offline discussions. The idea is that one can easily use Avro to dump the binary as JSON, but that the binary is smaller and faster.

          It's a trivial change to make if we prefer JSON instead of binary.

          Show
          Doug Cutting added a comment - > should we change the encoder to json [ ... ] That was my initial instinct too, but Eric and Owen both indicated to me that they preferred that we use binary. Eric's comment is: https://issues.apache.org/jira/browse/MAPREDUCE-157?focusedCommentId=12745279&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12745279 Owen has indicated this in offline discussions. The idea is that one can easily use Avro to dump the binary as JSON, but that the binary is smaller and faster. It's a trivial change to make if we prefer JSON instead of binary.
          Hide
          Jothi Padmanabhan added a comment -

          I do not know if I am being naive here, but should the EventWriter.Version be something different than "Avro-Binary"? Something that will help us keep track of schema evolutions, like "1.0" ? Or is the version used for a different purpose?

          Show
          Jothi Padmanabhan added a comment - I do not know if I am being naive here, but should the EventWriter.Version be something different than "Avro-Binary"? Something that will help us keep track of schema evolutions, like "1.0" ? Or is the version used for a different purpose?
          Hide
          Doug Cutting added a comment -

          > should the EventWriter.Version be something different than "Avro-Binary"? Something that will help us keep track of schema evolutions

          The entire schema is included in the file. If the schema changes, Avro can still read old data. We don't need to update the file version if we, e.g., add a field. If we make such a fundamental change to the schema that Avro's automatic versioning cannot handle it, then we could change the version string to be "Avro-Binary-v2" or something. Or we could examine the schema itself to determine which version it is.

          Show
          Doug Cutting added a comment - > should the EventWriter.Version be something different than "Avro-Binary"? Something that will help us keep track of schema evolutions The entire schema is included in the file. If the schema changes, Avro can still read old data. We don't need to update the file version if we, e.g., add a field. If we make such a fundamental change to the schema that Avro's automatic versioning cannot handle it, then we could change the version string to be "Avro-Binary-v2" or something. Or we could examine the schema itself to determine which version it is.
          Hide
          Jothi Padmanabhan added a comment -

          Got it. Thanks.

          Show
          Jothi Padmanabhan added a comment - Got it. Thanks.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12420061/MAPREDUCE-980.patch
          against trunk revision 816735.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/54/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/54/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/54/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/54/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12420061/MAPREDUCE-980.patch against trunk revision 816735. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/54/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/54/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/54/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/54/console This message is automatically generated.
          Hide
          Owen O'Malley added a comment -

          This looks like a good change. I love it when we get to rip out code.

          +1

          Show
          Owen O'Malley added a comment - This looks like a good change. I love it when we get to rip out code. +1
          Hide
          Doug Cutting added a comment -

          I just committed this.

          Show
          Doug Cutting added a comment - I just committed this.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12420063/MAPREDUCE-980.patch
          against trunk revision 816782.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/116/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/116/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/116/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/116/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12420063/MAPREDUCE-980.patch against trunk revision 816782. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/116/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/116/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/116/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/116/console This message is automatically generated.
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #58 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/58/)
          . Modify JobHistory to use Avro for serialization.

          Show
          Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk-Commit #58 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/58/ ) . Modify JobHistory to use Avro for serialization.

            People

            • Assignee:
              Doug Cutting
              Reporter:
              Jothi Padmanabhan
            • Votes:
              0 Vote for this issue
              Watchers:
              11 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development