Hadoop Map/Reduce
  1. Hadoop Map/Reduce
  2. MAPREDUCE-3039

Make mapreduce use same version of avro as HBase

    Details

    • Hadoop Flags:
      Reviewed

      Description

      HBase depends on avro 1.5.3 whereas hadoop-common depends on 1.3.2.
      When building HBase on top of hadoop, this should be consistent.
      Moreover, this should be consistent between common, hdfs, and mapreduce.

      Contribs seem to have declared a dependency on avro but are not in fact depending on it.

      1. MAPREDUCE-3039-branch-0.22.patch
        12 kB
        Joep Rottinghuis
      2. MAPREDUCE-3039-branch-0.22-shv.patch
        12 kB
        Konstantin Shvachko

        Issue Links

          Activity

          Joep Rottinghuis created issue -
          Joep Rottinghuis made changes -
          Field Original Value New Value
          Link This issue is part of HADOOP-7646 [ HADOOP-7646 ]
          Joep Rottinghuis made changes -
          Summary Make hadoop-common use same version of avro as HBase Make mapreduce use same version of avro as HBase
          Hide
          Joep Rottinghuis added a comment -

          Will need somebody else with Avro experience to look at this please.
          In the current patch I have not removed the wrapping of strings (CharSequence) in JobSubmittedEvent. The same occurs in about 71 other places.
          My concern is that when we insert a Utf8 wrapped string into a map, that when other places try to do a get with a string we'll fail to find the entry. String or other CharSequence will not equal any o.a.avro.util.Utf8 defines equals:
          if (!(o instanceof Utf8)) return false;

          In short I think I need to rip out all the Utf8 wrapping going on. Right?

          Show
          Joep Rottinghuis added a comment - Will need somebody else with Avro experience to look at this please. In the current patch I have not removed the wrapping of strings (CharSequence) in JobSubmittedEvent. The same occurs in about 71 other places. My concern is that when we insert a Utf8 wrapped string into a map, that when other places try to do a get with a string we'll fail to find the entry. String or other CharSequence will not equal any o.a.avro.util.Utf8 defines equals: if (!(o instanceof Utf8)) return false; In short I think I need to rip out all the Utf8 wrapping going on. Right?
          Joep Rottinghuis made changes -
          Attachment MAPREDUCE-3039-branch-0.22.patch [ 12495172 ]
          Joep Rottinghuis made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12495172/MAPREDUCE-3039-branch-0.22.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/796//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12495172/MAPREDUCE-3039-branch-0.22.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/796//console This message is automatically generated.
          Hide
          Joep Rottinghuis added a comment -

          I think I have the answer. When compiling this on my Ubuntu 11.04 desktop with JDK 1.6.0_26, Utf8 is allowed in place of Charsequence.
          However, our jenkins server running 1.6.0_16 just spat out a bunch of compilation errors:

          compile-mapred-classes:
          [jsp-compile] 11/09/19 18:20:40 WARN compiler.TldLocationsCache: Internal Error: File /WEB-INF/web.xml not found
          [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/build.xml:380: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds
          [javac] Compiling 524 source files to /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/build/classes
          [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:87: incompatible types
          [javac] found : org.apache.avro.util.Utf8
          [javac] required: java.lang.CharSequence
          [javac] result.name = new Utf8(name);
          [javac] ^
          [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:88: incompatible types
          [javac] found : org.apache.avro.generic.GenericData.Array<org.apache.hadoop.mapreduce.jobhistory.JhCounterGroup>
          [javac] required: java.util.List<org.apache.hadoop.mapreduce.jobhistory.JhCounterGroup>
          [javac] result.groups = new GenericData.Array<JhCounterGroup>(0, GROUPS);
          [javac] ^
          [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:92: incompatible types
          [javac] found : org.apache.avro.util.Utf8
          [javac] required: java.lang.CharSequence
          [javac] g.name = new Utf8(group.getName());
          [javac] ^
          [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:93: incompatible types
          [javac] found : org.apache.avro.util.Utf8
          [javac] required: java.lang.CharSequence
          [javac] g.displayName = new Utf8(group.getDisplayName());
          [javac] ^
          [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:94: incompatible types
          [javac] found : org.apache.avro.generic.GenericData.Array<org.apache.hadoop.mapreduce.jobhistory.JhCounter>
          [javac] required: java.util.List<org.apache.hadoop.mapreduce.jobhistory.JhCounter>
          [javac] g.counts = new GenericData.Array<JhCounter>(group.size(), COUNTERS);
          [javac] ^
          [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:97: incompatible types
          [javac] found : org.apache.avro.util.Utf8
          [javac] required: java.lang.CharSequence
          [javac] c.name = new Utf8(counter.getName());
          [javac] ^
          [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:98: incompatible types
          [javac] found : org.apache.avro.util.Utf8
          [javac] required: java.lang.CharSequence
          [javac] c.displayName = new Utf8(counter.getDisplayName());
          [javac] ^
          [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/TaskAttemptFinishedEvent.java:56: incompatible types

          etc. etc.

          Will remove the UTf8 and submit new patch.

          Show
          Joep Rottinghuis added a comment - I think I have the answer. When compiling this on my Ubuntu 11.04 desktop with JDK 1.6.0_26, Utf8 is allowed in place of Charsequence. However, our jenkins server running 1.6.0_16 just spat out a bunch of compilation errors: compile-mapred-classes: [jsp-compile] 11/09/19 18:20:40 WARN compiler.TldLocationsCache: Internal Error: File /WEB-INF/web.xml not found [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/build.xml:380: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds [javac] Compiling 524 source files to /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/build/classes [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:87: incompatible types [javac] found : org.apache.avro.util.Utf8 [javac] required: java.lang.CharSequence [javac] result.name = new Utf8(name); [javac] ^ [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:88: incompatible types [javac] found : org.apache.avro.generic.GenericData.Array<org.apache.hadoop.mapreduce.jobhistory.JhCounterGroup> [javac] required: java.util.List<org.apache.hadoop.mapreduce.jobhistory.JhCounterGroup> [javac] result.groups = new GenericData.Array<JhCounterGroup>(0, GROUPS); [javac] ^ [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:92: incompatible types [javac] found : org.apache.avro.util.Utf8 [javac] required: java.lang.CharSequence [javac] g.name = new Utf8(group.getName()); [javac] ^ [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:93: incompatible types [javac] found : org.apache.avro.util.Utf8 [javac] required: java.lang.CharSequence [javac] g.displayName = new Utf8(group.getDisplayName()); [javac] ^ [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:94: incompatible types [javac] found : org.apache.avro.generic.GenericData.Array<org.apache.hadoop.mapreduce.jobhistory.JhCounter> [javac] required: java.util.List<org.apache.hadoop.mapreduce.jobhistory.JhCounter> [javac] g.counts = new GenericData.Array<JhCounter>(group.size(), COUNTERS); [javac] ^ [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:97: incompatible types [javac] found : org.apache.avro.util.Utf8 [javac] required: java.lang.CharSequence [javac] c.name = new Utf8(counter.getName()); [javac] ^ [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:98: incompatible types [javac] found : org.apache.avro.util.Utf8 [javac] required: java.lang.CharSequence [javac] c.displayName = new Utf8(counter.getDisplayName()); [javac] ^ [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/TaskAttemptFinishedEvent.java:56: incompatible types etc. etc. Will remove the UTf8 and submit new patch.
          Hide
          Joep Rottinghuis added a comment -

          Ok, I'll take that back.
          When heeding my own advice in HADOOP-7646 about clearing out the ~/.ivy2/repository cache the avro 1.3.2 jar disappeared from the classpath and the compilation error went away.

          Back to my original question:
          Should I remove the new Utf8(String...) code and put straight Strings into the map in JobSubmittedEvent?
          My concern is that the map is defined as CharSequence and we're putting Utf8's in the map. If any client tries to retrieve a key using a string, then they will not get the value back as the key of type Utf8 will not equal String or Charsequence types.

          Show
          Joep Rottinghuis added a comment - Ok, I'll take that back. When heeding my own advice in HADOOP-7646 about clearing out the ~/.ivy2/repository cache the avro 1.3.2 jar disappeared from the classpath and the compilation error went away. Back to my original question: Should I remove the new Utf8(String...) code and put straight Strings into the map in JobSubmittedEvent? My concern is that the map is defined as CharSequence and we're putting Utf8's in the map. If any client tries to retrieve a key using a string, then they will not get the value back as the key of type Utf8 will not equal String or Charsequence types.
          Hide
          Doug Cutting added a comment -

          > Should I remove the new Utf8(String...) code and put straight Strings into the map in JobSubmittedEvent?

          No. The jobAcls field is only accessed by JobSubmittedEvent. That class is consistent about using Utf8 to access the Map, so results should be correct. Intermixing String and Utf8 as keys in a Map<CharSequence,?> would likely cause problems. (This is a known issue with Avro: AVRO-803.)

          I think the patch looks good as it is. +1

          Show
          Doug Cutting added a comment - > Should I remove the new Utf8(String...) code and put straight Strings into the map in JobSubmittedEvent? No. The jobAcls field is only accessed by JobSubmittedEvent. That class is consistent about using Utf8 to access the Map, so results should be correct. Intermixing String and Utf8 as keys in a Map<CharSequence,?> would likely cause problems. (This is a known issue with Avro: AVRO-803 .) I think the patch looks good as it is. +1
          Hide
          Joep Rottinghuis added a comment -

          Thanks for clarifying and the reference to the existing issue Doug.
          That was indeed the case I had in mind.

          Show
          Joep Rottinghuis added a comment - Thanks for clarifying and the reference to the existing issue Doug. That was indeed the case I had in mind.
          Hide
          Konstantin Shvachko added a comment -

          A slight modification to avoid avro dependency on packaged jars in Hadoop.

          Show
          Konstantin Shvachko added a comment - A slight modification to avoid avro dependency on packaged jars in Hadoop.
          Konstantin Shvachko made changes -
          Attachment MAPREDUCE-3039-branch-0.22-shv.patch [ 12496835 ]
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12496835/MAPREDUCE-3039-branch-0.22-shv.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/877//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12496835/MAPREDUCE-3039-branch-0.22-shv.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/877//console This message is automatically generated.
          Hide
          Konstantin Shvachko added a comment -

          I just committed this. Thank you Joep.

          Show
          Konstantin Shvachko added a comment - I just committed this. Thank you Joep.
          Konstantin Shvachko made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags Reviewed [ 10343 ]
          Resolution Fixed [ 1 ]
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-22-branch #77 (See https://builds.apache.org/job/Hadoop-Mapreduce-22-branch/77/)
          MAPREDUCE-3039. Upgrade to Avro 1.5.3. Contributed by Joep Rottinghuis.

          shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1176694
          Files :

          • /hadoop/common/branches/branch-0.22/mapreduce/CHANGES.txt
          • /hadoop/common/branches/branch-0.22/mapreduce/build.xml
          • /hadoop/common/branches/branch-0.22/mapreduce/ivy.xml
          • /hadoop/common/branches/branch-0.22/mapreduce/ivy/libraries.properties
          • /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/capacity-scheduler/ivy.xml
          • /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/fairscheduler/ivy.xml
          • /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/gridmix/ivy.xml
          • /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/mrunit/ivy.xml
          • /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/mumak/ivy.xml
          • /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/raid/ivy.xml
          • /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/streaming/ivy.xml
          • /hadoop/common/branches/branch-0.22/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventReader.java
          • /hadoop/common/branches/branch-0.22/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java
          • /hadoop/common/branches/branch-0.22/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/JobSubmittedEvent.java
          Show
          Hudson added a comment - Integrated in Hadoop-Mapreduce-22-branch #77 (See https://builds.apache.org/job/Hadoop-Mapreduce-22-branch/77/ ) MAPREDUCE-3039 . Upgrade to Avro 1.5.3. Contributed by Joep Rottinghuis. shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1176694 Files : /hadoop/common/branches/branch-0.22/mapreduce/CHANGES.txt /hadoop/common/branches/branch-0.22/mapreduce/build.xml /hadoop/common/branches/branch-0.22/mapreduce/ivy.xml /hadoop/common/branches/branch-0.22/mapreduce/ivy/libraries.properties /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/capacity-scheduler/ivy.xml /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/fairscheduler/ivy.xml /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/gridmix/ivy.xml /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/mrunit/ivy.xml /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/mumak/ivy.xml /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/raid/ivy.xml /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/streaming/ivy.xml /hadoop/common/branches/branch-0.22/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventReader.java /hadoop/common/branches/branch-0.22/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java /hadoop/common/branches/branch-0.22/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/JobSubmittedEvent.java
          Konstantin Shvachko made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Transition Time In Source Status Execution Times Last Executer Last Execution Date
          Open Open Patch Available Patch Available
          5h 1m 1 Joep Rottinghuis 20/Sep/11 01:30
          Patch Available Patch Available Resolved Resolved
          8d 2h 54m 1 Konstantin Shvachko 28/Sep/11 04:24
          Resolved Resolved Closed Closed
          75d 2h 54m 1 Konstantin Shvachko 12/Dec/11 06:19

            People

            • Assignee:
              Joep Rottinghuis
              Reporter:
              Joep Rottinghuis
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development