Hadoop Map/Reduce
  1. Hadoop Map/Reduce
  2. MAPREDUCE-3039

Make mapreduce use same version of avro as HBase

    Details

    • Hadoop Flags:
      Reviewed

      Description

      HBase depends on avro 1.5.3 whereas hadoop-common depends on 1.3.2.
      When building HBase on top of hadoop, this should be consistent.
      Moreover, this should be consistent between common, hdfs, and mapreduce.

      Contribs seem to have declared a dependency on avro but are not in fact depending on it.

      1. MAPREDUCE-3039-branch-0.22-shv.patch
        12 kB
        Konstantin Shvachko
      2. MAPREDUCE-3039-branch-0.22.patch
        12 kB
        Joep Rottinghuis

        Issue Links

          Activity

          Hide
          Joep Rottinghuis added a comment -

          Will need somebody else with Avro experience to look at this please.
          In the current patch I have not removed the wrapping of strings (CharSequence) in JobSubmittedEvent. The same occurs in about 71 other places.
          My concern is that when we insert a Utf8 wrapped string into a map, that when other places try to do a get with a string we'll fail to find the entry. String or other CharSequence will not equal any o.a.avro.util.Utf8 defines equals:
          if (!(o instanceof Utf8)) return false;

          In short I think I need to rip out all the Utf8 wrapping going on. Right?

          Show
          Joep Rottinghuis added a comment - Will need somebody else with Avro experience to look at this please. In the current patch I have not removed the wrapping of strings (CharSequence) in JobSubmittedEvent. The same occurs in about 71 other places. My concern is that when we insert a Utf8 wrapped string into a map, that when other places try to do a get with a string we'll fail to find the entry. String or other CharSequence will not equal any o.a.avro.util.Utf8 defines equals: if (!(o instanceof Utf8)) return false; In short I think I need to rip out all the Utf8 wrapping going on. Right?
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12495172/MAPREDUCE-3039-branch-0.22.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/796//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12495172/MAPREDUCE-3039-branch-0.22.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/796//console This message is automatically generated.
          Hide
          Joep Rottinghuis added a comment -

          I think I have the answer. When compiling this on my Ubuntu 11.04 desktop with JDK 1.6.0_26, Utf8 is allowed in place of Charsequence.
          However, our jenkins server running 1.6.0_16 just spat out a bunch of compilation errors:

          compile-mapred-classes:
          [jsp-compile] 11/09/19 18:20:40 WARN compiler.TldLocationsCache: Internal Error: File /WEB-INF/web.xml not found
          [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/build.xml:380: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds
          [javac] Compiling 524 source files to /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/build/classes
          [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:87: incompatible types
          [javac] found : org.apache.avro.util.Utf8
          [javac] required: java.lang.CharSequence
          [javac] result.name = new Utf8(name);
          [javac] ^
          [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:88: incompatible types
          [javac] found : org.apache.avro.generic.GenericData.Array<org.apache.hadoop.mapreduce.jobhistory.JhCounterGroup>
          [javac] required: java.util.List<org.apache.hadoop.mapreduce.jobhistory.JhCounterGroup>
          [javac] result.groups = new GenericData.Array<JhCounterGroup>(0, GROUPS);
          [javac] ^
          [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:92: incompatible types
          [javac] found : org.apache.avro.util.Utf8
          [javac] required: java.lang.CharSequence
          [javac] g.name = new Utf8(group.getName());
          [javac] ^
          [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:93: incompatible types
          [javac] found : org.apache.avro.util.Utf8
          [javac] required: java.lang.CharSequence
          [javac] g.displayName = new Utf8(group.getDisplayName());
          [javac] ^
          [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:94: incompatible types
          [javac] found : org.apache.avro.generic.GenericData.Array<org.apache.hadoop.mapreduce.jobhistory.JhCounter>
          [javac] required: java.util.List<org.apache.hadoop.mapreduce.jobhistory.JhCounter>
          [javac] g.counts = new GenericData.Array<JhCounter>(group.size(), COUNTERS);
          [javac] ^
          [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:97: incompatible types
          [javac] found : org.apache.avro.util.Utf8
          [javac] required: java.lang.CharSequence
          [javac] c.name = new Utf8(counter.getName());
          [javac] ^
          [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:98: incompatible types
          [javac] found : org.apache.avro.util.Utf8
          [javac] required: java.lang.CharSequence
          [javac] c.displayName = new Utf8(counter.getDisplayName());
          [javac] ^
          [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/TaskAttemptFinishedEvent.java:56: incompatible types

          etc. etc.

          Will remove the UTf8 and submit new patch.

          Show
          Joep Rottinghuis added a comment - I think I have the answer. When compiling this on my Ubuntu 11.04 desktop with JDK 1.6.0_26, Utf8 is allowed in place of Charsequence. However, our jenkins server running 1.6.0_16 just spat out a bunch of compilation errors: compile-mapred-classes: [jsp-compile] 11/09/19 18:20:40 WARN compiler.TldLocationsCache: Internal Error: File /WEB-INF/web.xml not found [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/build.xml:380: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds [javac] Compiling 524 source files to /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/build/classes [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:87: incompatible types [javac] found : org.apache.avro.util.Utf8 [javac] required: java.lang.CharSequence [javac] result.name = new Utf8(name); [javac] ^ [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:88: incompatible types [javac] found : org.apache.avro.generic.GenericData.Array<org.apache.hadoop.mapreduce.jobhistory.JhCounterGroup> [javac] required: java.util.List<org.apache.hadoop.mapreduce.jobhistory.JhCounterGroup> [javac] result.groups = new GenericData.Array<JhCounterGroup>(0, GROUPS); [javac] ^ [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:92: incompatible types [javac] found : org.apache.avro.util.Utf8 [javac] required: java.lang.CharSequence [javac] g.name = new Utf8(group.getName()); [javac] ^ [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:93: incompatible types [javac] found : org.apache.avro.util.Utf8 [javac] required: java.lang.CharSequence [javac] g.displayName = new Utf8(group.getDisplayName()); [javac] ^ [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:94: incompatible types [javac] found : org.apache.avro.generic.GenericData.Array<org.apache.hadoop.mapreduce.jobhistory.JhCounter> [javac] required: java.util.List<org.apache.hadoop.mapreduce.jobhistory.JhCounter> [javac] g.counts = new GenericData.Array<JhCounter>(group.size(), COUNTERS); [javac] ^ [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:97: incompatible types [javac] found : org.apache.avro.util.Utf8 [javac] required: java.lang.CharSequence [javac] c.name = new Utf8(counter.getName()); [javac] ^ [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java:98: incompatible types [javac] found : org.apache.avro.util.Utf8 [javac] required: java.lang.CharSequence [javac] c.displayName = new Utf8(counter.getDisplayName()); [javac] ^ [javac] /hadoop01/jenkins/jobs/hadoop-mapreduce-SNAPSHOT/workspace/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/TaskAttemptFinishedEvent.java:56: incompatible types etc. etc. Will remove the UTf8 and submit new patch.
          Hide
          Joep Rottinghuis added a comment -

          Ok, I'll take that back.
          When heeding my own advice in HADOOP-7646 about clearing out the ~/.ivy2/repository cache the avro 1.3.2 jar disappeared from the classpath and the compilation error went away.

          Back to my original question:
          Should I remove the new Utf8(String...) code and put straight Strings into the map in JobSubmittedEvent?
          My concern is that the map is defined as CharSequence and we're putting Utf8's in the map. If any client tries to retrieve a key using a string, then they will not get the value back as the key of type Utf8 will not equal String or Charsequence types.

          Show
          Joep Rottinghuis added a comment - Ok, I'll take that back. When heeding my own advice in HADOOP-7646 about clearing out the ~/.ivy2/repository cache the avro 1.3.2 jar disappeared from the classpath and the compilation error went away. Back to my original question: Should I remove the new Utf8(String...) code and put straight Strings into the map in JobSubmittedEvent? My concern is that the map is defined as CharSequence and we're putting Utf8's in the map. If any client tries to retrieve a key using a string, then they will not get the value back as the key of type Utf8 will not equal String or Charsequence types.
          Hide
          Doug Cutting added a comment -

          > Should I remove the new Utf8(String...) code and put straight Strings into the map in JobSubmittedEvent?

          No. The jobAcls field is only accessed by JobSubmittedEvent. That class is consistent about using Utf8 to access the Map, so results should be correct. Intermixing String and Utf8 as keys in a Map<CharSequence,?> would likely cause problems. (This is a known issue with Avro: AVRO-803.)

          I think the patch looks good as it is. +1

          Show
          Doug Cutting added a comment - > Should I remove the new Utf8(String...) code and put straight Strings into the map in JobSubmittedEvent? No. The jobAcls field is only accessed by JobSubmittedEvent. That class is consistent about using Utf8 to access the Map, so results should be correct. Intermixing String and Utf8 as keys in a Map<CharSequence,?> would likely cause problems. (This is a known issue with Avro: AVRO-803 .) I think the patch looks good as it is. +1
          Hide
          Joep Rottinghuis added a comment -

          Thanks for clarifying and the reference to the existing issue Doug.
          That was indeed the case I had in mind.

          Show
          Joep Rottinghuis added a comment - Thanks for clarifying and the reference to the existing issue Doug. That was indeed the case I had in mind.
          Hide
          Konstantin Shvachko added a comment -

          A slight modification to avoid avro dependency on packaged jars in Hadoop.

          Show
          Konstantin Shvachko added a comment - A slight modification to avoid avro dependency on packaged jars in Hadoop.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12496835/MAPREDUCE-3039-branch-0.22-shv.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/877//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12496835/MAPREDUCE-3039-branch-0.22-shv.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/877//console This message is automatically generated.
          Hide
          Konstantin Shvachko added a comment -

          I just committed this. Thank you Joep.

          Show
          Konstantin Shvachko added a comment - I just committed this. Thank you Joep.
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-22-branch #77 (See https://builds.apache.org/job/Hadoop-Mapreduce-22-branch/77/)
          MAPREDUCE-3039. Upgrade to Avro 1.5.3. Contributed by Joep Rottinghuis.

          shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1176694
          Files :

          • /hadoop/common/branches/branch-0.22/mapreduce/CHANGES.txt
          • /hadoop/common/branches/branch-0.22/mapreduce/build.xml
          • /hadoop/common/branches/branch-0.22/mapreduce/ivy.xml
          • /hadoop/common/branches/branch-0.22/mapreduce/ivy/libraries.properties
          • /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/capacity-scheduler/ivy.xml
          • /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/fairscheduler/ivy.xml
          • /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/gridmix/ivy.xml
          • /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/mrunit/ivy.xml
          • /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/mumak/ivy.xml
          • /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/raid/ivy.xml
          • /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/streaming/ivy.xml
          • /hadoop/common/branches/branch-0.22/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventReader.java
          • /hadoop/common/branches/branch-0.22/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java
          • /hadoop/common/branches/branch-0.22/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/JobSubmittedEvent.java
          Show
          Hudson added a comment - Integrated in Hadoop-Mapreduce-22-branch #77 (See https://builds.apache.org/job/Hadoop-Mapreduce-22-branch/77/ ) MAPREDUCE-3039 . Upgrade to Avro 1.5.3. Contributed by Joep Rottinghuis. shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1176694 Files : /hadoop/common/branches/branch-0.22/mapreduce/CHANGES.txt /hadoop/common/branches/branch-0.22/mapreduce/build.xml /hadoop/common/branches/branch-0.22/mapreduce/ivy.xml /hadoop/common/branches/branch-0.22/mapreduce/ivy/libraries.properties /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/capacity-scheduler/ivy.xml /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/fairscheduler/ivy.xml /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/gridmix/ivy.xml /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/mrunit/ivy.xml /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/mumak/ivy.xml /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/raid/ivy.xml /hadoop/common/branches/branch-0.22/mapreduce/src/contrib/streaming/ivy.xml /hadoop/common/branches/branch-0.22/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventReader.java /hadoop/common/branches/branch-0.22/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/EventWriter.java /hadoop/common/branches/branch-0.22/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/JobSubmittedEvent.java

            People

            • Assignee:
              Joep Rottinghuis
              Reporter:
              Joep Rottinghuis
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development