Sqoop / SQOOP-1359

Fix avro versions in Sqoop to stop shipping hadoop1 jars with hadoop2

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.4, 1.4.5
    • Fix Version/s: 1.4.5
    • Component/s: None
    • Labels:
      None

      Description

      AVRO-1170 fixed avro-mapred for hadoop2 and we now have avro-mapred hadoop1 and hadoop2 specific jars. Our dependency should be updated to use the right jars. I recently found similar issues with Hive shipping avro-mapred hadoop1 jars (HIVE-7240) causing issues in certain integration scenarios.

      1. SQOOP-1359-2.patch
        6 kB
        Venkat Ranganathan
      2. SQOOP-1359.patch
        6 kB
        Venkat Ranganathan

          Activity

          Hudson added a comment -

          FAILURE: Integrated in Sqoop-ant-jdk-1.6-hadoop20 #901 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop20/901/)
          SQOOP-1359: Fix avro versions in Sqoop to stop shipping hadoop1 jars with hadoop2 (jarcec: https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=1d2454d4830756b6c321cacfb35e0026ba38ea3c)

          • ivy.xml
          • build.xml
          • ivy/libraries.properties
          Hudson added a comment -

          FAILURE: Integrated in Sqoop-ant-jdk-1.6-hadoop23 #1104 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop23/1104/)
          SQOOP-1359: Fix avro versions in Sqoop to stop shipping hadoop1 jars with hadoop2 (jarcec: https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=1d2454d4830756b6c321cacfb35e0026ba38ea3c)

          • build.xml
          • ivy.xml
          • ivy/libraries.properties
          Hudson added a comment -

          FAILURE: Integrated in Sqoop-ant-jdk-1.6-hadoop200 #907 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop200/907/)
          SQOOP-1359: Fix avro versions in Sqoop to stop shipping hadoop1 jars with hadoop2 (jarcec: https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=1d2454d4830756b6c321cacfb35e0026ba38ea3c)

          • ivy/libraries.properties
          • build.xml
          • ivy.xml
          Hudson added a comment -

          FAILURE: Integrated in Sqoop-ant-jdk-1.6-hadoop100 #866 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop100/866/)
          SQOOP-1359: Fix avro versions in Sqoop to stop shipping hadoop1 jars with hadoop2 (jarcec: https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=1d2454d4830756b6c321cacfb35e0026ba38ea3c)

          • ivy/libraries.properties
          • build.xml
          • ivy.xml
          Jarek Jarcec Cecho added a comment -

          Thank you for the patch [~nrv]!

          ASF subversion and git services added a comment -

          Commit 1d2454d4830756b6c321cacfb35e0026ba38ea3c in sqoop's branch refs/heads/trunk from Jarek Jarcec Cecho
          [ https://git-wip-us.apache.org/repos/asf?p=sqoop.git;h=1d2454d ]

          SQOOP-1359: Fix avro versions in Sqoop to stop shipping hadoop1 jars with hadoop2

          (Venkat Ranganathan via Jarek Jarcec Cecho)

          Venkat Ranganathan added a comment -

          Created SQOOP-1398 for the ivy version change

          Venkat Ranganathan added a comment -

          Good point. I left it while experimenting with ivy versions. I will create a separate JIRA

          Jarek Jarcec Cecho added a comment -
          -ivy.version=2.1.0
          +ivy.version=2.3.0
          

          I know that it's a two-line change, but considering that Ivy is a crucial part of the build process, can we make the upgrade in a separate JIRA, so that it's obvious why and when we changed it?

          Venkat Ranganathan added a comment -

          Hi Jarek Jarcec Cecho
          I tried the latest revision of avro, 1.7.6, but it has some changes that break the Sqoop build (we can fix Sqoop to get over it, though). Either 1.7.5 or 1.7.4 should be OK from Sqoop's point of view.

          I am uploading a patch to the RB that uses classifiers for hadoop2 and hadoop1 with version 1.7.5.

          Please review to see if this is something that is good to proceed with.
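
For readers unfamiliar with classifiers, the following is a minimal sketch of how a classifier changes the jar file name in the standard Maven repository layout. The helper name `maven_jar_path` is hypothetical and is not part of the Sqoop patch; the coordinates are the avro-mapred 1.7.5 ones discussed in this comment.

```python
from typing import Optional


def maven_jar_path(group_id: str, artifact_id: str, version: str,
                   classifier: Optional[str] = None) -> str:
    """Return the repository-relative jar path in the standard Maven layout.

    Hypothetical helper for illustration only.
    """
    # The classifier, when present, is appended after the version.
    suffix = f"-{classifier}" if classifier else ""
    return "/".join([
        group_id.replace(".", "/"),  # dots in the group id become directories
        artifact_id,
        version,
        f"{artifact_id}-{version}{suffix}.jar",
    ])


# The unclassified jar and the hadoop2 jar are distinct files in the same directory.
print(maven_jar_path("org.apache.avro", "avro-mapred", "1.7.5"))
print(maven_jar_path("org.apache.avro", "avro-mapred", "1.7.5", "hadoop2"))
```

Because both files share one version directory, a resolver that only checks the module and version can mistake the unclassified jar for the classified one.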

          Jarek Jarcec Cecho added a comment -

          I think that doing some cleanups during the build is acceptable, albeit annoying. What version to ship is a good question. Do you know if the APIs that we're using have changed between 1.7.4 and 1.7.5? If not, then I would suggest using the highest one.

          Venkat Ranganathan added a comment -

          This is getting tricky from the build point of view. Some projects do not use the hadoop1/hadoop2 classifier (and do not depend on the hadoop1/hadoop2-specific classes in avro-mapred.jar), so it is quite likely that an unclassified avro-mapred.jar is already in your ivy and m2 caches. I added the classifier for hadoop2 builds, but the build fails when the cached artifacts in the local ivy and m2 caches do not have the right classifier. We have to clean the cached avro-mapred jars from the m2 and ivy caches to make the Sqoop build succeed, and this may affect building other components. I tried updating ivy to the latest release, 2.3.0, but ivy still does not download the jar with the hadoop2 classifier if it finds one without the classifier in the m2 cache.

          Also, looking at Sqoop, we don't really depend on the hadoop1/hadoop2-dependent classes in avro-mapred (those that deal with TaskAttemptContext).

          I think we have two options:
          Just update the avro version without specifying classifiers (which avro version to use is then the question: 1.7.4 is used by hadoop and 1.7.5 is used by hive, and there are some incompatibilities between the versions).

          Or recommend that users nuke the cached avro-mapred jars before building sqoop and any other avro-mapred dependent builds.

          Thoughts?
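
As a rough illustration of the second option, the sketch below lists cached avro-mapred jars that lack a given classifier, so they can be reviewed before anything is deleted. The function names are hypothetical, and the cache locations mentioned afterwards are only the usual Ivy and Maven defaults; a build may be configured to use other directories.

```python
from pathlib import Path
from typing import Iterable, List


def find_cached_avro_mapred(cache_roots: Iterable[Path]) -> List[Path]:
    """List avro-mapred jars found under the given cache roots (hypothetical helper)."""
    found: List[Path] = []
    for root in cache_roots:
        # Skip cache directories that do not exist on this machine.
        if root.is_dir():
            found.extend(sorted(root.rglob("avro-mapred-*.jar")))
    return found


def lacks_classifier(jar: Path, classifier: str) -> bool:
    """True if the jar file name does not end with '-<classifier>.jar'."""
    return not jar.stem.endswith(f"-{classifier}")
```

Assuming the default locations, one might call `find_cached_avro_mapred([Path.home() / ".ivy2" / "cache", Path.home() / ".m2" / "repository"])` and filter the result with `lacks_classifier(jar, "hadoop2")`. The sketch only reports candidates and deliberately deletes nothing, since removing cached jars may affect other builds that share the caches.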


            People

            • Assignee: Venkat Ranganathan
            • Reporter: Venkat Ranganathan
            • Votes: 0
            • Watchers: 4
