Dependency management with hadoop core using either maven or ivy

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.3.0
    • Component/s: Build Infrastructure
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      We need to move from pre packaging jars to managing external dependencies with hadoop core (and later other packaged jars) using maven or ivy.

      1. patch-98_2.txt
        165 kB
        Ashish Thusoo
      2. patch-98_3.txt
        192 kB
        Ashish Thusoo
      3. patch-98.txt
        8 kB
        Ashish Thusoo

        Activity

        Ashish Thusoo added a comment -

        Initial patch to integrate hive with hadoop dependencies using Ivy.

        Although this patch does not show it, with this change we can eliminate the need to maintain hadoopcore within the hive source tree.

        I still need to add some fixes so that this compiles seamlessly with hadoop-0.17 and hadoop-0.18. I will also add the license text to the new files.

        Thoughts?

        Raghotham Murthy added a comment -

        It looks like the patch was not generated from the hive top-level directory; I had to use patch -p1 to apply it. The ivy directory is also missing.

        Ashish Thusoo added a comment -

        Attaching a new file. I did not use the correct command with git to generate the diff.

        Ashish Thusoo added a comment -

        Added the following things:

        1. Ivy dependencies on hadoop, so that we do not have to package the hadoop jars, bin and conf files along with hive. With Ivy, the requested hadoop version is downloaded automatically. To compile against, say, 0.18.0, specify

        ant -Dhadoop.version="0.18.0"

        Currently the mirror used for downloads is
        http://archive.apache.org/dist

        This can also be configured using -Dhadoop.mirror="url"

        2. Added ant-based preprocessing so that we can exclude code that is incompatible with a particular version of hadoop. The only file that uses this right now is HiveInputFormat.java, which excludes the validateInput method when compiling with 0.19.*. The exclusion macros are defined in ql/build.xml. We may later move them to build-common if conditional compilation turns up in other parts of the hive code; for now it was convenient to reuse the gen-java directories in ql/build.xml to store the preprocessed code.

        3. Fixed certain ordering-sensitive tests so that they do not depend on the differing behavior of the reducer merge algorithm when the reduce key is random.

        4. Another minor change is in HiveConf.java, where the hadoop conf directory is now also picked up relative to HADOOP_HOME.
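
As an illustration of the preprocessing idea in item 2, here is a minimal Python sketch of version-conditional exclusion. It is not the ant macro from ql/build.xml: the marker comments (//#exclude-if, //#end-exclude) and the function name preprocess are invented for this example, and the real patch may mark excluded regions differently.

```python
import re

# Hypothetical markers, invented for this sketch:
#   //#exclude-if 0.19
#   ...lines dropped when hadoop.version is 0.19 or 0.19.x...
#   //#end-exclude
BEGIN = re.compile(r"^\s*//#exclude-if\s+(\S+)\s*$")
END = re.compile(r"^\s*//#end-exclude\s*$")


def preprocess(source: str, hadoop_version: str) -> str:
    """Return the source with regions excluded for this hadoop version removed.

    Marker lines are always dropped; the lines between them are dropped
    only when hadoop_version matches the prefix named on the begin marker.
    """
    kept = []
    in_region = False
    skipping = False
    for line in source.splitlines():
        begin = BEGIN.match(line)
        if begin and not in_region:
            in_region = True
            prefix = begin.group(1)
            skipping = (hadoop_version == prefix
                        or hadoop_version.startswith(prefix + "."))
            continue
        if in_region and END.match(line):
            in_region = False
            skipping = False
            continue
        if not skipping:
            kept.append(line)
    if in_region:
        raise ValueError("unterminated exclude region")
    return "\n".join(kept) + "\n"


# Toy input standing in for a file such as HiveInputFormat.java.
EXAMPLE = """\
class Example {
  //#exclude-if 0.19
  void validateInput() {}
  //#end-exclude
  void other() {}
}
"""

if __name__ == "__main__":
    print(preprocess(EXAMPLE, "0.19.0"))
```

With this sketch, preprocess(EXAMPLE, "0.19.0") drops the validateInput line, while preprocess(EXAMPLE, "0.18.0") keeps it; the preprocessed text would then be written to a generated-source directory and compiled from there.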

        Open issues:
        With 0.17 and 0.18 the following tests still fail: input16_cc.q, input16.q and input3.q. I will open a separate JIRA to address them, as the failures are related to how we discover user-defined serdes and user-defined functions from aux.jars.

        Once those are fixed, we can add another ant target (test-long) so that a change can be regression-tested against all versions of hadoop. This can be run in Hudson to validate submitted patches.
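
To make the test-long idea concrete, the following Python sketch builds one ant invocation per hadoop version using the -Dhadoop.version and -Dhadoop.mirror properties described above. The version list, the "test" target name and the helper names are assumptions made for this example; the sketch only constructs the command lines and does not run them.

```python
# Assumed version list; 0.19.0 in particular is a guess at a full 0.19.* version.
HADOOP_VERSIONS = ["0.17.0", "0.18.0", "0.19.0"]


def ant_command(version, target="test", mirror=None):
    """Build the argument list for one ant run against a given hadoop version."""
    cmd = ["ant", f"-Dhadoop.version={version}"]
    if mirror is not None:
        # Optional override of the download mirror.
        cmd.append(f"-Dhadoop.mirror={mirror}")
    cmd.append(target)
    return cmd


def build_matrix(versions=HADOOP_VERSIONS, target="test"):
    """Return one ant command per hadoop version, in the order given."""
    return [ant_command(v, target=target) for v in versions]


if __name__ == "__main__":
    for cmd in build_matrix():
        print(" ".join(cmd))
```

A continuous-integration job could run each command in turn and fail the build if any of them exits with a non-zero status.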

        Zheng Shao added a comment -

        Committed revision 724473.

        Zheng Shao added a comment -

        HIVE-98. Dependency management with hadoop core using ivy. (Ashish Thusoo through zshao)

        Joydeep Sen Sarma added a comment -

        [ivy:retrieve] :: Ivy 2.0.0-rc2 - 20081028224207 :: http://ant.apache.org/ivy/ ::
        :: loading settings :: file = /mnt/vol/devrs004.snc1/jssarma/projects/hive-129/ivy/ivysettings.xml
        [ivy:retrieve] :: resolving dependencies :: org.apache.hadoop.hive#common;working@devrs004.snc1.facebook.com
        [ivy:retrieve] confs: [default]
        [ivy:retrieve] :: resolution report :: resolve 166ms :: artifacts dl 0ms
        ---------------------------------------------------------------------
        |                  |            modules            ||   artifacts   |
        |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
        ---------------------------------------------------------------------
        |      default     |   1   |   0   |   0   |   0   ||   0   |   0   |
        ---------------------------------------------------------------------

        [ivy:retrieve] :: problems summary ::
        [ivy:retrieve] :::: WARNINGS
        [ivy:retrieve] module not found: hadoop#core;0.17
        [ivy:retrieve] ==== hadoop-resolver: tried
        [ivy:retrieve] -- artifact hadoop#core;0.17!hadoop.tar.gz(source):
        [ivy:retrieve] http://archive.apache.org/dist/hadoop/core/hadoop-0.17/hadoop-0.17.tar.gz
        [ivy:retrieve] ::::::::::::::::::::::::::::::::::::::::::::::
        [ivy:retrieve] :: UNRESOLVED DEPENDENCIES ::
        [ivy:retrieve] ::::::::::::::::::::::::::::::::::::::::::::::
        [ivy:retrieve] :: hadoop#core;0.17: not found
        [ivy:retrieve] ::::::::::::::::::::::::::::::::::::::::::::::
        [ivy:retrieve]
        [ivy:retrieve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS

        BUILD FAILED
        /mnt/vol/devrs004.snc1/jssarma/projects/hive-129/build.xml:80: The following error occurred while executing this line:
        /mnt/vol/devrs004.snc1/jssarma/projects/hive-129/build-common.xml:82: impossible to resolve dependencies:
        resolve failed - see output for details

        Total time: 2 seconds

        Ashish Thusoo added a comment -

        You have to say

        ant -Dhadoop.version="0.17.0"

        instead of just

        ant -Dhadoop.version="0.17"

        That is, you have to specify the complete hadoop version.
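
The failure in the log above follows from how the download location is derived from the version string. The Python sketch below reproduces the URL that the resolver reported trying; the URL pattern is inferred from that single log line, and the function names and the three-part version check are assumptions made for illustration, not part of the hive build.

```python
import re

# Default mirror named earlier in this issue.
DEFAULT_MIRROR = "http://archive.apache.org/dist"

# Assumption: a "complete" version has three numeric parts, e.g. 0.17.0.
FULL_VERSION = re.compile(r"^\d+\.\d+\.\d+$")


def hadoop_tarball_url(version, mirror=DEFAULT_MIRROR):
    """Build the tarball URL using the pattern seen in the Ivy log."""
    base = mirror.rstrip("/")
    return f"{base}/hadoop/core/hadoop-{version}/hadoop-{version}.tar.gz"


def require_full_version(version):
    """Reject abbreviated versions such as "0.17" before any download is attempted."""
    if not FULL_VERSION.match(version):
        raise ValueError(
            f"hadoop.version must be complete (e.g. 0.17.0), got {version!r}"
        )
    return version


if __name__ == "__main__":
    print(hadoop_tarball_url("0.17"))    # the URL from the failed build
    print(hadoop_tarball_url("0.17.0"))  # the URL for the complete version
```

Passing "0.17" yields the same hadoop-0.17/hadoop-0.17.tar.gz path that the resolver could not find, which is why the complete version is needed. A check like require_full_version would turn the abbreviated value into an immediate, readable error instead of an unresolved dependency.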

        Ashish Thusoo added a comment -

        Closing this, as the build works with the instructions that I posted.


          People

          • Assignee:
            Ashish Thusoo
            Reporter:
            Ashish Thusoo
          • Votes:
            0
            Watchers:
            2