Hive / HIVE-726

Make "ant package -Dhadoop.version=0.17.0" work

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.4.0, 0.5.0
    • Component/s: Build Infrastructure
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      HIVE-726. Make "ant package -Dhadoop.version=0.17.0" work. (Todd Lipcon via zshao)

      Description

      After HIVE-487, users have to specify one of the versions listed in "shims/ivy.xml" to make "ant package -Dhadoop.version=<version>" work.
      Currently the build only works with the following versions (from shims/ivy.xml): 0.17.2.1, 0.18.3, 0.19.0, 0.20.0.

      We used to run "ant package -Dhadoop.version=0.17.0", but it no longer works; we can specify "ant package -Dhadoop.version=0.17.2.1" instead, and the resulting package will probably still work with 0.17.0.

      1. hive-726.txt
        9 kB
        Todd Lipcon
      2. HIVE-726.3.patch
        0.5 kB
        Zheng Shao
      3. HIVE-726.2.patch
        2 kB
        Zheng Shao
      4. HIVE-726.1.patch
        4 kB
        Zheng Shao

          Activity

          Zheng Shao added a comment -

          I talked with Ashish offline about this. Assuming that the API does not change within a major version, we can always download major-version.0, no matter which Hadoop minor version the user specifies.
          The compiled Hive package should still work with any Hadoop minor version within the same major version. This should be better for our users, because they can still specify any Hadoop version on the command line - we just strip it down to the major version and append ".0".

          Thoughts?
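
          For illustration only, here is a minimal plain-Ant sketch of the stripping idea (the property name hadoop.version.major0 is invented, and this runs inside a target, so it does not address the build.properties expansion problem discussed below):

            <!-- Hypothetical sketch: derive "major.0" (e.g. 0.17.2.1 -> 0.17.0) from hadoop.version -->
            <loadresource property="hadoop.version.major0">
              <propertyresource name="hadoop.version"/>
              <filterchain>
                <tokenfilter>
                  <replaceregex pattern="^(\d+\.\d+)(\..*)?$" replace="\1.0"/>
                </tokenfilter>
              </filterchain>
            </loadresource>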

          Zheng Shao added a comment -

          HIVE-726.1.patch: This patch allows us to build 0.17.0, 0.18.0, 0.19.0, 0.20.0.

          However it does not work for other versions because I could not find a way to compute hadoop.version.0 beforehand, and then use it for expansion in build.properties.
          The "getversionpref" is only available after "deploy-ant-tasks" in build-common.xml is done.

          If anybody is interested in going this route, feel free to take HIVE-726.1.patch and continue in this direction.

          I will try a different approach: require users to specify only major versions, and blindly append ".0" to them.

          Zheng Shao added a comment -

          HIVE-726.2.patch:

          Please ignore all comments above. HIVE-726.2.patch simply replaces the Hadoop version in shims/build.xml and shims/ivy.xml with the one specified by the user.

          All these commands are successful now:

          ant package -Dhadoop.version=0.17.0
          ant package -Dhadoop.version=0.17.2.1
          ant package -Dhadoop.version=0.20.0
          
          Todd Lipcon added a comment -

          I think there is some confusion about how the shims work.

          The point of the new shims package, as done in HIVE-487, is that on every build the shims are built for all major versions of Hadoop. The versions in the ivy.xml there were picked as the most recent releases, but as Ashish said, we could use the .0 releases instead - it's somewhat arbitrary. After the shims have been built against each of the Hadoop versions, the build makes a hive_shims.jar which includes all of the implementations.

          At this point, when Hive is built, it can be built against any version of Hadoop using the -Dhadoop.version flag. In actuality this should not matter - there may still be some work left to do, but in my testing I built with the default hadoop.version (0.19.0) and then ran the build products against 18 and 20 with no recompile. The ShimLoader class determines the current Hadoop version at runtime and loads the correct implementation class out of hive_shims.jar.

          So, -1 on the patch, since it would result in a hive_shims.jar that only includes one version of the shims, and thus the build product would only work on that version of Hadoop.
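
          To illustrate the aggregation step described above, here is a hypothetical Ant fragment (directory and property names are invented, not taken from the actual build files) that jars the per-version shim classes into a single hive_shims.jar:

            <!-- Hypothetical sketch: combine per-version shim classes into one jar.
                 Directory and property names are invented for illustration. -->
            <jar jarfile="${build.dir}/hive_shims.jar">
              <fileset dir="${build.dir}/shims/classes-0.17"/>
              <fileset dir="${build.dir}/shims/classes-0.18"/>
              <fileset dir="${build.dir}/shims/classes-0.19"/>
              <fileset dir="${build.dir}/shims/classes-0.20"/>
            </jar>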

          Zheng Shao added a comment -

          HIVE-726.3.patch: This patch is even simpler: it just adds the hadoop.version that the user specified for ivy to download.
          The shims will be compiled as before, but the other parts of Hive will be compiled against the version that the user specifies.
          I've verified that all of the following commands run fine:

          ant -Dhadoop.version=0.17.0 clean package
          ant -Dhadoop.version=0.17.2.1 clean package
          ant -Dhadoop.version=0.19.0 clean package
          ant -Dhadoop.version=0.20.0 clean package
          
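
          As a rough sketch of that idea (the org/module coordinates below are assumptions for illustration, not copied from Hive's actual ivy.xml), the build would resolve the Hadoop artifact at whatever version the user passes:

            <!-- Illustrative only: pull the Hadoop artifact at the user-specified version. -->
            <dependency org="hadoop" name="core" rev="${hadoop.version}"/>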
          Todd Lipcon added a comment -

          I think there was still an issue here, which you've uncovered. Ant's semantics for the <param> element inside <antcall> are somewhat screwy - you can't override a property that has been passed on the command line. Since you were passing hadoop.version on the command line, it was actually compiling the shims for 17 four times rather than compiling each version, as best I can tell.

          Attaching a patch which gets around this by introducing a new ant property called "hadoop.version.ant-internal". This property is set in build.properties to default to hadoop.version, and hadoop.version is left as is. Everywhere in the build files that used to reference hadoop.version now references hadoop.version.ant-internal. Since we're not specifying this on the command line, the <antcall>s inside shims/build.xml can properly override it.

          One question: I think the condition in build.xml's eclipsefiles target that defaults to 0.19 is now dead code, since the property itself is defaulted to 0.19.0.

          Also, can we change ${final.name} to simply ${name}-${version}, since one build can work with any hadoop?

          Namit Jain added a comment -

          I think we need this for the 0.4 release.

          Zheng Shao added a comment -

          Todd, when I opened this issue I didn't know how the shims work (as you explained above).

          When I first saw that, it didn't work for me - see HIVE-812.
          But with HIVE-812 I think it should work as you planned.

          I think this JIRA still has value in that it lets the user choose which Hadoop version to compile against - however, if the shims work as expected, the value of doing this is minimal.

          I would treat this as a nice-to-have (non-blocker) for branch-0.4, as long as HIVE-812 is in and we tell users to remove "-Dhadoop.version=xxx" from all command lines and wikis.
          I think this is a good simplification for our users.

          Thoughts?

          Todd Lipcon added a comment -

          Yeah, I think this is a nice-to-have but not a complete blocker. Its value is that you can ensure you are still compatible with future releases. Since we didn't shim the entirety of Hadoop, it's possible to compile fine against the default but still break at runtime if you're relying on an API that doesn't exist in other versions. Occasionally compiling against all of the "supported" versions is a good sanity check - if possible we should probably set up Hudson tests that build against all of the supported Hadoop versions on a nightly basis.

          Zheng Shao added a comment -

          Also, can we change ${final.name} to simply ${name}-${version} since one build can work with any hadoop?

          Agree.

          Todd, can you make that change? Otherwise I think it's good to commit.

          Todd Lipcon added a comment -

          See HIVE-799
          Zheng Shao added a comment -

          Committed. Thanks Todd!


            People

            • Assignee:
              Todd Lipcon
            • Reporter:
              Zheng Shao
            • Votes:
              0
            • Watchers:
              1
