HIVE-1264: Make Hive work with Hadoop security

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.7.0
    • Fix Version/s: 0.7.0
    • Component/s: Security
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    Attachments

    1. hive-1264.txt
      35 kB
      Todd Lipcon
    2. hive-1264-fb-mirror.txt
      35 kB
      Todd Lipcon
    3. hive-1264.txt
      32 kB
      Todd Lipcon
    4. HiveHadoop20S_patch.patch
      3 kB
      Venkatesh Seetharam

        Activity

        John Sichi added a comment -

        Committed. Thanks Todd!

        John Sichi added a comment -

        I'm back from vacation; I'll retry tests with the latest patch.

        Pradeep Kamath added a comment -

        The new patch does address the failed test - TestHBaseMinimrCliDriver. I have used this patch against a certain version of hadoop security and have had success running queries. It would be good if this patch can be committed soon since HIVE-1526 and HIVE-842 also depend on this. Aside from that, it would be a great feature to be able to work with hadoop security. Any chance this can be committed soon? Thanks!

        Todd Lipcon added a comment -

        Good catch. This patch updates the build.xml for hbase-handler to include the hadoop test jar.

        Ashutosh Chauhan added a comment -

        @Todd,

        Any progress on this? Some of the utility methods introduced in this patch will be useful for HIVE-1667

        John Sichi added a comment -

        Todd, I'm getting one test failure when running ant test. Probably hbase-handler/build.xml needs a fix. You should be able to repro it with

        ant test -Dtestcase=TestHBaseMinimrCliDriver

        testCliDriver_hbase_bulk Error org/apache/hadoop/hdfs/MiniDFSCluster

        java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/MiniDFSCluster
        at org.apache.hadoop.hive.shims.Hadoop20Shims.getMiniDfs(Hadoop20Shims.java:90)
        at org.apache.hadoop.hive.ql.QTestUtil.<init>(QTestUtil.java:224)
        at org.apache.hadoop.hive.hbase.HBaseQTestUtil.<init>(HBaseQTestUtil.java:30)
        at org.apache.hadoop.hive.cli.TestHBaseMinimrCliDriver.setUp(TestHBaseMinimrCliDriver.java:43)
        at junit.framework.TestCase.runBare(TestCase.java:125)
        at junit.framework.TestResult$1.protect(TestResult.java:106)
        at junit.framework.TestResult.runProtected(TestResult.java:124)
        at junit.framework.TestResult.run(TestResult.java:109)
        at junit.framework.TestCase.run(TestCase.java:118)
        at junit.framework.TestSuite.runTest(TestSuite.java:208)
        at junit.framework.TestSuite.run(TestSuite.java:203)
        at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
        at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
        at junit.framework.TestResult.runProtected(TestResult.java:124)
        at junit.extensions.TestSetup.run(TestSetup.java:23)
        at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
        at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
        at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)
        Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hdfs.MiniDFSCluster
        at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
        at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)

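The missing MiniDFSCluster class lives in Hadoop's test jar, and the fix Todd attaches next adds that jar to the hbase-handler test classpath. In ant terms the change looks roughly like the fragment below; the property names and path id are illustrative, not copied from the actual build.xml:

```xml
<!-- Illustrative sketch: make hadoop's test jar (which contains
     MiniDFSCluster) visible to the hbase-handler tests. -->
<path id="test.classpath">
  <!-- ...existing classpath entries... -->
  <pathelement location="${hadoop.root}/hadoop-${hadoop.version}-test.jar"/>
</path>
```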
        John Sichi added a comment -

        +1. Will commit when tests pass.

        John Sichi added a comment -

        Mirror is up; Todd, could you test it and then update the patch with the location?

        http://mirror.facebook.net/facebook/hive-deps/hadoop/core/hadoop-0.20.3-CDH3-SNAPSHOT

        John Sichi added a comment -

        Todd has added the .md5/.asc files; I'm working on getting the mirror set up.

        HBase Review Board added a comment -

        Message from: "Carl Steinbach" <carl@cloudera.com>

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.cloudera.org/r/860/#review1251
        -----------------------------------------------------------

        Ship it!

        +1 Looks good to me.

        build.properties
        <http://review.cloudera.org/r/860/#comment4223>

        If this is the convention going forward then it's probably more appropriate to rename the old style as "oldstyle-name" instead of introducing a "newstyle-name".

        • Carl
        HBase Review Board added a comment -

        Message from: "Todd Lipcon" <todd@cloudera.com>

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.cloudera.org/r/860/
        -----------------------------------------------------------

        Review request for Hive Developers and John Sichi.

        Summary
        -------

        Adds a shim layer for secure Hadoop, currently pulling a secure CDH3b3 prerelease snapshot

        This addresses bug HIVE-1264.
        http://issues.apache.org/jira/browse/HIVE-1264

        Diffs


        build-common.xml 0b76688
        build.properties 3e392f7
        build.xml 4b345b5
        common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d2c7123
        ql/src/java/org/apache/hadoop/hive/ql/Driver.java b2966de
        shims/build.xml b339871
        shims/ivy.xml de56e4f
        shims/src/0.17/java/org/apache/hadoop/hive/shims/Hadoop17Shims.java 17110ab
        shims/src/0.18/java/org/apache/hadoop/hive/shims/Hadoop18Shims.java 9cc8d56
        shims/src/0.19/java/org/apache/hadoop/hive/shims/Hadoop19Shims.java c643108
        shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java 0675a79
        shims/src/0.20S/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java PRE-CREATION
        shims/src/0.20S/java/org/apache/hadoop/hive/shims/Jetty20SShims.java PRE-CREATION
        shims/src/common/java/org/apache/hadoop/hive/shims/HadoopShims.java 4310942
        shims/src/common/java/org/apache/hadoop/hive/shims/ShimLoader.java c847d69

        Diff: http://review.cloudera.org/r/860/diff

        Testing
        -------

        Able to run MR jobs on our secure cluster with standalone (i.e., no separate metastore, etc.)

        Thanks,

        Todd

        John Sichi added a comment -

        See here for existing checksum file conventions:

        http://mirror.facebook.net/facebook/hive-deps/hadoop/core/

        John Sichi added a comment -

        OK, can you add a checksum file to your directory, and then I'll ask our ops to create the mirror? Once that's done, we'll need one more patch which references the FB location as default.

        Todd Lipcon added a comment -

        Submitted to RB: https://review.cloudera.org/r/860/

        Regarding the snapshot - it's fine by me to pull from there, I think the people.apache.org web server is reasonably stable. If it turns out to be flaky it's also cool if you want to mirror it - FB is probably more reliable than ASF.

        John Sichi added a comment -

        Hey Todd,

        I think you mentioned at the contributor meetup the other day that this is ready for review. If so, click the Submit Patch button and create a reviewboard entry.

        I just did a quick check to verify that I could apply the patch and run ant package+test (without changing hadoop.version) and ivy was able to fetch the dependency from your snapshot successfully.

        If we commit it as is, every hive developer is going to automatically start pulling from that snapshot by default. Is that OK, or should I try to get a copy hosted at http://mirror.facebook.net/facebook/hive-deps?

        Todd Lipcon added a comment -

        (btw, if you want to test this against a different tarball, you can use -Dhadoop.security.url=http://url/of/your/tarball -Dhadoop.security.version=0.20.104 or whatever.)

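The -D flags Todd mentions override Ivy properties that default in the build; a hypothetical build.properties fragment showing the two settings involved, using the placeholder values from his comment (the actual defaults in the patch differ):

```properties
# Hypothetical overrides; normally passed on the command line as
# -Dhadoop.security.url=... -Dhadoop.security.version=...
hadoop.security.url=http://url/of/your/tarball
hadoop.security.version=0.20.104
```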
        Todd Lipcon added a comment -

        Here's a patch against trunk which adds shims for secure hadoop.

        Since there hasn't been a public tarball release of secure hadoop quite yet, I've pointed it at a snapshot of CDH3b3 (not yet released) from my apache.org web directory.

        I haven't run the unit test suite against secure hadoop yet, but I did a very brief test on a secure cluster by creating a table and running a simple MR query.

        Ashish Thusoo added a comment -

        Can these changes be packed into the shims layer? Then all the calls could be replaced with calls to the shims, with the shim for 20.1xx doing the right thing.

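The shim approach Ashish suggests is the one the committed patch ultimately takes. A minimal sketch of the pattern, assuming nothing beyond what the discussion describes: the class names echo Hive's real HadoopShims/ShimLoader, but the method, return values, and version strings here are illustrative, not the actual Hive API.

```java
// One interface, one implementation per Hadoop version, selected at
// runtime from the version string. Illustrative sketch only.
interface HadoopShims {
    String getAuthenticationKind();   // stands in for UGI calls that differ per version
}

class Hadoop20Shims implements HadoopShims {
    public String getAuthenticationKind() { return "simple"; }    // pre-security UGI
}

class Hadoop20SShims implements HadoopShims {
    public String getAuthenticationKind() { return "kerberos"; }  // secure UGI
}

public class ShimLoader {
    // Pick the shim for the running Hadoop; the real loader keys off
    // the detected Hadoop version at startup.
    public static HadoopShims getHadoopShims(String majorVersion) {
        if (majorVersion.equals("0.20S")) {
            return new Hadoop20SShims();
        }
        return new Hadoop20Shims();
    }

    public static void main(String[] args) {
        // Callers never reference version-specific classes directly.
        System.out.println(getHadoopShims("0.20S").getAuthenticationKind());
    }
}
```

Callers code only against the interface, so supporting secure Hadoop means adding one new shim class and one branch in the loader, without breaking compilation against 0.17-0.20.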
        Venkatesh Seetharam added a comment -

        This will work against the 20.1xx branch. You need to include the 20.1xx hadoop dependency, and it does compile and run. The interface contract does not change, hence I'm not sure whether I need to change the shim. UGI has changed in 20S and the UnixUGI class no longer exists.

        Please suggest how to proceed with this incompatible change.

        Carl Steinbach added a comment -

        @Venkatesh: I looked at your patch. Your changes break the build when compiled against any version of Hadoop that does not include the new security APIs. Hive is currently designed to maintain backward compatibility with Hadoop 0.17, 0.18, 0.19 and 0.20 using a shim layer. In order to get this patch committed you will need to modify the shim layer and ensure that you have not broken compatibility with any of these older versions of Hadoop.

        Carl Steinbach added a comment -

        @Venkatesh: It doesn't look like you attached a patch file.

        Venkatesh Seetharam added a comment -

        Patch for H20S


          People

          • Assignee:
            Todd Lipcon
            Reporter:
            Jeff Hammerbacher
          • Votes:
            0
            Watchers:
            15
