Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.3.0
    • Fix Version/s: 0.6.0
    • Component/s: impl
    • Labels:
      None
    • Patch Info:
      Patch Available
    • Tags:
      hbase

      Description

      HBase support is currently very limited and restricted to HBase 0.18.0.
      Because the next releases of Pig will support Hadoop 0.20.0, they should also support HBase 0.20.0.

      1. zookeeper-hbase-1329.jar
        1.06 MB
        Jeff Zhang
      2. test-output.tgz
        37 kB
        Alan Gates
      3. TEST-org.apache.pig.test.TestHBaseStorage.txt
        36 kB
        Alan Gates
      4. TEST-org.apache.pig.test.TestHBaseStorage.txt
        161 kB
        Vincent BARAT
      5. TEST-org.apache.pig.test.TestHBaseStorage.txt
        280 kB
        Alan Gates
      6. pig-hbase-20-v2.patch
        5 kB
        Alan Gates
      7. pig-hbase-0.20.0-support.patch
        3 kB
        Vincent BARAT
      8. Pig_HBase_0.20.0.patch
        16 kB
        Jeff Zhang
      9. hbase-0.20.0-test.jar
        1.83 MB
        Jeff Zhang
      10. hbase-0.20.0.jar
        1.46 MB
        Jeff Zhang
      11. build.xml.path
        1 kB
        Vincent BARAT

        Activity

        Vincent BARAT created issue -
        Vincent BARAT added a comment -

        This small patch ports the current HBase 0.18.0-related source code to HBase 0.20.0. The patch works for me and applies to the Pig trunk as of revision 816619. Unfortunately, I was unable to get the TestHBaseStorage unit test to work correctly, but that seems to be a problem with my environment (classpath).
        Someone else will have to test it and complete the patch (it lacks the hbase-0.20.0 jar file).

        Vincent BARAT made changes -
        Field Original Value New Value
        Attachment pig-hbase-0.20.0-support.patch [ 12420127 ]
        Alan Gates added a comment -

        In addition to adding hbase-0.20.0.jar to the lib directory, did you add hbase-0.20.0-test?

        Vincent BARAT added a comment -

        Yes, but I was unable to make TestHBaseStorage work. I guess it was just a matter of environment, since the errors were about classes not being found.
        I didn't spend much time on that, actually.
        I will try again.

        Vincent BARAT added a comment -

        To better show what I did with the jar files, here is the patch I made to the build.xml file.

        Vincent BARAT made changes -
        Attachment build.xml.path [ 12420599 ]
        Alan Gates added a comment -

        The issue was the missing Zookeeper lib. I added that, and now I get what looks like a real hbase error. I have no idea what it means, so I'll let you take a look. I've attached both a new patch (with the changes to build.xml to pick up the right libs) and the error log from the test run.

        Alan Gates made changes -
        Attachment pig-hbase-20-v2.patch [ 12420600 ]
        Attachment TEST-org.apache.pig.test.TestHBaseStorage.txt [ 12420601 ]
        Vincent BARAT added a comment -

        Thanks to your patch, I succeeded in running the TestHBaseStorage test (see attached log).
        I cannot reproduce your error, which may come from your environment.

        Vincent BARAT made changes -
        Jeff Zhang added a comment -

        Any progress on this issue?
        It looks like it's still open.

        Alan Gates added a comment -

        I haven't been able to get the unit test to pass in my environment.

        Jeff Zhang made changes -
        Attachment Pig_HBase_0.20.0.patch [ 12423801 ]
        Jeff Zhang made changes -
        Attachment hbase-0.20.0.jar [ 12423802 ]
        Attachment hbase-0.18.1-test.jar [ 12423803 ]
        Jeff Zhang made changes -
        Attachment zookeeper-hbase-1329.jar [ 12423804 ]
        Jeff Zhang made changes -
        Attachment hbase-0.18.1-test.jar [ 12423803 ]
        Jeff Zhang made changes -
        Attachment hbase-0.20.0-test.jar [ 12423805 ]
        Jeff Zhang made changes -
        Assignee Jeff Zhang [ zjffdu ]
        Jeff Zhang made changes -
        Attachment hbase-0.20.0-test.jar [ 12423805 ]
        Jeff Zhang made changes -
        Attachment hbase-0.20.0.jar [ 12423802 ]
        Jeff Zhang made changes -
        Attachment Pig_HBase_0.20.0.patch [ 12423801 ]
        Jeff Zhang made changes -
        Attachment zookeeper-hbase-1329.jar [ 12423804 ]
        Jeff Zhang made changes -
        Attachment Pig_HBase_0.20.0.patch [ 12423807 ]
        Attachment hbase-0.20.0.jar [ 12423808 ]
        Attachment hbase-0.20.0-test.jar [ 12423809 ]
        Jeff Zhang made changes -
        Attachment zookeeper-hbase-1329.jar [ 12423811 ]
        Jeff Zhang added a comment -

        Vincent, I do not know how you got TestHBaseStorage to pass with your patch. HBase 0.20 integrates ZooKeeper, so TestHBaseStorage has to be updated accordingly.

        I am submitting a patch that includes the source code and the jars. (One tricky thing: MiniZookeeperCluster's client port is 21810, which is hard-coded in its source, while ZooKeeper's default port is 2181. So I attached an hbase-site.xml that overrides the ZooKeeper client port to match MiniZookeeperCluster.)
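
        The port override described in this comment can be sketched in Java. This is a minimal illustration, not part of the attached patch: the class name ZkClientPortCheck and the embedded sample file are hypothetical, and only the property name hbase.zookeeper.property.clientPort and the value 21810 are taken from this discussion. The sketch parses a Hadoop-style configuration document with the JDK's XML parser and reports the configured ZooKeeper client port.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: reads one property from a Hadoop-style *-site.xml document.
final class ZkClientPortCheck {
    // Assumed content of test/hbase-site.xml; the real file in the patch may hold more properties.
    static final String SAMPLE_SITE_XML =
        "<?xml version=\"1.0\"?>\n"
        + "<configuration>\n"
        + "  <property>\n"
        + "    <name>hbase.zookeeper.property.clientPort</name>\n"
        + "    <value>21810</value>\n"
        + "  </property>\n"
        + "</configuration>\n";

    /** Returns the value of the named property, or null if the property is absent. */
    static String readProperty(String xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                NodeList names = p.getElementsByTagName("name");
                NodeList values = p.getElementsByTagName("value");
                if (names.getLength() > 0 && values.getLength() > 0
                        && name.equals(names.item(0).getTextContent().trim())) {
                    return values.item(0).getTextContent().trim();
                }
            }
            return null;
        } catch (Exception e) {
            // Parsing problems are rethrown unchecked to keep the sketch short.
            throw new IllegalStateException("Could not parse configuration", e);
        }
    }

    public static void main(String[] args) {
        String port = readProperty(SAMPLE_SITE_XML, "hbase.zookeeper.property.clientPort");
        System.out.println("ZooKeeper client port override: " + port);
    }
}
```

        If the property is missing from the file a client actually loads, the client would presumably fall back to the default port 2181 and fail to reach MiniZookeeperCluster on 21810.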

        Jeff Zhang made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Fix Version/s 0.5.0 [ 12314213 ]
        Tags hbase
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12423811/zookeeper-hbase-1329.jar
        against trunk revision 831481.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 92 new or modified tests.

        -1 patch. The patch command could not apply the patch.

        Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/136/console

        This message is automatically generated.

        Jeff Zhang added a comment -

        This patch works on my machine, but it seems I do not have the rights to put the jars into the Pig trunk. Could anyone help validate the patch on the Pig trunk?

        Thank you in advance.

        Alan Gates added a comment -

        Patch doesn't include binary files. I'll pull together the latest patch plus the jars and test it.

        Alan Gates added a comment -

        When I run TestHBaseStorage now I get:

        Testcase: testLoadFromHBase took 592.908 sec
        Caused an ERROR
        Unable to open iterator for alias a
        org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias a
        at org.apache.pig.PigServer.openIterator(PigServer.java:481)
        at org.apache.pig.test.TestHBaseStorage.testLoadFromHBase(TestHBaseStorage.java:170)
        Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 6015: During execution, encountered a Hadoop error.
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:922)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:573)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:555)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:686)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:582)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:555)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:686)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:586)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:549)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:125)
        at org.apache.pig.backend.hadoop.hbase.HBaseSlice.init(HBaseSlice.java:159)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SliceWrapper.makeReader(SliceWrapper.java:129)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getRecordReader(PigInputFormat.java:258)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:338)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
        Caused by: org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region

        Let me know if you'd like to see the whole log.

        Jeff Zhang added a comment -

        Yes, Alan, could you attach the whole log, including the task tracker logs?
        Thank you.

        Alan Gates added a comment -

        Test run results plus logs.

        Alan Gates made changes -
        Attachment TEST-org.apache.pig.test.TestHBaseStorage.txt [ 12423976 ]
        Attachment test-output.tgz [ 12423977 ]
        Jeff Zhang added a comment -

        Alan, do you have the file hbase-site.xml in the test folder? (I put it in my patch.)

        I looked into the logs and found that the map task is attempting to connect to ZooKeeper at port 2181, but the port of MiniZookeeperCluster is 21810. So there should be an hbase-site.xml file in the test folder to override the configuration, just as is done in the HBase trunk.

        Alan Gates added a comment -

        Yes, it's there.

        Jeff Zhang added a comment -

        Well, it's weird.

        Alan, could you check again that pig-0.6.0-dev-withouthadoop.jar contains the file hbase-site.xml, and that hbase.zookeeper.property.clientPort is set to 21810 in that file?

        Jeff Zhang made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Alan Gates added a comment -

        afterside:~/src/pig/PIG-970-3/trunk> jar tf pig-withouthadoop.jar | grep hbase
        org/apache/pig/backend/hadoop/hbase/
        org/apache/pig/backend/hadoop/hbase/HBaseSlice.class
        org/apache/pig/backend/hadoop/hbase/HBaseStorage.class
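
        The listing above contains no hbase-site.xml entry. The same check can be sketched in Java with java.util.jar.JarFile. The class name JarEntryCheck and the throwaway sample jar are hypothetical and exist only so the sketch runs without a real Pig build; point jarContains at an actual jar path to inspect it.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

// Hypothetical helper: the programmatic equivalent of `jar tf x.jar | grep name`.
final class JarEntryCheck {
    /** Returns true if the jar at jarPath has an entry with exactly this name. */
    static boolean jarContains(Path jarPath, String entryName) {
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            return jar.getEntry(entryName) != null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a temporary jar holding empty entries with the given names (at least one). */
    static Path writeSampleJar(String... entryNames) {
        try {
            Path jar = Files.createTempFile("sample", ".jar");
            try (OutputStream out = Files.newOutputStream(jar);
                 JarOutputStream jos = new JarOutputStream(out)) {
                for (String name : entryNames) {
                    jos.putNextEntry(new JarEntry(name));
                    jos.closeEntry();
                }
            }
            jar.toFile().deleteOnExit();
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Mirrors the situation in the listing: HBase classes are present, the site file is not.
        Path jar = writeSampleJar("org/apache/pig/backend/hadoop/hbase/HBaseStorage.class");
        System.out.println("hbase-site.xml present: " + jarContains(jar, "hbase-site.xml"));
    }
}
```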

        Jeff Zhang made changes -
        Attachment Pig_HBase_0.20.0.patch [ 12423807 ]
        Jeff Zhang added a comment -

        Alan, I found the problem. In Eclipse I had set the output folder to build/classes, which conflicts with the output folder in build.xml, so the problem was hidden.

        I have now added one line to build.xml:

        <copy file="${basedir}/test/hbase-site.xml" tofile="${test.build.classes}/hbase-site.xml"/>

        This way the test case code can find hbase-site.xml on the classpath.
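
        One way to confirm the effect of that copy step is to ask the class loader whether hbase-site.xml is visible. The following is a minimal sketch, assuming the test classes directory is on the classpath when it runs; the class name ClasspathCheck is hypothetical and is not part of the patch.

```java
import java.net.URL;

// Hypothetical helper: reports where (or whether) a resource resolves on the classpath.
final class ClasspathCheck {
    /** Returns the URL the class loader resolves for the resource, or null if it is not visible. */
    static URL locate(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            // Fall back to the system class loader when no context loader is set.
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.getResource(resourceName);
    }

    public static void main(String[] args) {
        URL url = locate("hbase-site.xml");
        System.out.println(url == null
            ? "hbase-site.xml is NOT on the classpath"
            : "hbase-site.xml found at " + url);
    }
}
```

        The printed URL also shows which copy of the file wins when more than one directory or jar on the classpath provides it, which is the kind of conflict the Eclipse output folder was masking.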

        Jeff Zhang made changes -
        Attachment Pig_HBase_0.20.0.patch [ 12423999 ]
        Alan Gates added a comment -

        I have checked in the patch. Thank you, Jeff and Vincent, for your patience and persistence in getting this in.

        I propose that we also check this into the 0.5 branch since that branch is on 0.20, and currently hbase integration is broken in 0.5.

        Alan Gates made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Fix Version/s 0.6.0 [ 12314214 ]
        Fix Version/s 0.5.0 [ 12314213 ]
        Resolution Fixed [ 1 ]
        Dmitriy V. Ryaboy added a comment -

        Didn't 0.5 go out already? Are you thinking we will need 0.5.1 before 0.6 comes out?

        Alan Gates made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Transition                   Time In Source Status   Execution Times   Last Executer   Last Execution Date
        Open -> Patch Available      43d 15h 43m             1                 Jeff Zhang      02/Nov/09 08:34
        Patch Available -> Open      1d 17h 15m              1                 Jeff Zhang      04/Nov/09 01:50
        Open -> Resolved             1d 16h 6m               1                 Alan Gates      05/Nov/09 17:57
        Resolved -> Closed           139d 4h 18m             1                 Alan Gates      24/Mar/10 22:15

          People

          • Assignee:
            Jeff Zhang
            Reporter:
            Vincent BARAT
          • Votes:
            0
            Watchers:
            2
