Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3.0, 1.4.0
    • Fix Version/s: 1.4.0
    • Component/s: Sinks+Sources
    • Labels:
      None

      Description

      Bump AsyncHBase to 1.3.2 (versus 1.2.0) to take advantage of a range of bug fixes and performance improvements.

      Only affects the asynchronous HBase sink.

      1. FLUME-1688.patch
        0.4 kB
        David Johnston
      2. FLUME-1688.patch
        0.4 kB
        Hari Shreedharan

          Activity

          hudson Hudson added a comment -

          Integrated in flume-trunk #407 (See https://builds.apache.org/job/flume-trunk/407/)
          FLUME-1688. Bump AsyncHBase version to 1.4.1. (Revision c2f559482f60cea4899dc0dc57a88c7087755823)

          Result = FAILURE
          mpercy : http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=c2f559482f60cea4899dc0dc57a88c7087755823
          Files :

          • pom.xml
          mpercy Mike Percy added a comment -

          Pushed to trunk & flume-1.4 branches.

          Thanks for the patch Hari!

          mpercy Mike Percy added a comment -

          +1 lgtm

          hshreedharan Hari Shreedharan added a comment -

          Trivial patch, skipping rb. Ran full build with both profiles - works fine.

          hshreedharan Hari Shreedharan added a comment -

          Yes, it would - since Asynchbase-1.2.0 did have open versioning for zk, which this version gets rid of. I am assigning this to myself since David has not responded for a couple days.

          ejsarge Edward Sargisson added a comment -

          Going to AsyncHBase 1.4.0 may resolve FLUME-2028 as well.

          hshreedharan Hari Shreedharan added a comment -

          [~cdcttr], do you want to still look into this one? If yes, please update to Asynchbase 1.4.1

          hshreedharan Hari Shreedharan added a comment -

          OK, looks like what was happening was that asynchbase went from using dependency ranges to specific dependency versions. So I think at this point we can upgrade to the latest version.
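A minimal sketch of why an open-ended dependency range can pull in a newer ZooKeeper than a pinned version does. The range form `[3.3.1,)` and the list of available versions are assumptions made for illustration; the exact declaration in the AsyncHBase POM is not quoted in this thread. The class and method names are hypothetical, and the comparison only handles purely numeric dotted versions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class OpenRangeSketch {

    /** Compares dotted numeric versions such as "3.3.4" and "3.4.4". */
    static int compare(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    /** Open-ended range "[min,)": picks the newest available version at or above min. */
    static Optional<String> resolveOpenRange(String min, List<String> available) {
        return available.stream()
                .filter(v -> compare(v, min) >= 0)
                .max(OpenRangeSketch::compare);
    }

    public static void main(String[] args) {
        // Hypothetical set of ZooKeeper versions visible to the build.
        List<String> available = Arrays.asList("3.3.1", "3.3.4", "3.4.4");

        // Assumed open range with a minimum of 3.3.1: the newest match wins.
        System.out.println("range [3.3.1,) resolves to "
                + resolveOpenRange("3.3.1", available).orElse("(none)"));

        // A pinned version resolves to exactly what is declared.
        System.out.println("pinned 3.3.4 resolves to 3.3.4");
    }
}
```

Under these assumptions the open range selects 3.4.4, while a pinned declaration stays at 3.3.4, which matches the downgrade seen in the dependency:tree output.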

          hshreedharan Hari Shreedharan added a comment -

          We can bump it up to 1.4.0. I think at this point, we can just ignore the ZK dependencies.

          hshreedharan Hari Shreedharan added a comment -

          David Johnston, do you have any updates on this? It might be a good idea to open an issue with the asynchbase project to look into this. You should post the mvn dependency:tree output which I have posted above.

          ftdave David Johnston added a comment -

          Ah, the AsyncHBase POM explicitly defines a minimum version. Still doesn't explain why it pulled in 3.4.4 before and 3.3.4 now.

          ftdave David Johnston added a comment -

          ZOOKEEPER-1437 is indeed the same issue we had. Regrettably, I hadn't spotted the change from 3.4.4 to 3.3.4 (curse those similar numbers!)

          Looking deeper, AsyncHBase has always had a dependency on ZooKeeper 3.3.1 (in 1.2.0) or 3.3.4 (1.3.0 and higher) - see https://github.com/OpenTSDB/asynchbase/blob/master/third_party/zookeeper/include.mk - which is making me wonder why 3.4.4 is being pulled in at all.

          hshreedharan Hari Shreedharan added a comment - edited

          Outputs of mvn dependency:tree. Do you know why this is happening:

          With asynchbase-1.3.2:

          [INFO] +- org.apache.flume.flume-ng-sinks:flume-ng-hbase-sink:jar:1.4.0-SNAPSHOT:compile
          [INFO] |  \- org.hbase:asynchbase:jar:1.3.2:compile
          [INFO] |     +- com.stumbleupon:async:jar:1.2.0:compile
          [INFO] |     +- org.apache.zookeeper:zookeeper:jar:3.3.4:compile
          [INFO] |     \- org.slf4j:jcl-over-slf4j:jar:1.6.4:runtime
          

          With asynchbase-1.2.0 (current trunk):

          [INFO] +- org.apache.flume.flume-ng-sinks:flume-ng-hbase-sink:jar:1.4.0-SNAPSHOT:compile
          [INFO] |  \- org.hbase:asynchbase:jar:1.2.0:compile
          [INFO] |     +- org.jboss.netty:netty:jar:3.2.7.Final:compile
          [INFO] |     +- com.stumbleupon:async:jar:1.2.0:compile
          [INFO] |     +- org.apache.zookeeper:zookeeper:jar:3.4.4:compile
          [INFO] |     \- org.slf4j:jcl-over-slf4j:jar:1.7.2:runtime
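
One way to spot version changes like these without reading the two trees side by side is to diff them programmatically. The following is a rough sketch, not part of the patch: the class name, method names, and the simple line parser are hypothetical, and it assumes every dependency line has the `groupId:artifactId:packaging:version:scope` shape seen in the output above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class DependencyTreeDiff {

    // The two trees quoted in this comment, embedded so the sketch is self-contained.
    static final String TRUNK_TREE =
            "[INFO] +- org.apache.flume.flume-ng-sinks:flume-ng-hbase-sink:jar:1.4.0-SNAPSHOT:compile\n"
          + "[INFO] |  \\- org.hbase:asynchbase:jar:1.2.0:compile\n"
          + "[INFO] |     +- org.jboss.netty:netty:jar:3.2.7.Final:compile\n"
          + "[INFO] |     +- com.stumbleupon:async:jar:1.2.0:compile\n"
          + "[INFO] |     +- org.apache.zookeeper:zookeeper:jar:3.4.4:compile\n"
          + "[INFO] |     \\- org.slf4j:jcl-over-slf4j:jar:1.7.2:runtime\n";

    static final String BUMPED_TREE =
            "[INFO] +- org.apache.flume.flume-ng-sinks:flume-ng-hbase-sink:jar:1.4.0-SNAPSHOT:compile\n"
          + "[INFO] |  \\- org.hbase:asynchbase:jar:1.3.2:compile\n"
          + "[INFO] |     +- com.stumbleupon:async:jar:1.2.0:compile\n"
          + "[INFO] |     +- org.apache.zookeeper:zookeeper:jar:3.3.4:compile\n"
          + "[INFO] |     \\- org.slf4j:jcl-over-slf4j:jar:1.6.4:runtime\n";

    /** Maps "groupId:artifactId" to its version for each dependency line. */
    static Map<String, String> parse(String treeOutput) {
        Map<String, String> versions = new LinkedHashMap<>();
        for (String line : treeOutput.split("\n")) {
            // Drop the "[INFO]" prefix, then the tree-drawing characters (| + - \ and spaces).
            String coords = line.replaceFirst("^\\s*\\[INFO\\]", "")
                                .replaceFirst("^[\\s|+\\\\-]+", "");
            String[] parts = coords.split(":");
            // Assumed shape: groupId:artifactId:packaging:version:scope
            if (parts.length == 5) {
                versions.put(parts[0] + ":" + parts[1], parts[3]);
            }
        }
        return versions;
    }

    /** Lists artifacts whose version differs, or which are missing, in the second tree. */
    static Map<String, String> changed(String before, String after) {
        Map<String, String> old = parse(before);
        Map<String, String> now = parse(after);
        Map<String, String> changes = new TreeMap<>();
        for (Map.Entry<String, String> e : old.entrySet()) {
            String newVersion = now.get(e.getKey());
            if (newVersion == null) {
                changes.put(e.getKey(), e.getValue() + " -> (absent)");
            } else if (!newVersion.equals(e.getValue())) {
                changes.put(e.getKey(), e.getValue() + " -> " + newVersion);
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        changed(TRUNK_TREE, BUMPED_TREE)
                .forEach((artifact, change) -> System.out.println(artifact + ": " + change));
    }
}
```

Run against the two trees above, this reports the ZooKeeper entry as `3.4.4 -> 3.3.4` and shows that Netty no longer appears under asynchbase. It only checks artifacts present in the first tree, so dependencies newly introduced by the second tree would need a second pass.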
          
          hshreedharan Hari Shreedharan added a comment -

          It looks like ZK is being downgraded from 3.4.4 to 3.3.4. I am not sure why this is and whether this affects HBase. I will need to look at hbase and see if this change in ZK version is a concern.

          hshreedharan Hari Shreedharan added a comment -

          The ZK connection issues might have been due to ZOOKEEPER-1437. I am concerned about the dependencies it introduces. Looks like it will pull in new versions of Netty, Guava, etc. Let me take a look at the dependencies it is pulling in by doing a build.

          ftdave David Johnston added a comment -

          The most significant performance improvement was added in 1.3.0, allowing a Put to contain multiple values (https://github.com/OpenTSDB/asynchbase/commit/bb9168cda1551db4712c1c584cd140a66ec17422) - in 1.2.0, the same operation would require many separate Puts. It also lets us use coprocessors on the region server, which for us at least is a functional requirement.

          WRT bugs, we've experienced frequent ZooKeeper connection issues when using Flume 1.3.0 and Hadoop 1.0.3. Recompiling Flume with AsyncHBase 1.3.2 fixed this issue, although I've yet to identify exactly what changeset fixed this (I'm assuming it's one of the Netty changes).

          hshreedharan Hari Shreedharan added a comment -

          Do you have information on:
          a) compatibility issues.
          b) performance?


            People

            • Assignee:
              hshreedharan Hari Shreedharan
              Reporter:
              ftdave David Johnston
            • Votes:
              0
              Watchers:
              4

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Estimated:
                Original Estimate - 5m
                Remaining:
                Remaining Estimate - 5m
                Logged:
                Time Spent - Not Specified