Flume / FLUME-2338

Support coalescing increments in HBaseSink

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: v1.4.0
    • Fix Version/s: v1.5.0
    • Component/s: Sinks+Sources
    • Labels:
      None

      Description

      Support coalescing increments in HBaseSink

      If a batch has multiple increments to the same row/column, these can be coalesced into a single HBase RPC call.
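
To illustrate the idea, here is a minimal sketch of coalescing in plain Java. It does not use the HBase client or the patch's actual code: the `Inc` class, the `coalesce` method, and the string row/column keys are hypothetical stand-ins. Increments that target the same row and column are summed, so each distinct cell needs only one increment.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in types; this is not the HBaseSink implementation.
class IncrementCoalescer {

  /** One pending increment: add `amount` to the counter at (row, column). */
  static final class Inc {
    final String row;
    final String column;
    final long amount;

    Inc(String row, String column, long amount) {
      this.row = row;
      this.column = column;
      this.amount = amount;
    }
  }

  /**
   * Sums all increments in a batch that hit the same row and column.
   * Returns row -> (column -> total), preserving first-seen order.
   */
  static Map<String, Map<String, Long>> coalesce(List<Inc> batch) {
    Map<String, Map<String, Long>> byRow = new LinkedHashMap<>();
    for (Inc inc : batch) {
      byRow.computeIfAbsent(inc.row, r -> new LinkedHashMap<>())
           .merge(inc.column, inc.amount, Long::sum);
    }
    return byRow;
  }

  public static void main(String[] args) {
    List<Inc> batch = Arrays.asList(
        new Inc("row1", "cf:hits", 1),
        new Inc("row1", "cf:hits", 1),
        new Inc("row2", "cf:hits", 1),
        new Inc("row1", "cf:hits", 1));
    // Four increments collapse to two cells: row1 -> 3, row2 -> 1.
    System.out.println(coalesce(batch));
  }
}
```

In this sketch a batch of four events would result in two increments being sent rather than four; the saving grows with the number of events per batch that share a cell.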

      1. FLUME-2338.patch
        32 kB
        Mike Percy
      2. FLUME-2338-2.patch
        34 kB
        Mike Percy
      3. FLUME-2338-3.patch
        35 kB
        Mike Percy

          Activity

          Hudson added a comment -

          SUCCESS: Integrated in flume-trunk #560 (See https://builds.apache.org/job/flume-trunk/560/)
          Revert "FLUME-2338. Support coalescing increments in HBaseSink." (hshreedharan: http://git-wip-us.apache.org/repos/asf/flume/repo/?p=flume.git&a=commit&h=34836ce6cef7c0f0ae13b2e8d9a63ca838ce8ace)

          • flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/BatchAware.java
          • flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/IncrementHBaseSerializer.java
          • flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/HBaseSink.java
          • flume-ng-doc/sphinx/FlumeUserGuide.rst
          • flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/TestHBaseSink.java

          FLUME-2338. Support coalescing increments in HBaseSink. (hshreedharan: http://git-wip-us.apache.org/repos/asf/flume/repo/?p=flume.git&a=commit&h=1dfcb4b0edd4e2fca43d37bc01896015c245e589)

          • flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/IncrementHBaseSerializer.java
          • flume-ng-doc/sphinx/FlumeUserGuide.rst
          • flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/TestHBaseSink.java
          • flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/HBaseSink.java
          • flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/BatchAware.java
          ASF subversion and git services added a comment -

          Commit 5a06871c78102b87ca5b53993cc7b95e66292bad in flume's branch refs/heads/flume-1.5 from Hari Shreedharan
          [ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=5a06871 ]

          FLUME-2338. Support coalescing increments in HBaseSink.

          (Mike Percy via Hari Shreedharan)

          ASF subversion and git services added a comment -

          Commit ae022cf054280045de57f64bce3d63f910fa57fc in flume's branch refs/heads/flume-1.5 from Hari Shreedharan
          [ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=ae022cf ]

          Revert "FLUME-2338. Support coalescing increments in HBaseSink."

          This reverts commit 674f4fcce2597e7e934ccc69eb04b426f5a9b8bb.

          ASF subversion and git services added a comment -

          Commit 1dfcb4b0edd4e2fca43d37bc01896015c245e589 in flume's branch refs/heads/trunk from Hari Shreedharan
          [ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=1dfcb4b ]

          FLUME-2338. Support coalescing increments in HBaseSink.

          (Mike Percy via Hari Shreedharan)

          ASF subversion and git services added a comment -

          Commit 34836ce6cef7c0f0ae13b2e8d9a63ca838ce8ace in flume's branch refs/heads/trunk from Hari Shreedharan
          [ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=34836ce ]

          Revert "FLUME-2338. Support coalescing increments in HBaseSink."

          This reverts commit 674f4fcce2597e7e934ccc69eb04b426f5a9b8bb.

          Hudson added a comment -

          SUCCESS: Integrated in flume-trunk #559 (See https://builds.apache.org/job/flume-trunk/559/)
          FLUME-2338. Support coalescing increments in HBaseSink. (hshreedharan: http://git-wip-us.apache.org/repos/asf/flume/repo/?p=flume.git&a=commit&h=674f4fcce2597e7e934ccc69eb04b426f5a9b8bb)

          • flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/TestHBaseSink.java
          • flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/BatchAware.java
          • flume-ng-doc/sphinx/FlumeUserGuide.rst
          • flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/IncrementHBaseSerializer.java
          • flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/HBaseSink.java
          Hari Shreedharan added a comment -

          Looks like I committed the wrong patch (thanks to building on Linux and committing from Mac!). Will revert and re-commit.

          Hari Shreedharan added a comment -

          Committed! Thanks Mike!

          ASF subversion and git services added a comment -

          Commit cb3d84d354acc0cd7cac213c3a5190d74a190c12 in flume's branch refs/heads/flume-1.5 from Hari Shreedharan
          [ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=cb3d84d ]

          FLUME-2338. Support coalescing increments in HBaseSink.

          (Mike Percy via Hari Shreedharan)

          ASF subversion and git services added a comment -

          Commit 674f4fcce2597e7e934ccc69eb04b426f5a9b8bb in flume's branch refs/heads/trunk from Hari Shreedharan
          [ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=674f4fc ]

          FLUME-2338. Support coalescing increments in HBaseSink.

          (Mike Percy via Hari Shreedharan)

          Hari Shreedharan added a comment -

          +1. Built against HBase 0.94 and 0.96. Works!

          Mike Percy added a comment -

          Doh! Thanks, Hari. This patch should work: I figured out how to test it against both versions of HBase, and the unit tests pass.

          Hari Shreedharan added a comment -

          Mike,

          Thanks for the update. Looks good, but a similar fix is required in TestHBaseSink's CoalesceValidator#onAfterCoalesce; otherwise it does not compile against HBase 0.96.

          Mike Percy added a comment -

          Thanks Hari. How about this?

          I assume we can try to optimize the HBase 0.96 stuff later if needed.

          I am not sure how to run the tests against HBase 0.96, so I am hoping this will work.

          Hari Shreedharan added a comment -

          Mike,

          This looks good, but unfortunately it is not compatible with HBase 0.96.x. In the HBase 0.96 branch, the method to get the family map is:
          getFamilyMapOfLongs()
          (https://github.com/apache/hbase/blob/0.96/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Increment.java#L168)

          We should probably use reflection to call the correct method, although getFamilyMapOfLongs() looks like an expensive operation. I am not sure whether there is a way around that.
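
The reflection approach suggested in this comment could look roughly like the following sketch. To stay self-contained it does not depend on HBase: `Increment094` and `Increment096` are hypothetical stand-ins that mimic only the two method names discussed here (`getFamilyMap()` and `getFamilyMapOfLongs()`), and the simplified `Map<String, Long>` return type is an assumption, not the real signature. The resolved `Method` is cached per class so that the lookup cost is paid only once.

```java
import java.lang.reflect.Method;
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class FamilyMapReflection {

  // Stand-in for the older client API (hypothetical, simplified types).
  public static class Increment094 {
    public Map<String, Long> getFamilyMap() {
      return Collections.singletonMap("cf:hits", 1L);
    }
  }

  // Stand-in for the newer client API (hypothetical, simplified types).
  public static class Increment096 {
    public Map<String, Long> getFamilyMapOfLongs() {
      return Collections.singletonMap("cf:hits", 2L);
    }
  }

  // Cache of resolved accessor methods, one entry per concrete class.
  private static final Map<Class<?>, Method> ACCESSORS = new ConcurrentHashMap<>();

  // Returns the first accessor that exists on the given class.
  private static Method resolve(Class<?> type) {
    for (String name : new String[] {"getFamilyMapOfLongs", "getFamilyMap"}) {
      try {
        return type.getMethod(name);
      } catch (NoSuchMethodException e) {
        // Not present on this class; try the next candidate name.
      }
    }
    throw new IllegalStateException("No family map accessor on " + type.getName());
  }

  @SuppressWarnings("unchecked")
  static Map<String, Long> familyMap(Object increment) {
    Method accessor =
        ACCESSORS.computeIfAbsent(increment.getClass(), FamilyMapReflection::resolve);
    try {
      return (Map<String, Long>) accessor.invoke(increment);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Could not read family map", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(familyMap(new Increment094()));
    System.out.println(familyMap(new Increment096()));
  }
}
```

Caching the `Method` addresses only the reflective lookup; it does not change whatever work the underlying accessor does on each call, which is the cost concern raised in the comment.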

          Mike Percy added a comment -

          Attaching a patch. It also includes a BatchAware interface, which lets serializers that implement it be aware of batching so that they can choose keys that maximize the effect of coalescing, if desired.
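
As a rough illustration of how a batch-aware serializer might choose keys, the sketch below freezes a time bucket once per batch so that every event in the batch maps to the same row. It is not taken from the patch: the `BatchAware` interface here is a local stand-in, the `onBatchStart` method name is assumed for this example, and `MinuteBucketKeys` and `rowKeyForNextEvent` are hypothetical names.

```java
import java.util.function.LongSupplier;

class BatchAwareSketch {

  // Local stand-in for the BatchAware interface from the patch; the
  // method name is an assumption made for this sketch.
  interface BatchAware {
    void onBatchStart();
  }

  // Hypothetical serializer-like class that picks one row key per batch,
  // so every increment in the batch targets the same row and can be coalesced.
  static final class MinuteBucketKeys implements BatchAware {
    private final LongSupplier clockMillis;
    private String currentRow = "unset";

    MinuteBucketKeys(LongSupplier clockMillis) {
      this.clockMillis = clockMillis;
    }

    @Override
    public void onBatchStart() {
      // Freeze the bucket once per batch instead of once per event.
      long minute = clockMillis.getAsLong() / 60_000L;
      currentRow = "minute-" + minute;
    }

    String rowKeyForNextEvent() {
      return currentRow;
    }
  }

  public static void main(String[] args) {
    MinuteBucketKeys keys = new MinuteBucketKeys(System::currentTimeMillis);
    keys.onBatchStart();
    System.out.println(keys.rowKeyForNextEvent());
  }
}
```

Without the batch callback, a serializer that read the clock per event could split one batch across two rows at a minute boundary, which would reduce how much coalescing is possible.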


            People

            • Assignee: Mike Percy
            • Reporter: Mike Percy
            • Votes: 0
            • Watchers: 4
