FLUME-1906

Ability to disable WAL for put operation in HBaseSink

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: v1.3.1
    • Fix Version/s: v1.4.0
    • Component/s: Sinks+Sources
    • Labels:
      None

      Description

      HBase's Put supports setWriteToWAL(boolean), which controls whether an edit is written to the write-ahead log (WAL). The HBase sink should expose a configuration option so that users can disable WAL writes for the puts it issues.
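A minimal sketch of how such an option might be read and applied, assuming a boolean sink property. The key name `enableWal`, its default, and the `StubPut` class are illustrative assumptions, not taken from the patch; `StubPut` stands in for HBase's `Put` so the example runs without the HBase client on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

class WalOptionSketch {
    // Hypothetical property key and default; the issue text does not name them.
    static final String ENABLE_WAL_KEY = "enableWal";
    static final boolean ENABLE_WAL_DEFAULT = true;

    /** Stand-in for HBase's Put that models only the WAL flag. */
    static final class StubPut {
        private boolean writeToWAL = true; // WAL writes stay on unless switched off
        void setWriteToWAL(boolean write) { this.writeToWAL = write; }
        boolean getWriteToWAL() { return writeToWAL; }
    }

    /** Reads the switch; only an explicit "false" turns the WAL off. */
    static boolean readEnableWal(Map<String, String> sinkProps) {
        String raw = sinkProps.get(ENABLE_WAL_KEY);
        if (raw == null) {
            return ENABLE_WAL_DEFAULT;
        }
        return !raw.trim().equalsIgnoreCase("false");
    }

    /** Builds a put with the configured WAL behaviour applied. */
    static StubPut newPut(boolean enableWal) {
        StubPut put = new StubPut();
        put.setWriteToWAL(enableWal);
        return put;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        System.out.println(newPut(readEnableWal(props)).getWriteToWAL()); // true
        props.put(ENABLE_WAL_KEY, "false");
        System.out.println(newPut(readEnableWal(props)).getWriteToWAL()); // false
    }
}
```

The sketch treats any value other than an explicit "false" as "keep the WAL on", so a typo in the config cannot silently give up durability.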

      1. FLUME-1906.patch
        5 kB
        Hari Shreedharan
      2. FLUME-1906-1.patch
        5 kB
        Hari Shreedharan

          Activity

          Transition                   Time In Source Status   Execution Times   Last Executer      Last Execution Date
          Open -> Patch Available      7d 14h 41m              1                 Hari Shreedharan   14/Feb/13 21:33
          Patch Available -> Resolved  10h 29m                 1                 Mubarak Seyed      15/Feb/13 08:02
          Mike Percy made changes -
          Remote Link This issue links to "Review Board (Web Link)" [ 12052 ]
          Mike Percy made changes -
          Comment [ Adding review board link for posterity. fyi, the +1 was on review board. ]
          Mike Percy made changes -
          Remote Link This issue links to "Review Board (Web Link)" [ 12052 ]
          Hari Shreedharan added a comment -

          Adding the review link for posterity.

          Hari Shreedharan made changes -
          Remote Link This issue links to "Review (Web Link)" [ 12051 ]
          Hudson added a comment -

          Integrated in flume-trunk #362 (See https://builds.apache.org/job/flume-trunk/362/)
          FLUME-1906 Ability to disable WAL for put operation in HBaseSink (Revision 510f63ba39592e7912a85c35effb8be52699057a)

          Result = SUCCESS
          mubarak : http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=510f63ba39592e7912a85c35effb8be52699057a
          Files :

          • flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/HBaseSink.java
          • flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/AsyncHBaseSink.java
          • flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/HBaseSinkConfigurationConstants.java
          Hari Shreedharan made changes -
          Release Note Thanks for the patch Hari! Pushed to trunk & flume-1.4 branch.
          Mubarak Seyed made changes -
          Resolution Fixed [ 1 ]
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Release Note Thanks for the patch Hari! Pushed to trunk & flume-1.4 branch.
          Mubarak Seyed added a comment -

          Thanks for the patch Hari! Pushed to trunk & flume-1.4 branch.

          Mubarak Seyed made changes -
          Fix Version/s v1.4.0 [ 12323372 ]
          Hari Shreedharan made changes -
          Attachment FLUME-1906-1.patch [ 12569401 ]
          Hari Shreedharan made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Assignee Hari Shreedharan [ hshreedharan ]
          Hari Shreedharan made changes -
          Field Original Value New Value
          Attachment FLUME-1906.patch [ 12569393 ]
          Mubarak Seyed added a comment -

          "I am just a bit concerned about completing the transaction on the sink side without the data being persisted. This could potentially cause debugging issues."

          I think it is a trade-off: we give up durability for in-memory data that has not yet been flushed in order to save disk space in HDFS. Some ops teams run HBase clusters with the WAL turned off to avoid storing three copies of the WAL data in HDFS. If the WAL setting is changed in the config, we can log it at INFO level to make debugging easier.
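The INFO-level log suggested above could look roughly like the following sketch. It uses `java.util.logging` only to stay self-contained; the class, method names, and message wording are assumptions for illustration and are not taken from the patch.

```java
import java.util.logging.Logger;

class WalLoggingSketch {
    private static final Logger LOG = Logger.getLogger(WalLoggingSketch.class.getName());

    /** Returns the message to log for the WAL setting, or null when the WAL is on. */
    static String describeWalSetting(String sinkName, boolean enableWal) {
        if (enableWal) {
            return null; // default behaviour, nothing noteworthy to report
        }
        return "HBase sink " + sinkName + ": WAL disabled for puts; "
                + "unflushed edits are lost if a region server crashes.";
    }

    /** Intended to be called whenever the sink is configured or reconfigured. */
    static void logWalSetting(String sinkName, boolean enableWal) {
        String message = describeWalSetting(sinkName, enableWal);
        if (message != null) {
            LOG.info(message);
        }
    }

    public static void main(String[] args) {
        logWalSetting("hbaseSink1", false); // logs one INFO line
        logWalSetting("hbaseSink2", true);  // logs nothing
    }
}
```

Logging only when the WAL is off keeps the default case quiet while leaving a clear trace for anyone debugging missing rows later.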

          Hari Shreedharan added a comment -

          The Flume agent does not need to be restarted. If the config file is updated, Flume picks it up automatically and reloads all components. I am just a bit concerned about completing the transaction on the sink side without the data being persisted. This could potentially cause debugging issues.

          Mubarak Seyed added a comment -

          "Doing this would void Flume's delivery guarantee, no? (not exactly Flume's guarantee, but still the data would not be persisted to disk)."

          If a region server crashes, whatever data is still in the memstore will be lost (provided the region flush size is tuned appropriately). This option lets HBase ops dynamically enable or disable the WAL to save disk space in HDFS (I think the Flume agent needs to be restarted to pick up the updated HBase sink config?).
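The data-loss window described here can be illustrated with a toy in-memory model. This is not HBase's recovery code; the class and its three lists are simplifications invented for the example, assuming only that flushed data and WAL entries survive a crash while memstore contents do not.

```java
import java.util.ArrayList;
import java.util.List;

class ToyRegionServer {
    private final List<String> wal = new ArrayList<>();      // survives a crash
    private final List<String> memstore = new ArrayList<>(); // lost on a crash
    private final List<String> flushed = new ArrayList<>();  // store files, survive a crash

    /** Accepts an edit; it reaches the WAL only if writeToWal is true. */
    void put(String edit, boolean writeToWal) {
        if (writeToWal) {
            wal.add(edit);
        }
        memstore.add(edit);
    }

    /** Persists the memstore; flushed edits no longer need replaying from the WAL. */
    void flush() {
        flushed.addAll(memstore);
        memstore.clear();
        wal.clear();
    }

    /** Simulates a crash followed by recovery and returns the edits that survive. */
    List<String> crashAndRecover() {
        memstore.clear();
        List<String> recovered = new ArrayList<>(flushed);
        recovered.addAll(wal); // replay whatever the WAL still holds
        return recovered;
    }

    public static void main(String[] args) {
        ToyRegionServer server = new ToyRegionServer();
        server.put("a", false);
        server.flush();         // "a" is now durable
        server.put("b", false); // memstore only
        server.put("c", true);  // memstore and WAL
        System.out.println(server.crashAndRecover()); // [a, c]
    }
}
```

In this model "b" is the only edit lost: it was acknowledged without a WAL entry and had not yet been flushed when the crash happened.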

          Hari Shreedharan added a comment -

          Doing this would void Flume's delivery guarantee, no? (not exactly Flume's guarantee, but still the data would not be persisted to disk).

          Mubarak Seyed created issue -

            People

            • Assignee: Hari Shreedharan
            • Reporter: Mubarak Seyed
            • Votes: 0
            • Watchers: 4
