HBase
  1. HBase
  2. HBASE-1944

Add a "deferred log flush" attribute to HTD

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.90.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      At SU we have a case where we'd like to be able to, by default, sync every appended edit but for some tables have the control to disable their sync. You could then rely on other tables like .META. to do syncs, the optional log syncer timer down in LogSyncer and user tables that are always synced. For example, set a 10ms timer on the log syncer so that in the worse case you lose 10ms of non-catalog edits.

      1. HBASE-1944-v4.patch
        6 kB
        Jean-Daniel Cryans
      2. HBASE-1944-v3.patch
        6 kB
        Jean-Daniel Cryans
      3. HBASE-1944-v2.patch
        6 kB
        Jean-Daniel Cryans
      4. HBASE-1944.patch
        6 kB
        Jean-Daniel Cryans

        Activity

        Hide
        Jean-Daniel Cryans added a comment -

        This patch adds the new HTD.IS_SYNCED attribute and in HRS I refactored the code that handles the syncs in a method that also checks if the table has to be synced.

        Show
        Jean-Daniel Cryans added a comment - This patch adds the new HTD.IS_SYNCED attribute and in HRS I refactored the code that handles the syncs in a method that also checks if the table has to be synced.
        Hide
        ryan rawson added a comment -

        this is absolutely necessary to achieve high performance and data safety under the presence of hdfs-265, since the best case scenario will be about 1000 fsync/second.

        I am not super fond of the name, since synced suggests perhaps that the data isnt written to disk or something, which is NOT true.

        Perhaps PER_TX_FLUSH

        Show
        ryan rawson added a comment - this is absolutely necessary to achieve high performance and data safety under the presence of hdfs-265, since the best case scenario will be about 1000 fsync/second. I am not super fond of the name, since synced suggests perhaps that the data isnt written to disk or something, which is NOT true. Perhaps PER_TX_FLUSH
        Hide
        Jean-Daniel Cryans added a comment -

        Yeah IS_SYNCED seems to imply that you don't write in the WAL but you do it's just that you don't make sure it's synced. Current makes sense for a dev but probably not for user.

        So PER_TX_SYNC? DELAYED_SYNC? DEFERRED_SYNC?

        Show
        Jean-Daniel Cryans added a comment - Yeah IS_SYNCED seems to imply that you don't write in the WAL but you do it's just that you don't make sure it's synced. Current makes sense for a dev but probably not for user. So PER_TX_SYNC? DELAYED_SYNC? DEFERRED_SYNC?
        Hide
        Jean-Daniel Cryans added a comment -

        I just tested this patch on a cluster, I get 1300 incrCol/sec with a single threaded client on a table that doesn't sync. I killed -9 the RS a couple of times and even with optional sync every 100ms I lose only 1-2 edits.

        Show
        Jean-Daniel Cryans added a comment - I just tested this patch on a cluster, I get 1300 incrCol/sec with a single threaded client on a table that doesn't sync. I killed -9 the RS a couple of times and even with optional sync every 100ms I lose only 1-2 edits.
        Hide
        stack added a comment -

        Yeah, any name is better than IS_SYNC (ain't that the name of a boyband?).... And yeah, it should have FLUSH rather than SYNC since sync don't work yet?

        Show
        stack added a comment - Yeah, any name is better than IS_SYNC (ain't that the name of a boyband?).... And yeah, it should have FLUSH rather than SYNC since sync don't work yet?
        Hide
        stack added a comment -

        Otherwise the patch is good.

        Show
        stack added a comment - Otherwise the patch is good.
        Hide
        Jean-Daniel Cryans added a comment -

        It's nsync Stack! I guess you're too old

        Sorry I confused sync and flush, i guess that variable isn't volatile in my mind and I still see an old value.

        Show
        Jean-Daniel Cryans added a comment - It's nsync Stack! I guess you're too old Sorry I confused sync and flush, i guess that variable isn't volatile in my mind and I still see an old value.
        Hide
        Jean-Daniel Cryans added a comment -

        New patch with new name, now it's called DEFERRED_LOG_FLUSH.

        Show
        Jean-Daniel Cryans added a comment - New patch with new name, now it's called DEFERRED_LOG_FLUSH.
        Hide
        Jean-Daniel Cryans added a comment -

        Ryan told me that the name may be good but it should be false by default then and the check in HRS should change.

        Show
        Jean-Daniel Cryans added a comment - Ryan told me that the name may be good but it should be false by default then and the check in HRS should change.
        Hide
        Jean-Daniel Cryans added a comment -

        Fixed logic.

        Show
        Jean-Daniel Cryans added a comment - Fixed logic.
        Hide
        Jean-Daniel Cryans added a comment -

        Better yet, some refactoring to get rid of checking region's null.

        Show
        Jean-Daniel Cryans added a comment - Better yet, some refactoring to get rid of checking region's null.
        Hide
        ryan rawson added a comment -

        looking good, +1

        Show
        ryan rawson added a comment - looking good, +1
        Hide
        stack added a comment -

        Patch looks good. Should cache the boolean though rather than look it up each time; those cost if you look in profiler.

        Show
        stack added a comment - Patch looks good. Should cache the boolean though rather than look it up each time; those cost if you look in profiler.
        Hide
        Jean-Daniel Cryans added a comment -

        Committed with Stack's suggestion.

        Show
        Jean-Daniel Cryans added a comment - Committed with Stack's suggestion.

          People

          • Assignee:
            Jean-Daniel Cryans
            Reporter:
            Jean-Daniel Cryans
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development