HBase
  1. HBase
  2. HBASE-1780

HTable.flushCommits clears write buffer in finally clause

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.20.0, 0.90.0
    • Component/s: Client
    • Labels:
      None
    • Environment:

      All

    • Hadoop Flags:
      Reviewed

      Description

      Metthod flushCommits clears the write buffer in a finally clause.

      When using the write buffer, if the call to processBatchOfRows done in flushCommits throws an exception, the write buffer will be cleared thus potentially leading to loss of data on the client side.

      1. HBASE-1780.patch
        4 kB
        Andrew Purtell
      2. HBASE-1780.patch
        3 kB
        Andrew Purtell

        Activity

        Hide
        Andrew Purtell added a comment -

        Testing patch now.

        Show
        Andrew Purtell added a comment - Testing patch now.
        Hide
        Mathias Herberts added a comment -

        I think we should specify somewhere that HTable instances are not safe to use in MT environments for writes.

        The manipulation of writebuffer could could lead to ConcurrentModificationException being raised if Puts are added to it while the new write buffer size is computed in flushCommits.

        Show
        Mathias Herberts added a comment - I think we should specify somewhere that HTable instances are not safe to use in MT environments for writes. The manipulation of writebuffer could could lead to ConcurrentModificationException being raised if Puts are added to it while the new write buffer size is computed in flushCommits.
        Hide
        Andrew Purtell added a comment -

        Good point Mathias. Latest patch also synchronizes writeBuffer.

        Show
        Andrew Purtell added a comment - Good point Mathias. Latest patch also synchronizes writeBuffer.
        Hide
        stack added a comment -

        Do the put methods need to be synchronized now since internally its synchronizing on the write buffer? (Make currentWriteBufferSize transient so outside of the synch block its most recent value can be seen?)

        Minor: Make the below final and do assigning at same time as declaration?

          private ArrayList<Put> writeBuffer;
        

        Otherwise patch looks great. Patch is important fix.

        Now you've done the work, add a comment to HTable class javadoc to explicitly say its thread-safe?

        (There are a bunch of unused imports in my HTable. Are they in yours? Remove as part of this patch? They look bad, especially in this most public of our classes).

        Show
        stack added a comment - Do the put methods need to be synchronized now since internally its synchronizing on the write buffer? (Make currentWriteBufferSize transient so outside of the synch block its most recent value can be seen?) Minor: Make the below final and do assigning at same time as declaration? private ArrayList<Put> writeBuffer; Otherwise patch looks great. Patch is important fix. Now you've done the work, add a comment to HTable class javadoc to explicitly say its thread-safe? (There are a bunch of unused imports in my HTable. Are they in yours? Remove as part of this patch? They look bad, especially in this most public of our classes).
        Hide
        Andrew Purtell added a comment -

        @stack:

        There are a bunch of unused imports in my HTable. Are they in yours? Remove as part of this patch?

        If you're looking at Eclipse warnings, those are due to deprecated includes necessary for old client API, not unused imports.

        I'll fold your suggestions into the commit. Thanks for the review.

        Show
        Andrew Purtell added a comment - @stack: There are a bunch of unused imports in my HTable. Are they in yours? Remove as part of this patch? If you're looking at Eclipse warnings, those are due to deprecated includes necessary for old client API, not unused imports. I'll fold your suggestions into the commit. Thanks for the review.
        Hide
        stack added a comment -

        Bringing into 0.20.0

        @Andrew You are right. I'm looking at them on different computer. Usually deprecated differs in that its strikethrough. Not so in this case.

        Show
        stack added a comment - Bringing into 0.20.0 @Andrew You are right. I'm looking at them on different computer. Usually deprecated differs in that its strikethrough. Not so in this case.
        Hide
        stack added a comment -

        Ryan makes the valid point offline that HTable should NOT be shared by threads. If its all synchronized on the write, then no performance gain sharing the HTable instance. That'd mean, remove all synchronization, flag HTable as not thread-safe and just add the bit where writes are not lost on exception.

        Show
        stack added a comment - Ryan makes the valid point offline that HTable should NOT be shared by threads. If its all synchronized on the write, then no performance gain sharing the HTable instance. That'd mean, remove all synchronization, flag HTable as not thread-safe and just add the bit where writes are not lost on exception.
        Hide
        Andrew Purtell added a comment -

        Committed to branch and trunk with synchronization bits removed and a comment that HTable is not MT safe for writes.

        Show
        Andrew Purtell added a comment - Committed to branch and trunk with synchronization bits removed and a comment that HTable is not MT safe for writes.

          People

          • Assignee:
            Andrew Purtell
            Reporter:
            Mathias Herberts
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development