HBase
  1. HBase
  2. HBASE-3443

ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: 0.90.0, 0.90.1, 0.90.2, 0.90.3, 0.90.4, 0.90.5, 0.90.6, 0.92.0, 0.92.1
    • Fix Version/s: 0.94.0, 0.95.0
    • Component/s: regionserver
    • Labels:
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Hide
      This is a correctness fix and will incur a 10-20% performance penalty for ICV and Increment operations. Other operations are not affected.
      Show
      This is a correctness fix and will incur a 10-20% performance penalty for ICV and Increment operations. Other operations are not affected.

      Description

      For incrementColumnValue() HBASE-3082 adds an optimization to check memstores first, and only if not present in the memstore then check the store files. In the presence of deletes, the above optimization is not reliable.

      If the column is marked as deleted in the memstore, one should not look further into the store files. But currently, the code does so.

      Sample test code outline:

      admin.createTable(desc)
      
      table = HTable.new(conf, tableName)
      
      table.incrementColumnValue(Bytes.toBytes("row"), cf1name, Bytes.toBytes("column"), 5);
      
      admin.flush(tableName)
      sleep(2)
      
      del = Delete.new(Bytes.toBytes("row"))
      table.delete(del)
      
      table.incrementColumnValue(Bytes.toBytes("row"), cf1name, Bytes.toBytes("column"), 5);
      
      get = Get.new(Bytes.toBytes("row"))
      keyValues = table.get(get).raw()
      keyValues.each do |keyValue|
        puts "Expect 5; Got Value=#{Bytes.toLong(keyValue.getValue())}";
      end
      

      The above prints:

      Expect 5; Got Value=10
      
      1. 3443.txt
        6 kB
        Lars Hofhansl

        Issue Links

          Activity

          Hide
          Kannan Muthukkaruppan added a comment -

          Now that we have lazy seeks, i.e. HBASE-4465, we should be able to revert the work/optimization done HBASE-3082, and avoid this bug. What do you folks think?

          Show
          Kannan Muthukkaruppan added a comment - Now that we have lazy seeks, i.e. HBASE-4465 , we should be able to revert the work/optimization done HBASE-3082 , and avoid this bug. What do you folks think?
          Hide
          Ted Yu added a comment -

          +1 on the proposal.

          Show
          Ted Yu added a comment - +1 on the proposal.
          Hide
          Lars Hofhansl added a comment -

          I agree. I think delete handling is generally a bit "funky" in HBase (see also HBASE-4536).

          Show
          Lars Hofhansl added a comment - I agree. I think delete handling is generally a bit "funky" in HBase (see also HBASE-4536 ).
          Hide
          Benoit Sigoure added a comment -

          This seems like a pretty bad bug that causes data corruption. I ran into it this weekend. Which release do you guys think will have a fix for this issue?

          Show
          Benoit Sigoure added a comment - This seems like a pretty bad bug that causes data corruption. I ran into it this weekend. Which release do you guys think will have a fix for this issue?
          Hide
          stack added a comment -

          Anyone got a patch for this?

          Show
          stack added a comment - Anyone got a patch for this?
          Hide
          stack added a comment -

          Marking critical

          Show
          stack added a comment - Marking critical
          Hide
          Lars Hofhansl added a comment -

          For 0.96, though.
          I can take this. Can't promise to work on it today, though.
          Had checked out the code a while ago to make ICV correct w.r.t. to WAL updates, this involves mostly removing some crufty code.

          Show
          Lars Hofhansl added a comment - For 0.96, though. I can take this. Can't promise to work on it today, though. Had checked out the code a while ago to make ICV correct w.r.t. to WAL updates, this involves mostly removing some crufty code.
          Hide
          Lars Hofhansl added a comment -

          Here's a patch. Quite trivial, just removing some code and using the existing get without invoking coprocessors

          BUT... To test the performance I jacked up the number of increments per thread in TestAtomicOperation.testIncrementMultiThreads to 10000 and ran it with and without this change.

          On my machine I find that the test runs around 23-24s with the existing code and around 27-28s with this patch.

          Not sure that is a hit we want take.

          Show
          Lars Hofhansl added a comment - Here's a patch. Quite trivial, just removing some code and using the existing get without invoking coprocessors BUT... To test the performance I jacked up the number of increments per thread in TestAtomicOperation.testIncrementMultiThreads to 10000 and ran it with and without this change. On my machine I find that the test runs around 23-24s with the existing code and around 27-28s with this patch. Not sure that is a hit we want take.
          Hide
          Lars Hofhansl added a comment -

          I also verified that the test included in the patch fails without the other changes.

          Show
          Lars Hofhansl added a comment - I also verified that the test included in the patch fails without the other changes.
          Hide
          stack added a comment -

          +1 on patch.

          It makes us correct in the face of deletes. That is more important than a possible 10% slow down.

          Show
          stack added a comment - +1 on patch. It makes us correct in the face of deletes. That is more important than a possible 10% slow down.
          Hide
          Lars Hofhansl added a comment -

          Committed to trunk

          Show
          Lars Hofhansl added a comment - Committed to trunk
          Hide
          stack added a comment -

          0.94?

          Show
          stack added a comment - 0.94?
          Hide
          Hudson added a comment -

          Integrated in HBase-TRUNK #2746 (See https://builds.apache.org/job/HBase-TRUNK/2746/)
          HBASE-3443 ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix (Revision 1325406)

          Result = SUCCESS
          larsh :
          Files :

          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
          Show
          Hudson added a comment - Integrated in HBase-TRUNK #2746 (See https://builds.apache.org/job/HBase-TRUNK/2746/ ) HBASE-3443 ICV optimization to look in memstore first and then store files ( HBASE-3082 ) does not work when deletes are in the mix (Revision 1325406) Result = SUCCESS larsh : Files : /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
          Hide
          Lars Hofhansl added a comment -

          @Stack: Wasn't sure about this one. Didn't feel right to add this to a "performance release"

          You think this should go into 0.94? Might be good to have this correctness fix early.

          Show
          Lars Hofhansl added a comment - @Stack: Wasn't sure about this one. Didn't feel right to add this to a "performance release" You think this should go into 0.94? Might be good to have this correctness fix early.
          Hide
          Jean-Daniel Cryans added a comment -

          Correctness should always come first IMO.

          Show
          Jean-Daniel Cryans added a comment - Correctness should always come first IMO.
          Hide
          Jean-Daniel Cryans added a comment -

          The release notes should mention this tho.

          Show
          Jean-Daniel Cryans added a comment - The release notes should mention this tho.
          Hide
          stack added a comment -

          What J-D said.

          Show
          stack added a comment - What J-D said.
          Hide
          stack added a comment -

          Oh, and if you don't fix it, you'll have to explain why you didn't to Benoît.

          Show
          stack added a comment - Oh, and if you don't fix it, you'll have to explain why you didn't to Benoît.
          Hide
          Lars Hofhansl added a comment -

          a'right, committed to 0.94 aswell, and added release notes.

          Show
          Lars Hofhansl added a comment - a'right, committed to 0.94 aswell, and added release notes.
          Hide
          Hudson added a comment -

          Integrated in HBase-0.94 #107 (See https://builds.apache.org/job/HBase-0.94/107/)
          HBASE-3443 ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix (Revision 1325453)

          Result = SUCCESS
          larsh :
          Files :

          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
          Show
          Hudson added a comment - Integrated in HBase-0.94 #107 (See https://builds.apache.org/job/HBase-0.94/107/ ) HBASE-3443 ICV optimization to look in memstore first and then store files ( HBASE-3082 ) does not work when deletes are in the mix (Revision 1325453) Result = SUCCESS larsh : Files : /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
          Hide
          Hudson added a comment -

          Integrated in HBase-TRUNK-security #169 (See https://builds.apache.org/job/HBase-TRUNK-security/169/)
          HBASE-3443 ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix (Revision 1325406)

          Result = FAILURE
          larsh :
          Files :

          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
          Show
          Hudson added a comment - Integrated in HBase-TRUNK-security #169 (See https://builds.apache.org/job/HBase-TRUNK-security/169/ ) HBASE-3443 ICV optimization to look in memstore first and then store files ( HBASE-3082 ) does not work when deletes are in the mix (Revision 1325406) Result = FAILURE larsh : Files : /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
          Hide
          Hudson added a comment -

          Integrated in HBase-0.94-security #9 (See https://builds.apache.org/job/HBase-0.94-security/9/)
          HBASE-3443 ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix (Revision 1325453)

          Result = SUCCESS
          larsh :
          Files :

          • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
          • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
          Show
          Hudson added a comment - Integrated in HBase-0.94-security #9 (See https://builds.apache.org/job/HBase-0.94-security/9/ ) HBASE-3443 ICV optimization to look in memstore first and then store files ( HBASE-3082 ) does not work when deletes are in the mix (Revision 1325453) Result = SUCCESS larsh : Files : /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java

            People

            • Assignee:
              Lars Hofhansl
              Reporter:
              Kannan Muthukkaruppan
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development