HBASE-9730

Exceptions in multi operations are not handled correctly

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.98.0, 0.96.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      The symptom is that both ITBLL and ITLAV fail in their verification steps, complaining about many undefined rows.

      org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Verify$Counts
                      REFERENCED=199619372
                      UNDEFINED=190084
                      UNREFERENCED=190084
      

      I think the problem is in HRegionServer.doBatchOp(): when HRegion.batchMutate() throws an exception, the RegionActionResult indexes are not set correctly.
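
To illustrate the kind of failure described above, here is a minimal, self-contained sketch. It is not the HBase code and not the attached patch: the class, the `Action` and `Result` types, and the stand-in `batchMutate` are all hypothetical. It assumes that the mutations handed to a batch are a subset of a larger multi request, so each one carries the index it had in that request, and that every per-action result (success or failure) must be tagged with that original index for the client to match it up.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class BatchIndexSketch {
    /** Hypothetical action: its position in the original multi request plus a row key. */
    public static final class Action {
        public final int index;
        public final String row;
        public Action(int index, String row) { this.index = index; this.row = row; }
    }

    /** Hypothetical per-action outcome sent back to the client. */
    public static final class Result {
        public final int index;
        public final boolean ok;
        public final String error;
        public Result(int index, boolean ok, String error) {
            this.index = index; this.ok = ok; this.error = error;
        }
    }

    /** Stand-in for a batch mutation that fails as a whole, e.g. while a region is closing. */
    static void batchMutate(List<Action> batch, boolean regionClosing) throws IOException {
        if (regionClosing) {
            throw new IOException("region is closing");
        }
    }

    /** Applies the batch and reports one result per action, keyed by the action's original index. */
    public static List<Result> doBatchOp(List<Action> batch, boolean regionClosing) {
        List<Result> results = new ArrayList<>();
        try {
            batchMutate(batch, regionClosing);
            for (Action a : batch) {
                results.add(new Result(a.index, true, null));
            }
        } catch (IOException e) {
            // The whole batch failed: every action gets the exception, tagged with its
            // ORIGINAL request index (a.index), not its position inside this batch.
            for (Action a : batch) {
                results.add(new Result(a.index, false, e.getMessage()));
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // Mutations that sat at positions 1 and 3 of a larger (hypothetical) multi request.
        List<Action> mutations = List.of(new Action(1, "row-b"), new Action(3, "row-d"));
        for (Result r : doBatchOp(mutations, true)) {
            System.out.println("index=" + r.index + " ok=" + r.ok + " error=" + r.error);
        }
    }
}
```

If the failure path used the loop position instead of `a.index`, the client could attribute the exception to the wrong actions and treat failed writes as successful, which would be consistent with rows that are referenced but never defined.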

      1. hbase-9730_v2.patch (1 kB, Enis Soztutar)
      2. hbase-9730_v1-0.96.patch (5 kB, Enis Soztutar)
      3. hbase-9730_v1.patch (11 kB, Enis Soztutar)

        Activity

        Enis Soztutar created issue -
        Enis Soztutar added a comment -

        Attaching a patch, which fixes this condition. Still running the tests to see whether the theory in the description is correct.

        Enis Soztutar made changes -
        Field Original Value New Value
        Attachment hbase-9730_v1-0.96.patch [ 12607486 ]
        Attachment hbase-9730_v1.patch [ 12607487 ]
        Sergey Shelukhin added a comment -

        According to our observation, the IOException from batch mutate might be a RegionNotServedException when the region is closing, so this seems consistent with what we observed.
        I can give +1, but my understanding of the RPC code is limited.

        Jeffrey Zhong added a comment -

        Good catch! Enis Soztutar, I think your patch fixes the root cause. +1

        Sergey Shelukhin added a comment -

        Sorry, -0 for the temporary check. What are the perf implications of that?

        Enis Soztutar added a comment -

        "Sorry, -0 for the temporary check. What are the perf implications of that?"

        I think we should check the region boundaries for a region operation. We cannot rely on the client to tell the truth. We do the check in HRegion.processRowsWithLocks() already. However, agreed that we should do this in a separate patch.
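
As an illustration of the boundary check discussed here, the following is a minimal sketch rather than the HBase implementation. The class and method names are hypothetical. It assumes the usual convention that a region covers the half-open key range [start key, end key), that an empty start or end key means "unbounded" on that side, and that row keys compare as unsigned bytes in lexicographic order.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RegionRangeSketch {
    /**
     * Returns true if row lies in [startKey, endKey).
     * An empty startKey or endKey is treated as unbounded on that side (assumption).
     */
    public static boolean rowInRange(byte[] startKey, byte[] endKey, byte[] row) {
        boolean atOrAfterStart =
            startKey.length == 0 || Arrays.compareUnsigned(row, startKey) >= 0;
        boolean beforeEnd =
            endKey.length == 0 || Arrays.compareUnsigned(row, endKey) < 0;
        return atOrAfterStart && beforeEnd;
    }

    /** Small helper for readable examples. */
    public static byte[] b(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] start = b("b");
        byte[] end = b("d");
        System.out.println(rowInRange(start, end, b("c"))); // inside the range
        System.out.println(rowInRange(start, end, b("d"))); // end key is exclusive
        System.out.println(rowInRange(start, new byte[0], b("zzz"))); // unbounded end
    }
}
```

The check is two byte-array comparisons per row, which is the cost the performance question above is about; whether that matters for large batches would need measuring.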

        Enis Soztutar added a comment -

        v2 only contains changes to HRS. ITBLL with CM has succeeded for 2 iterations so far on a 7-node cluster, whereas the tests used to fail on the first iteration. So we are more confident that this is the root cause.

        How can we write a unit test for this?

        Stack, Nicolas Liochon, do you mind taking a look?

        Enis Soztutar made changes -
        Attachment hbase-9730_v2.patch [ 12607493 ]
        stack added a comment -

        That's dumb (my mistake). Thanks, Enis Soztutar. +1

        Enis Soztutar made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Enis Soztutar added a comment -

        Thanks for looking. Will commit this if hadoopqa reports ok.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12607493/hbase-9730_v2.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 hadoop1.0. The patch compiles against the hadoop 1.0 profile.

        +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 lineLengths. The patch does not introduce lines longer than 100

        -1 site. The patch appears to cause mvn site goal to fail.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.TestCheckTestClasses

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7498//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7498//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7498//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7498//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7498//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7498//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7498//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7498//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7498//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7498//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7498//console

        This message is automatically generated.

        Nicolas Liochon added a comment -

        +1 for v2. Well done Enis.

        stack added a comment -

        org.apache.hadoop.hbase.TestCheckTestClasses failed for an unrelated Lars patch too just now, so I doubt it is you... It actually fails locally for me. Let me fix.

        stack added a comment -

        My checkin of a test broke TestCheckTestClasses last night. Fixing now.

        Enis Soztutar added a comment -

        Committed v2 to 0.96 and trunk. Thanks for looking.

        Enis Soztutar made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Hadoop Flags Reviewed [ 10343 ]
        Resolution Fixed [ 1 ]
        Hudson added a comment -

        SUCCESS: Integrated in hbase-0.96 #130 (See https://builds.apache.org/job/hbase-0.96/130/)
        HBASE-9730 Exceptions in multi operations are not handled correctly (enis: rev 1530729)

        • /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
        Hudson added a comment -

        SUCCESS: Integrated in HBase-TRUNK #4603 (See https://builds.apache.org/job/HBase-TRUNK/4603/)
        HBASE-9730 Exceptions in multi operations are not handled correctly (enis: rev 1530728)

        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
        Hudson added a comment -

        SUCCESS: Integrated in hbase-0.96-hadoop2 #82 (See https://builds.apache.org/job/hbase-0.96-hadoop2/82/)
        HBASE-9730 Exceptions in multi operations are not handled correctly (enis: rev 1530729)

        • /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
        Hudson added a comment -

        FAILURE: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #785 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/785/)
        HBASE-9730 Exceptions in multi operations are not handled correctly (enis: rev 1530728)

        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java

        Transition                   Time In Source Status   Execution Times   Last Executer   Last Execution Date
        Open -> Patch Available      5h 31m                  1                 Enis Soztutar   09/Oct/13 07:57
        Patch Available -> Resolved  10h 45m                 1                 Enis Soztutar   09/Oct/13 18:43

          People

          • Assignee: Enis Soztutar
          • Reporter: Enis Soztutar
          • Votes: 0
          • Watchers: 9
