Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.98.0
    • Fix Version/s: 0.99.0, 0.98.2
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      When I run ITBLL (from trunk tip), the meta scan sometimes hangs while the cluster is being rolling restarted. When this happens, the master uses about 1000% CPU, so it looks like there is an infinite loop somewhere. The logs show nothing interesting except that some meta scanner RPC calls timed out. jstack shows that the 10 high-QoS RPC handlers are all busy with meta scanning.

      However, if I run it again without HBASE-10018, things are fine. I suspect it has something to do with the small/reverse scan.

      By the way, I see this problem even with log replay off and hfile version = 2.

      1. hbase-10949.patch
        3 kB
        Jimmy Xiang
      2. master-jstack.log
        119 kB
        Jimmy Xiang

        Issue Links

          Activity

          Jimmy Xiang created issue -
          Jimmy Xiang made changes -
          Field Original Value New Value
          Attachment master-jstack.log [ 12639452 ]
          Jimmy Xiang made changes -
          Link This issue relates to HBASE-10018 [ HBASE-10018 ]
          Jimmy Xiang made changes -
          Priority Critical [ 2 ] Blocker [ 1 ]
          Jimmy Xiang made changes -
          Assignee Jimmy Xiang [ jxiang ]
          Jimmy Xiang added a comment -

          Looked into it and found that StoreFileScanner#backwardSeek always returns true in some cases. I am testing a patch so that it no longer returns true when seek() returns false.

          Jimmy Xiang made changes -
          Summary: Meta scan could hang -> Reversed scan could hang
          Jimmy Xiang made changes -
          Priority Blocker [ 1 ] Critical [ 2 ]
          Jimmy Xiang made changes -
          Affects Version/s 0.98.0 [ 12323143 ]
          Jimmy Xiang added a comment -

          I think I found the problem. In StoreFileScanner, we always use Bytes.compareTo(), which is wrong. We should use the right comparator. For meta, the comparator is a little different; that's why the problem shows up with meta scans.
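
To illustrate why the choice of comparator matters, here is a minimal, self-contained sketch. It is not HBase code and not the attached patch: the class and method names are hypothetical, and the key layout "table,startKey,regionId" is an assumption made for the example. It compares two such keys once with a plain lexicographic byte comparison (the kind of ordering Bytes.compareTo() produces) and once with a field-aware comparison that splits on the first and last comma. When the start key contains a byte smaller than ',', the two orderings disagree, which is the kind of mismatch that could make a scanner keep seeking without making progress.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative sketch only; hypothetical names, not part of HBase. */
public class ComparatorMismatchSketch {

    /** Plain unsigned lexicographic comparison of two byte arrays. */
    public static int rawCompare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    /**
     * Field-aware comparison for keys assumed to look like
     * "table,startKey,regionId" (at least two commas are required).
     * Compares the table, then the start key, then the region id.
     */
    public static int fieldAwareCompare(byte[] a, byte[] b) {
        int aFirst = indexOf(a, (byte) ',');
        int bFirst = indexOf(b, (byte) ',');
        int aLast = lastIndexOf(a, (byte) ',');
        int bLast = lastIndexOf(b, (byte) ',');
        int c = rawCompare(Arrays.copyOfRange(a, 0, aFirst),
                           Arrays.copyOfRange(b, 0, bFirst));
        if (c != 0) {
            return c;
        }
        c = rawCompare(Arrays.copyOfRange(a, aFirst + 1, aLast),
                       Arrays.copyOfRange(b, bFirst + 1, bLast));
        if (c != 0) {
            return c;
        }
        return rawCompare(Arrays.copyOfRange(a, aLast + 1, a.length),
                          Arrays.copyOfRange(b, bLast + 1, b.length));
    }

    private static int indexOf(byte[] a, byte target) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] == target) {
                return i;
            }
        }
        return -1;
    }

    private static int lastIndexOf(byte[] a, byte target) {
        for (int i = a.length - 1; i >= 0; i--) {
            if (a[i] == target) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Start key "a" versus start key "a\0b": the second contains a byte below ','.
        byte[] k1 = "t,a,1".getBytes(StandardCharsets.ISO_8859_1);
        byte[] k2 = "t,a\u0000b,2".getBytes(StandardCharsets.ISO_8859_1);

        // Raw bytes: ',' (0x2C) is greater than 0x00, so k1 sorts after k2.
        System.out.println("raw:         " + Integer.signum(rawCompare(k1, k2)));
        // Field-aware: start key "a" is a prefix of "a\0b", so k1 sorts before k2.
        System.out.println("field-aware: " + Integer.signum(fieldAwareCompare(k1, k2)));
    }
}
```

Running main prints 1 for the raw comparison and -1 for the field-aware one. Code that seeks with one ordering and then checks its position with the other can therefore conclude it is still "after" the target and seek again.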

          stack added a comment -

          That is a classic Jimmy Xiang

          Jimmy Xiang added a comment -

          Attached a patch that uses the comparator of the HFile (reader). ITBLL works fine with this patch.

          Jimmy Xiang made changes -
          Attachment hbase-10949.patch [ 12639933 ]
          Jimmy Xiang made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Ted Yu added a comment -

          +1

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12639933/hbase-10949.patch
          against trunk revision .
          ATTACHMENT ID: 12639933

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 lineLengths. The patch does not introduce lines longer than 100

          +1 site. The mvn site goal succeeds with this patch.

          +1 core tests. The patch passed unit tests in .

          Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/9270//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9270//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9270//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9270//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9270//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9270//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9270//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9270//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9270//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9270//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
          Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/9270//console

          This message is automatically generated.

          Jimmy Xiang added a comment -

          Integrated into trunk and 0.98. Thanks Ted for the review.

          Jimmy Xiang made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags Reviewed [ 10343 ]
          Fix Version/s 0.98.2 [ 12326505 ]
          Resolution Fixed [ 1 ]
          Hudson added a comment -

          SUCCESS: Integrated in HBase-TRUNK #5082 (See https://builds.apache.org/job/HBase-TRUNK/5082/)
          HBASE-10949 Reversed scan could hang (jxiang: rev 1587061)

          • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
          Hudson added a comment -

          SUCCESS: Integrated in HBase-0.98 #275 (See https://builds.apache.org/job/HBase-0.98/275/)
          HBASE-10949 Reversed scan could hang (jxiang: rev 1587070)

          • /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
          Hudson added a comment -

          FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #259 (See https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/259/)
          HBASE-10949 Reversed scan could hang (jxiang: rev 1587070)

          • /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
          ramkrishna.s.vasudevan added a comment -

          Nice.

          Enis Soztutar added a comment -

          Closing this issue after 0.99.0 release.

          Enis Soztutar made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Transition                    Time In Source Status   Execution Times   Last Executer   Last Execution Date
          Open -> Patch Available       2d 19h 59m              1                 Jimmy Xiang     12/Apr/14 15:55
          Patch Available -> Resolved   1d 3h 42m               1                 Jimmy Xiang     13/Apr/14 19:38
          Resolved -> Closed            314d 4h 55m             1                 Enis Soztutar   21/Feb/15 23:33

            People

            • Assignee: Jimmy Xiang
            • Reporter: Jimmy Xiang
            • Votes: 0
            • Watchers: 8

              Dates

              • Created:
                Updated:
                Resolved:
