HBASE-5578

NPE when regionserver reported server load, caused rs stop.

    Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 0.92.0
    • Fix Version/s: 0.92.3
    • Component/s: regionserver
    • Labels:
      None
    • Environment:

      centos6.2 hadoop-1.0.0 hbase-0.92.0

      Description

      The regionserver log:
      2012-03-11 11:55:37,808 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server data3,60020,1331286604591: Unhandled exception: null
      java.lang.NullPointerException
      at org.apache.hadoop.hbase.regionserver.Store.getTotalStaticIndexSize(Store.java:1788)
      at org.apache.hadoop.hbase.regionserver.HRegionServer.createRegionLoad(HRegionServer.java:994)
      at org.apache.hadoop.hbase.regionserver.HRegionServer.buildServerLoad(HRegionServer.java:800)
      at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:776)
      at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:678)
      at java.lang.Thread.run(Thread.java:662)
      2012-03-11 11:55:37,808 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: []
      2012-03-11 11:55:37,808 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: requestsPerSecond=1687, numberOfOnlineRegions=37, numberOfStores=37, numberOfStorefiles=144, storefileIndexSizeMB=2, rootIndexSizeKB=2362, totalStaticIndexSizeKB=229808, totalStaticBloomSizeKB=2166296, memstoreSizeMB=2854, readRequestsCount=1352673, writeRequestsCount=113137586, compactionQueueSize=8, flushQueueSize=3, usedHeapMB=7359, maxHeapMB=12999, blockCacheSizeMB=32.31, blockCacheFreeMB=3867.52, blockCacheCount=38, blockCacheHitCount=87713, blockCacheMissCount=22144560, blockCacheEvictedCount=122, blockCacheHitRatio=0%, blockCacheHitCachingRatio=99%, hdfsBlocksLocalityIndex=100
      2012-03-11 11:55:37,992 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Unhandled exception: null

        Activity

        stack added a comment -

        That's pretty nasty. Can you reproduce it easily, Storm? Is it happening all the time? Was this a fresh start-up or something? Did you have no content under /hbase?

        Storm Lee added a comment -

        I hit it only once, so it may not be easy to reproduce. My HBase cluster has 9 RSes, and only one crashed. It was a fresh start-up with nothing under /hbase. After that I used a tool to put data continuously (using HTable.put()). This RS had been running for about 45 hours when the NPE happened. Compactions and splits were also running the whole time.

        stack added a comment -

        How about this patch? It goes through Store and checks every Reader instance for null before using it. We were already doing this in half the cases.

        It converts the NPE into a null warning, so we don't crash. It puts off having to spend time on why the Reader is null at particular junctures.

        Should this go into 0.94?
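
        The approach described in this comment can be sketched roughly as follows. This is a minimal illustration, not the attached patch: `NullSafeIndexSize`, its nested `StoreFile` and `Reader` classes, and `totalStaticIndexSize` are hypothetical stand-ins that only mirror the shape of the call in the stack trace, and the sketch assumes that a store file's reader can be null while the server load is being computed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical stand-ins for illustration only; these are NOT the real HBase classes.
class NullSafeIndexSize {
    private static final Logger LOG = Logger.getLogger(NullSafeIndexSize.class.getName());

    /** Stand-in for a store file reader that reports an index size. */
    static final class Reader {
        private final long indexSize;

        Reader(long indexSize) {
            this.indexSize = indexSize;
        }

        long getUncompressedDataIndexSize() {
            return indexSize;
        }
    }

    /** Stand-in for a store file whose reader may be null (the thread leaves the cause open). */
    static final class StoreFile {
        private final String name;
        private final Reader reader; // may be null

        StoreFile(String name, Reader reader) {
            this.name = name;
            this.reader = reader;
        }

        String getName() {
            return name;
        }

        Reader getReader() {
            return reader;
        }
    }

    /** Sums index sizes, logging a warning and skipping any file whose reader is null. */
    static long totalStaticIndexSize(List<StoreFile> files) {
        long size = 0;
        for (StoreFile f : files) {
            Reader r = f.getReader();
            if (r == null) {
                // Warn instead of dereferencing null, so the caller does not abort.
                LOG.warning("StoreFile " + f.getName() + " has a null Reader; skipping");
                continue;
            }
            size += r.getUncompressedDataIndexSize();
        }
        return size;
    }

    public static void main(String[] args) {
        List<StoreFile> files = new ArrayList<>();
        files.add(new StoreFile("a", new Reader(100)));
        files.add(new StoreFile("b", null)); // would trigger an NPE without the null check
        files.add(new StoreFile("c", new Reader(25)));
        System.out.println(totalStaticIndexSize(files)); // prints 125
    }
}
```

        As the comment notes, a guard like this only keeps the server from aborting. The reported total may undercount while a reader is null, and the reason for the null reader still needs investigating.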

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12518726/5589.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 162 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.regionserver.TestKeepDeletes
        org.apache.hadoop.hbase.regionserver.TestMinVersions
        org.apache.hadoop.hbase.regionserver.TestCompaction

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1209//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1209//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1209//console

        This message is automatically generated.

        Jean-Daniel Cryans added a comment -

        Unmarking patch available, it's a year old.


          People

          • Assignee:
            Unassigned
            Reporter:
            Storm Lee
          • Votes:
            0
            Watchers:
            1
