HBASE-5578

NPE when regionserver reported server load, caused rs stop.

    Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 0.92.0
    • Fix Version/s: 0.92.3
    • Component/s: regionserver
    • Labels:
      None
    • Environment:

      centos6.2 hadoop-1.0.0 hbase-0.92.0

      Description

      The regionserver log:
      2012-03-11 11:55:37,808 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server data3,60020,1331286604591: Unhandled exception: null
      java.lang.NullPointerException
      at org.apache.hadoop.hbase.regionserver.Store.getTotalStaticIndexSize(Store.java:1788)
      at org.apache.hadoop.hbase.regionserver.HRegionServer.createRegionLoad(HRegionServer.java:994)
      at org.apache.hadoop.hbase.regionserver.HRegionServer.buildServerLoad(HRegionServer.java:800)
      at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:776)
      at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:678)
      at java.lang.Thread.run(Thread.java:662)
      2012-03-11 11:55:37,808 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: []
      2012-03-11 11:55:37,808 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: requestsPerSecond=1687, numberOfOnlineRegions=37, numberOfStores=37, numberOfStorefiles=144, storefileIndexSizeMB=2, rootIndexSizeKB=2362, totalStaticIndexSizeKB=229808, totalStaticBloomSizeKB=2166296, memstoreSizeMB=2854, readRequestsCount=1352673, writeRequestsCount=113137586, compactionQueueSize=8, flushQueueSize=3, usedHeapMB=7359, maxHeapMB=12999, blockCacheSizeMB=32.31, blockCacheFreeMB=3867.52, blockCacheCount=38, blockCacheHitCount=87713, blockCacheMissCount=22144560, blockCacheEvictedCount=122, blockCacheHitRatio=0%, blockCacheHitCachingRatio=99%, hdfsBlocksLocalityIndex=100
      2012-03-11 11:55:37,992 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Unhandled exception: null

        Activity

        Storm Lee created issue -
        stack added a comment -

        That's pretty nasty. Can you reproduce it easily, Storm? Is it happening all the time? Was this a fresh startup? Do you have no content under /hbase?

        stack made changes -
        Field Original Value New Value
        Priority Major [ 3 ] Critical [ 2 ]
        Storm Lee added a comment -

        I hit it only once, so it may not be easy to reproduce. My HBase cluster has 9 RSes, and only one crashed. It was a fresh startup with nothing under /hbase. I then used a tool to put data continuously (using HTable.put()). This RS had been running for about 45 hours when the NPE happened. Compactions and splits were also running the whole time.

        stack added a comment -

        How about this. Goes through Store and checks all Reader instances for null before using. We were doing this in half the cases already.

        Converts the NPE into a null warning. Means we don't crash. Puts off having to spend time on why the Reader is null at particular junctures.

        Should go into 0.94?
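
A minimal sketch of the kind of null guard described in the comment above, not the attached patch itself. The `Reader` and `StoreFile` classes and both method names are hypothetical stand-ins rather than the real HBase classes; the sketch only assumes that a store file's reader can be null at the moment the server load is computed.

```java
import java.util.List;
import java.util.logging.Logger;

public class NullSafeStoreSketch {
    private static final Logger LOG =
        Logger.getLogger(NullSafeStoreSketch.class.getName());

    /** Hypothetical stand-in for a store file reader. */
    static class Reader {
        private final long indexSize;
        Reader(long indexSize) { this.indexSize = indexSize; }
        long getUncompressedDataIndexSize() { return indexSize; }
    }

    /** Hypothetical stand-in for a store file; its reader may be null. */
    static class StoreFile {
        private final Reader reader;
        StoreFile(Reader reader) { this.reader = reader; }
        Reader getReader() { return reader; }
    }

    /** Unguarded sum: throws NullPointerException if any reader is null. */
    static long unguardedTotal(List<StoreFile> files) {
        long size = 0;
        for (StoreFile f : files) {
            size += f.getReader().getUncompressedDataIndexSize();
        }
        return size;
    }

    /** Guarded sum: logs a warning and skips files whose reader is null. */
    static long guardedTotal(List<StoreFile> files) {
        long size = 0;
        for (StoreFile f : files) {
            Reader r = f.getReader();
            if (r == null) {
                LOG.warning("StoreFile has a null Reader; skipping it");
                continue;
            }
            size += r.getUncompressedDataIndexSize();
        }
        return size;
    }
}
```

With a guard like this, a null reader produces a log warning and an undercounted metric instead of an unhandled exception in the reporting path. It does not explain why the reader was null in the first place.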

        stack made changes -
        Attachment 5589.txt [ 12518726 ]
        stack made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        stack made changes -
        Fix Version/s 0.92.2 [ 12319888 ]
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12518726/5589.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 162 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.regionserver.TestKeepDeletes
        org.apache.hadoop.hbase.regionserver.TestMinVersions
        org.apache.hadoop.hbase.regionserver.TestCompaction

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1209//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1209//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1209//console

        This message is automatically generated.

        stack made changes -
        Fix Version/s 0.92.3 [ 12321692 ]
        Fix Version/s 0.92.2 [ 12319888 ]
        Jean-Daniel Cryans added a comment -

        Unmarking patch available, it's a year old.

        Jean-Daniel Cryans made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Transition                 Time In Source Status   Execution Times   Last Executer        Last Execution Date
        Open -> Patch Available    2d 11h 9m               1                 stack                16/Mar/12 20:09
        Patch Available -> Open    413d 3h 8m              1                 Jean-Daniel Cryans   04/May/13 00:17

          People

          • Assignee:
            Unassigned
            Reporter:
            Storm Lee
          • Votes:
            0
            Watchers:
            1
