HBASE-1327

NPE in HRegionServer$ToDoEntry.access$100

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.19.1
    • Fix Version/s: 0.20.2, 0.90.0
    • Component/s: None
    • Labels:
      None

      Description

      From hbase-users@:

      From: Rakhi Khatwani
      Subject: Null pointer exception

      My HBase suddenly goes down. When I check the logs, I get the following exception at the master node's region server:

      2009-04-15 08:37:09,158 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: Unhandled exception. Aborting...
      java.lang.NullPointerException
      at org.apache.hadoop.hbase.regionserver.HRegionServer$ToDoEntry.access$100(HRegionServer.java:1201)
      at org.apache.hadoop.hbase.regionserver.HRegionServer.housekeeping(HRegionServer.java:1058)
      at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:450)
      at java.lang.Thread.run(Thread.java:619)
      2009-04-15 08:37:09,159 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: request=27, regions=42, stores=202, storefiles=247, storefileIndexSize=0, memcacheSize=0, usedHeap=116, maxHeap=888

        Activity

        Hyunsik Choi added a comment -

        I have experienced the same situation. In my case, this error occurs at master node's region server.


        09/10/29 12:09:28 INFO regionserver.HRegionServer: Telling master at 163.X.X.X :60000 that we are up
        09/10/29 12:09:28 FATAL regionserver.HRegionServer: Unhandled exception. Aborting...
        java.lang.NullPointerException
        at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:459)
        at java.lang.Thread.run(Thread.java:619)
        09/10/29 12:09:28 INFO regionserver.HRegionServer: Dump of metrics: request=0.0, regions=0, stores=0, storefiles=0, storefileIndexSize=0, memstoreSize=0, usedHeap=31, maxHeap=987, blockCacheSize=1700064, blockCacheFree=205393696, blockCacheCount=0, blockCacheHitRatio=0
        09/10/29 12:09:28 INFO ipc.HBaseServer: Stopping server on 60020
        09/10/29 12:09:28 INFO ipc.HBaseServer: IPC Server handler 0 on 60020: exiting
        09/10/29 12:09:28 INFO ipc.HBaseServer: Stopping IPC Server listener on 60020
        09/10/29 12:09:28 INFO ipc.HBaseServer: IPC Server handler 8 on 60020: exiting
        09/10/29 12:09:28 INFO ipc.HBaseServer: IPC Server handler 3 on 60020: exiting
        09/10/29 12:09:28 INFO ipc.HBaseServer: IPC Server handler 2 on 60020: exiting
        09/10/29 12:09:28 INFO ipc.HBaseServer: IPC Server handler 1 on 60020: exiting
        09/10/29 12:09:28 INFO ipc.HBaseServer: IPC Server handler 7 on 60020: exiting
        09/10/29 12:09:28 INFO ipc.HBaseServer: Stopping IPC Server Responder
        09/10/29 12:09:28 INFO ipc.HBaseServer: IPC Server handler 9 on 60020: exiting
        09/10/29 12:09:28 INFO ipc.HBaseServer: IPC Server handler 6 on 60020: exiting
        09/10/29 12:09:28 INFO ipc.HBaseServer: IPC Server handler 5 on 60020: exiting
        09/10/29 12:09:28 INFO ipc.HBaseServer: IPC Server handler 4 on 60020: exiting
        09/10/29 12:09:28 INFO regionserver.HRegionServer: Stopping infoServer
        09/10/29 12:09:29 INFO regionserver.LogFlusher: regionserver/163.X.X.X:60020.logFlusher exiting
        09/10/29 12:09:29 INFO regionserver.MemStoreFlusher: regionserver/163.X.X.X:60020.cacheFlusher exiting
        ...

        stack added a comment -

        Moving to 0.20.2. Looks like a simple enough synchronization problem: someone is removing from the queue while it's being iterated.

        Jean-Daniel Cryans added a comment -

        Hyunsik's problem was fixed by HBASE-1946.

        I haven't seen the original problem since this jira was created. Should we close it?

        Jean-Daniel Cryans added a comment -

        Also, this issue was filed on 0.19.1, when we were doing:

        for (ToDoEntry e: this.toDo) {
          if (e.msg.isType(HMsg.Type.MSG_REGION_OPEN)) {
            addProcessingMessage(e.msg.getRegionInfo());
          }
        } 
        

        Now we check if the message is null. In both NPEs, this bug was already fixed so I'll mark it as duplicate.
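
        For illustration only, here is a minimal sketch of what a null-guarded version of that loop could look like. It is not the actual HBase patch: `Msg`, `Type`, `ToDoEntry`, and `regionsToOpen` are hypothetical stand-ins so the example compiles on its own, and it guards both the entry and its message so that neither being null can throw.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for HMsg and HRegionServer.ToDoEntry; not HBase code.
final class NullGuardSketch {
    enum Type { MSG_REGION_OPEN, MSG_REGION_CLOSE }

    static final class Msg {
        private final Type type;
        private final String regionInfo;

        Msg(Type type, String regionInfo) {
            this.type = type;
            this.regionInfo = regionInfo;
        }

        boolean isType(Type other) { return type == other; }
        String getRegionInfo() { return regionInfo; }
    }

    static final class ToDoEntry {
        final Msg msg; // may be null in this sketch

        ToDoEntry(Msg msg) { this.msg = msg; }
    }

    // Collects the regions named by MSG_REGION_OPEN entries,
    // skipping null entries and entries whose message is null.
    static List<String> regionsToOpen(Iterable<ToDoEntry> toDo) {
        List<String> regions = new ArrayList<>();
        for (ToDoEntry e : toDo) {
            if (e == null || e.msg == null) {
                continue; // the unguarded loop would throw an NPE here
            }
            if (e.msg.isType(Type.MSG_REGION_OPEN)) {
                regions.add(e.msg.getRegionInfo());
            }
        }
        return regions;
    }

    public static void main(String[] args) {
        List<ToDoEntry> toDo = Arrays.asList(
            null,
            new ToDoEntry(null),
            new ToDoEntry(new Msg(Type.MSG_REGION_OPEN, "region-a")));
        System.out.println(regionsToOpen(toDo));
    }
}
```

        A guard like this only stops the crash. It does not address whatever left the entry or message null in the first place, such as a concurrent removal from the queue.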


          People

          • Assignee: Unassigned
          • Reporter: Andrew Purtell
          • Votes: 0
          • Watchers: 0
