HBase
  1. HBase
  2. HBASE-10

HRegionServer hangs upon exit due to DFSClient Exception

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.2.0
    • Component/s: regionserver
    • Labels:
      None
    • Environment:

      CentOS 5

      Description

      Several HRegionServers hang around indefinitely well after the HMaster has exited. This was triggered executing $HBASE_HOME/bin/stop-hbase.sh. The HMaster exists fine, but here is what happens on one of the HRegionServers:

      2008-01-02 18:54:01,907 INFO org.apache.hadoop.hbase.HRegionServer: Got regionserver stop message
      2008-01-02 18:54:01,907 INFO org.apache.hadoop.hbase.Leases: regionserver/0.0.0.0:60020 closing leases
      2008-01-02 18:54:01,907 INFO org.apache.hadoop.hbase.Leases$LeaseMonitor: regionserver/0.0.0.0:60020.leaseChecker exiting
      2008-01-02 18:54:01,908 INFO org.apache.hadoop.hbase.Leases: regionserver/0.0.0.0:60020 closed leases
      2008-01-02 18:54:01,908 INFO org.apache.hadoop.ipc.Server: Stopping server on 60020
      2008-01-02 18:54:01,908 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 60020: exiting
      2008-01-02 18:54:01,908 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 60020: exiting
      2008-01-02 18:54:01,908 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 60020: exiting
      2008-01-02 18:54:01,908 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 60020: exiting
      2008-01-02 18:54:01,908 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 60020: exiting
      2008-01-02 18:54:01,908 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 60020: exiting
      2008-01-02 18:54:01,908 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 60020: exiting
      2008-01-02 18:54:01,908 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 60020: exiting
      2008-01-02 18:54:01,908 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 60020: exiting
      2008-01-02 18:54:01,909 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 60020
      2008-01-02 18:54:01,908 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020: exiting
      2008-01-02 18:54:01,909 INFO org.apache.hadoop.hbase.HRegionServer: Stopping infoServer
      2008-01-02 18:54:01,909 DEBUG org.mortbay.util.Container: Stopping org.mortbay.jetty.Server@62c09554
      2008-01-02 18:54:01,909 DEBUG org.mortbay.util.ThreadedServer: closing ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=60030]
      2008-01-02 18:54:01,909 DEBUG org.mortbay.util.ThreadedServer: IGNORED
      java.net.SocketException: Socket closed
      at java.net.PlainSocketImpl.socketAccept(Native Method)
      at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:384)
      at java.net.ServerSocket.implAccept(ServerSocket.java:453)
      at java.net.ServerSocket.accept(ServerSocket.java:421)
      at org.mortbay.util.ThreadedServer.acceptSocket(ThreadedServer.java:432)
      at org.mortbay.util.ThreadedServer$Acceptor.run(ThreadedServer.java:631)
      2008-01-02 18:54:01,910 INFO org.mortbay.util.ThreadedServer: Stopping Acceptor ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=60030]
      2008-01-02 18:54:01,910 DEBUG org.mortbay.util.ThreadedServer: Self connect to close listener /127.0.0.1:60030
      2008-01-02 18:54:01,911 DEBUG org.mortbay.util.ThreadedServer: problem stopping acceptor /127.0.0.1:
      2008-01-02 18:54:01,911 DEBUG org.mortbay.util.ThreadedServer: problem stopping acceptor /127.0.0.1:
      java.net.ConnectException: Connection refused
      at java.net.PlainSocketImpl.socketConnect(Native Method)
      at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
      at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
      at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
      at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
      at java.net.Socket.connect(Socket.java:519)
      at java.net.Socket.connect(Socket.java:469)
      at java.net.Socket.<init>(Socket.java:366)
      at java.net.Socket.<init>(Socket.java:209)
      at org.mortbay.util.ThreadedServer$Acceptor.forceStop(ThreadedServer.java:682)
      at org.mortbay.util.ThreadedServer.stop(ThreadedServer.java:557)
      at org.mortbay.http.SocketListener.stop(SocketListener.java:211)
      at org.mortbay.http.HttpServer.doStop(HttpServer.java:781)
      at org.mortbay.util.Container.stop(Container.java:154)
      at org.apache.hadoop.hbase.util.InfoServer.stop(InfoServer.java:237)
      at org.apache.hadoop.hbase.HRegionServer.run(HRegionServer.java:835)
      at java.lang.Thread.run(Thread.java:619)
      2008-01-02 18:54:01,911 INFO org.mortbay.http.SocketListener: Stopped SocketListener on 0.0.0.0:60030
      2008-01-02 18:54:01,912 DEBUG org.mortbay.util.Container: Stopping HttpContext[/static,/static]
      2008-01-02 18:54:01,912 DEBUG org.mortbay.http.handler.AbstractHttpHandler: Stopped org.mortbay.http.handler.ResourceHandler in HttpContext[/static,/static]
      2008-01-02 18:54:02,039 INFO org.mortbay.util.Container: Stopped HttpContext[/static,/static]
      2008-01-02 18:54:02,039 DEBUG org.mortbay.util.Container: Stopping HttpContext[/logs,/logs]
      2008-01-02 18:54:02,039 DEBUG org.mortbay.http.handler.AbstractHttpHandler: Stopped org.mortbay.http.handler.ResourceHandler in HttpContext[/logs,/logs]
      2008-01-02 18:54:02,154 INFO org.mortbay.util.Container: Stopped HttpContext[/logs,/logs]
      2008-01-02 18:54:02,154 DEBUG org.mortbay.util.Container: Stopping WebApplicationContext[/,/]
      2008-01-02 18:54:02,154 DEBUG org.mortbay.util.Container: Stopping org.mortbay.jetty.servlet.WebApplicationHandler@7ec5495e
      2008-01-02 18:54:02,155 INFO org.mortbay.util.Container: Stopped org.mortbay.jetty.servlet.WebApplicationHandler@7ec5495e
      2008-01-02 18:54:02,277 DEBUG org.mortbay.jetty.servlet.AbstractSessionManager: Session scavenger exited
      2008-01-02 18:54:02,278 DEBUG org.mortbay.util.Container: remove component: org.mortbay.jetty.servlet.WebApplicationHandler@7ec5495e
      2008-01-02 18:54:02,278 INFO org.mortbay.util.Container: Stopped WebApplicationContext[/,/]
      2008-01-02 18:54:02,278 INFO org.mortbay.util.Container: Stopped org.mortbay.jetty.Server@62c09554
      2008-01-02 18:54:02,278 DEBUG org.apache.hadoop.hbase.HRegionServer: closing region spider_pages,10_131455761,1198140179439
      2008-01-02 18:54:02,278 INFO org.apache.hadoop.hbase.HRegionServer: regionserver/0.0.0.0:60020.cacheFlusher exiting
      2008-01-02 18:54:02,278 INFO org.apache.hadoop.hbase.HRegionServer: regionserver/0.0.0.0:60020.compactor exiting
      2008-01-02 18:54:02,278 INFO org.apache.hadoop.hbase.HRegionServer: regionserver/0.0.0.0:60020.splitter exiting
      2008-01-02 18:54:02,279 DEBUG org.apache.hadoop.hbase.HStore: closed spider_pages,10_131455761,1198140179439/search (1501227429/search)
      2008-01-02 18:54:02,279 DEBUG org.apache.hadoop.hbase.HStore: closed spider_pages,10_131455761,1198140179439/profile (1501227429/profile)
      2008-01-02 18:54:02,279 DEBUG org.apache.hadoop.hbase.HStore: closed spider_pages,10_131455761,1198140179439/meta (1501227429/meta)
      2008-01-02 18:54:02,279 INFO org.apache.hadoop.hbase.HRegion: closed spider_pages,10_131455761,1198140179439
      2008-01-02 18:54:02,279 DEBUG org.apache.hadoop.hbase.HRegionServer: closing region spider_pages,10_486594261,1198319654267
      2008-01-02 18:54:02,280 DEBUG org.apache.hadoop.hbase.HStore: closed spider_pages,10_486594261,1198319654267/search (364081590/search)
      2008-01-02 18:54:02,280 DEBUG org.apache.hadoop.hbase.HStore: closed spider_pages,10_486594261,1198319654267/profile (364081590/profile)
      2008-01-02 18:54:02,280 DEBUG org.apache.hadoop.hbase.HStore: closed spider_pages,10_486594261,1198319654267/meta (364081590/meta)
      2008-01-02 18:54:02,280 INFO org.apache.hadoop.hbase.HRegion: closed spider_pages,10_486594261,1198319654267
      ...
      ... this closing of regions goes on for a while
      ...
      ... the following continues until a kill -9
      ...
      2008-01-02 20:39:20,552 INFO org.apache.hadoop.fs.DFSClient: Could not obtain block blk_5124700261538503923 from any node: java.io.IOException: No live nodes contain current block
      2008-01-02 20:40:23,556 INFO org.apache.hadoop.fs.DFSClient: Could not obtain block blk_5124700261538503923 from any node: java.io.IOException: No live nodes contain current block
      2008-01-02 20:41:26,560 INFO org.apache.hadoop.fs.DFSClient: Could not obtain block blk_5124700261538503923 from any node: java.io.IOException: No live nodes contain current block
      2008-01-02 20:42:29,566 INFO org.apache.hadoop.fs.DFSClient: Could not obtain block blk_5124700261538503923 from any node: java.io.IOException: No live nodes contain current block
      2008-01-02 20:43:32,571 INFO org.apache.hadoop.fs.DFSClient: Could not obtain block blk_5124700261538503923 from any node: java.io.IOException: No live nodes contain current block
      ...

      1. patch.txt
        15 kB
        Jim Kellerman

        Issue Links

          Activity

          No work has yet been logged on this issue.

            People

            • Assignee:
              Jim Kellerman
              Reporter:
              Chris Kline
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development