HBase / HBASE-19768

RegionServer startup failing when DN is dead

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0-beta-2, 2.0.0
    • Component/s: None
    • Labels: None

    Description

      When starting HBase, if the DataNode (DN) on the same host is dead but the NameNode has not yet detected it, the RegionServer fails to start:

      515691223393/node8.distparser.com%2C16020%2C1515691223393.1515691238778 failed, retry = 7
      org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: syscall:getsockopt(..) failed: Connection refused: /192.168.23.2:50010
      	at org.apache.hbase.thirdparty.io.netty.channel.unix.Socket.finishConnect(..)(Unknown Source)
      Caused by: org.apache.hbase.thirdparty.io.netty.channel.unix.Errors$NativeConnectException: syscall:getsockopt(..) failed: Connection refused
      	... 1 more
      

      HBase also hangs when asked to stop:

      hbase@node2:~/hbase-2.0.0-beta-1$ bin/stop-hbase.sh 
      stopping hbase....................................................................................................................................................................................................^C
      hbase@node2:~/hbase-2.0.0-beta-1$ bin/stop-hbase.sh 
      stopping hbase..................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
      SLF4J: Class path contains multiple SLF4J bindings.
      SLF4J: Found binding in [jar:file:/home/hbase/hbase-2.0.0-beta-1/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
      SLF4J: Found binding in [jar:file:/home/hbase/hbase-2.0.0-beta-1/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
      SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
      SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
      

      Most interestingly, it seems to fail the same way even after the DN has been declared dead on the HDFS side:

      515692041367/node8.distparser.com%2C16020%2C1515692041367.1515692057716 failed, retry = 4
      org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: syscall:getsockopt(..) failed: Connection refused: /192.168.23.2:50010
      	at org.apache.hbase.thirdparty.io.netty.channel.unix.Socket.finishConnect(..)(Unknown Source)
      Caused by: org.apache.hbase.thirdparty.io.netty.channel.unix.Errors$NativeConnectException: syscall:getsockopt(..) failed: Connection refused
      	... 1 more
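
The connection failures quoted above can be checked independently of HBase with a plain TCP probe against the DataNode transfer port. The following is only a minimal sketch for reproducing the symptom, not part of the attached patch: the helper name `datanode_port_open` is hypothetical, and the port 50010 is taken from the log (it is the Hadoop 2.x default for `dfs.datanode.address`).

```python
import socket


def datanode_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only tests reachability and
    does not speak the HDFS data transfer protocol.
    """
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError (the error seen in the log),
        # timeouts, and name resolution failures.
        return False
```

With the address from the log, `datanode_port_open("192.168.23.2", 50010)` would be expected to return `False` while the DataNode process is down, whether or not the NameNode has marked it dead yet.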
      

      Attachments

        1. HBASE-19768.patch
          23 kB
          Duo Zhang

          People

            Assignee: Duo Zhang (zhangduo)
            Reporter: Jean-Marc Spaggiari (jmspaggi)
            Votes: 0
            Watchers: 8
