Hadoop HDFS / HDFS-13900

NameNode: Unable to trigger a roll of the active NN


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Duplicate

    Description

      I have backported the multi-standby NameNode feature to our own HDFS version and found an issue with edit log rolling.

      Steps to reproduce:

      1. Original state:

         nn1 active
         nn2 standby
         nn3 standby

      2. Stop nn1.

      3. New state:

         nn1 stopped
         nn2 active
         nn3 standby

      4. nn3 is unable to trigger a roll of the active NN. As the log below shows, it still tries to reach nn1, which is stopped:

      [2018-08-22T10:33:38.025+08:00] [WARN] namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java 307) [Edit log tailer] : Unable to trigger a roll of the active NN
      java.net.ConnectException: Call From <nn3 hostname> to <nn1 hostname> failed on connection exception: java.net.ConnectException: Connection refused; For more details see:http://wiki.apache.org/hadoop/ConnectionRefused
      at sun.reflect.GeneratedConstructorAccessor17.newInstance(Unknown Source)
      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
      at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
      at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:782)
      at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:722)
      at org.apache.hadoop.ipc.Client.call(Client.java:1536)
      at org.apache.hadoop.ipc.Client.call(Client.java:1463)
      at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:237)
      at com.sun.proxy.$Proxy16.rollEditLog(Unknown Source)
      at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.rollEditLog(NamenodeProtocolTranslatorPB.java:148)
      at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$2.doWork(EditLogTailer.java:301)
      at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$2.doWork(EditLogTailer.java:298)
      at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$MultipleNameNodeProxy.call(EditLogTailer.java:414)
      at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java:304)
      at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.access$800(EditLogTailer.java:69)
      at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:346)
      at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:315)
      at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:332)
      at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415)
      at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:328)
      Caused by: java.net.ConnectException: Connection refused
      at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
      at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
      at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
      at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:521)
      at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:485)
      at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:658)
      at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:756)
      at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:419)
      at org.apache.hadoop.ipc.Client.getConnection(Client.java:1585)
      at org.apache.hadoop.ipc.Client.call(Client.java:1502)
      ... 14 more
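
For illustration, here is a minimal sketch of the behaviour the steps above would expect from a standby: try each of the other NameNodes in turn and move on when one refuses the connection. This is not the Hadoop `EditLogTailer` code. The class `RollTriggerSketch`, the interface `RollCall`, and the method `triggerRoll` are hypothetical names, and the RPC is replaced by a stub.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; not the Hadoop EditLogTailer implementation. */
class RollTriggerSketch {

    /** Hypothetical stand-in for the rollEditLog RPC sent to one NameNode. */
    interface RollCall {
        void rollEditLog(String nameNode) throws IOException;
    }

    /**
     * Tries each candidate NameNode in order and returns the first one that
     * accepts the roll request. A failure on one candidate (for example a
     * refused connection) does not abort the attempt; the next one is tried.
     */
    static String triggerRoll(List<String> candidates, RollCall rpc) throws IOException {
        IOException last = null;
        for (String nn : candidates) {
            try {
                rpc.rollEditLog(nn);
                return nn;
            } catch (IOException e) {
                last = e; // remember the failure and fall through to the next candidate
            }
        }
        throw new IOException("Unable to trigger a roll of the active NN", last);
    }

    public static void main(String[] args) throws IOException {
        // From nn3's point of view the other NameNodes are nn1 and nn2.
        List<String> others = Arrays.asList("nn1", "nn2");
        String rolled = triggerRoll(others, nn -> {
            if (nn.equals("nn1")) {
                // nn1 is stopped in the scenario above
                throw new ConnectException("Connection refused");
            }
        });
        System.out.println("Roll accepted by " + rolled); // prints "Roll accepted by nn2"
    }
}
```

In this sketch the refused connection to nn1 is recorded and nn2 is tried next, so the roll succeeds after the failover. An error is raised only when every candidate fails.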

            People

              Assignee: Unassigned
              Reporter: liuhongtong