Apache Ozone / HDDS-2047

Datanodes fail to come up after 10 retries in a secure environment

Details

    Description

      10:06:36.585 PM    ERROR    HddsDatanodeService    
      Error while storing SCM signed certificate.
      java.net.ConnectException: Call From jmccarthy-ozone-secure-2.vpc.cloudera.com/10.65.50.127 to jmccarthy-ozone-secure-1.vpc.cloudera.com:9961 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
          at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
          at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
          at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
          at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
          at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
          at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:755)
          at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1515)
          at org.apache.hadoop.ipc.Client.call(Client.java:1457)
          at org.apache.hadoop.ipc.Client.call(Client.java:1367)
          at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
          at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
          at com.sun.proxy.$Proxy15.getDataNodeCertificate(Unknown Source)
          at org.apache.hadoop.hdds.protocolPB.SCMSecurityProtocolClientSideTranslatorPB.getDataNodeCertificateChain(SCMSecurityProtocolClientSideTranslatorPB.java:156)
          at org.apache.hadoop.ozone.HddsDatanodeService.getSCMSignedCert(HddsDatanodeService.java:278)
          at org.apache.hadoop.ozone.HddsDatanodeService.initializeCertificateClient(HddsDatanodeService.java:248)
          at org.apache.hadoop.ozone.HddsDatanodeService.start(HddsDatanodeService.java:211)
          at org.apache.hadoop.ozone.HddsDatanodeService.start(HddsDatanodeService.java:168)
          at org.apache.hadoop.ozone.HddsDatanodeService.call(HddsDatanodeService.java:143)
          at org.apache.hadoop.ozone.HddsDatanodeService.call(HddsDatanodeService.java:70)
          at picocli.CommandLine.execute(CommandLine.java:1173)
          at picocli.CommandLine.access$800(CommandLine.java:141)
          at picocli.CommandLine$RunLast.handle(CommandLine.java:1367)
          at picocli.CommandLine$RunLast.handle(CommandLine.java:1335)
          at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:1243)
          at picocli.CommandLine.parseWithHandlers(CommandLine.java:1526)
          at picocli.CommandLine.parseWithHandler(CommandLine.java:1465)
          at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:65)
          at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:56)
          at org.apache.hadoop.ozone.HddsDatanodeService.main(HddsDatanodeService.java:126)
      Caused by: java.net.ConnectException: Connection refused
          at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
          at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
          at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
          at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
          at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:690)
          at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:794)
          at org.apache.hadoop.ipc.Client$Connection.access$3700(Client.java:411)
          at org.apache.hadoop.ipc.Client.getConnection(Client.java:1572)
          at org.apache.hadoop.ipc.Client.call(Client.java:1403)
          ... 21 more
      

      Datanodes try to get the SCM-signed certificate only 10 times, at an interval of 1 second. When SCM takes a little longer to come up, the datanodes throw an exception and fail to start.
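
To illustrate the behavior described above, here is a minimal sketch of a bounded retry loop with a fixed sleep between attempts. It is not the Ozone code: `FixedSleepRetry` and its `call` method are hypothetical names made up for this example, and the certificate request is represented by a generic `Callable`. With `maxAttempts = 10` and `sleepMillis = 1000`, the caller gives up after roughly 10 seconds, which is the window in which SCM has to become reachable.

```java
import java.util.concurrent.Callable;

/**
 * Hypothetical helper (not part of Ozone or Hadoop): retries a call a bounded
 * number of times, sleeping a fixed interval between failed attempts.
 */
final class FixedSleepRetry {

    private FixedSleepRetry() { }

    /**
     * Runs {@code task} until it succeeds or {@code maxAttempts} attempts
     * have failed, then rethrows the last failure.
     */
    static <T> T call(Callable<T> task, int maxAttempts, long sleepMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (InterruptedException e) {
                // Do not keep retrying if the thread was interrupted.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                last = e;
                // Sleep only if another attempt will follow.
                if (attempt < maxAttempts) {
                    Thread.sleep(sleepMillis);
                }
            }
        }
        // All attempts failed: surface the last error, e.g. a ConnectException.
        throw last;
    }
}
```

Under these assumptions, raising `maxAttempts` or `sleepMillis` (or making both configurable) widens the window a slow-starting SCM has before the datanode gives up.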

            People

              xyao Xiaoyu Yao
              vivekratnavel Vivek Ratnavel Subramanian
              Votes: 0
              Watchers: 5

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 50m