Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-15646 Track failing tests in HDFS
  3. HDFS-7527

TestDecommission.testIncludeByRegistrationName fails intermittently

    XMLWordPrintableJSON

Details

    • Sub-task
    • Status: Patch Available
    • Major
    • Resolution: Unresolved
    • None
    • None
    • namenode, test

    Description

      https://builds.apache.org/job/Hadoop-Hdfs-trunk/1974/testReport/

      Error Message

      test timed out after 360000 milliseconds
      Stacktrace

      java.lang.Exception: test timed out after 360000 milliseconds
      at java.lang.Thread.sleep(Native Method)
      at org.apache.hadoop.hdfs.TestDecommission.testIncludeByRegistrationName(TestDecommission.java:957)

      2014-12-15 12:00:19,958 ERROR datanode.DataNode (BPServiceActor.java:run(836)) - Initialization failed for Block pool BP-887397778-67.195.81.153-1418644469024 (Datanode Uuid null) service to localhost/127.0.0.1:40565 Datanode denied communication with namenode because the host is not in the include-list: DatanodeRegistration(127.0.0.1, datanodeUuid=55d8cbff-d8a3-4d6d-ab64-317fff0ee279, infoPort=54318, infoSecurePort=0, ipcPort=43726, storageInfo=lv=-56;cid=testClusterID;nsid=903754315;c=0)
      at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:915)
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:4402)
      at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1196)
      at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:92)
      at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:26296)
      at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
      at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:966)
      at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2127)
      at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2123)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:415)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1669)
      at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2121)

      2014-12-15 12:00:29,087 FATAL datanode.DataNode (BPServiceActor.java:run(841)) - Initialization failed for Block pool BP-887397778-67.195.81.153-1418644469024 (Datanode Uuid null) service to localhost/127.0.0.1:40565. Exiting.
      java.io.IOException: DN shut down before block pool connected
      at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.retrieveNamespaceInfo(BPServiceActor.java:186)
      at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:216)
      at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:829)
      at java.lang.Thread.run(Thread.java:745)

      Found by tool proposed in HADOOP-11045:

      [yzhang@localhost jenkinsftf]$ ./determine-flaky-tests-hadoop.py -j Hadoop-Hdfs-trunk -n 5 | tee bt.log
      ****Recently FAILED builds in url: https://builds.apache.org//job/Hadoop-Hdfs-trunk
      THERE ARE 4 builds (out of 6) that have failed tests in the past 5 days, as listed below:

      ===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1974/testReport (2014-12-15 03:30:01)
      Failed test: org.apache.hadoop.hdfs.TestDecommission.testIncludeByRegistrationName
      Failed test: org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager.testNumVersionsReportedCorrect
      Failed test: org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testUpdatePipeline
      ===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1972/testReport (2014-12-13 10:32:27)
      Failed test: org.apache.hadoop.hdfs.TestDecommission.testIncludeByRegistrationName
      ===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1971/testReport (2014-12-13 03:30:01)
      Failed test: org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testUpdatePipeline
      ===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1969/testReport (2014-12-11 03:30:01)
      Failed test: org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager.testNumVersionsReportedCorrect
      Failed test: org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testUpdatePipeline
      Failed test: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover.testFailoverRightBeforeCommitSynchronization

      Among 6 runs examined, all failed tests <#failedRuns: testName>:
      3: org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testUpdatePipeline
      2: org.apache.hadoop.hdfs.TestDecommission.testIncludeByRegistrationName
      2: org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager.testNumVersionsReportedCorrect
      1: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover.testFailoverRightBeforeCommitSynchronization

      Attachments

        1. HDFS-7527.001.patch
          4 kB
          Binglin Chang
        2. HDFS-7527.002.patch
          2 kB
          Binglin Chang
        3. HDFS-7527.003.patch
          2 kB
          Ajay Kumar

        Issue Links

          Activity

            People

              decster Binglin Chang
              yzhangal Yongjun Zhang
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated: