Hadoop HDFS / HDFS-12994

TestReconstructStripedFile.testNNSendsErasureCodingTasks fails due to socket timeout


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.1.0, 3.0.1
    • Component/s: erasure-coding
    • Labels: None

    Description

      Occasionally, testNNSendsErasureCodingTasks fails due to a socket timeout:

      2017-12-26 20:35:19,961 [StripedBlockReconstruction-0] INFO  datanode.DataNode (StripedBlockReader.java:createBlockReader(132)) - Exception while creating remote block reader, datanode 127.0.0.1:34145
      java.net.ConnectException: Connection refused
              at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
              at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
              at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
              at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
              at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
              at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReader.newConnectedPeer(StripedBlockReader.java:148)
              at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReader.createBlockReader(StripedBlockReader.java:123)
              at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReader.<init>(StripedBlockReader.java:83)
              at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.createReader(StripedReader.java:169)
              at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.initReaders(StripedReader.java:150)
              at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.init(StripedReader.java:133)
              at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:56)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
              at java.lang.Thread.run(Thread.java:748)
      

      The target datanode, 127.0.0.1:34145, was removed in the test about one second earlier:

      2017-12-26 20:35:18,710 [Thread-2393] INFO  net.NetworkTopology (NetworkTopology.java:remove(219)) - Removing a node: /default-rack/127.0.0.1:34145
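
      The failure mode in the two log excerpts above (a reader connecting to an address where nothing is listening any more) can be reproduced outside Hadoop. The following is a minimal sketch using only the JDK socket API, not code from the patch or from HDFS; the class name `RefusedConnectionDemo` and both helper methods are hypothetical names introduced for illustration. It shows only the "Connection refused" case seen in the stack trace, not a timeout.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class; not part of Hadoop or of the HDFS-12994 patch.
public class RefusedConnectionDemo {

    /** Binds an ephemeral local port, then closes it so that nothing is listening on it. */
    static int reserveAndReleasePort() throws IOException {
        try (ServerSocket server = new ServerSocket(0)) {
            return server.getLocalPort();
        }
    }

    /**
     * Tries to connect to host:port and reports whether the attempt
     * failed with java.net.ConnectException (e.g. "Connection refused").
     * Other I/O errors, such as a connect timeout, are propagated.
     */
    static boolean isConnectionRefused(String host, int port, int timeoutMs)
            throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return false; // something is still listening on the port
        } catch (ConnectException e) {
            return true;  // no listener: analogous to the removed datanode
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a datanode address that has just been shut down.
        int port = reserveAndReleasePort();
        System.out.println("refused = " + isConnectionRefused("127.0.0.1", port, 1000));
    }
}
```

      On a typical machine the connect attempt to the released port is refused immediately, which mirrors the reconstruction reader contacting a datanode that the test has already removed. There is a small window in which another process could reuse the port, so the sketch is illustrative rather than a reliable test fixture.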
      

      Attachments

        1. HDFS-12994.00.patch
          1 kB
          Lei (Eddy) Xu
        2. HDFS-12994.01.patch
          2 kB
          Lei (Eddy) Xu

          People

            Assignee: Lei (Eddy) Xu (eddyxu)
            Reporter: Lei (Eddy) Xu (eddyxu)
            Votes: 0
            Watchers: 3
