  1. Hadoop HDFS
  2. HDFS-10730

Fix some failed tests due to BindException


Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 3.0.0-alpha2
    • None
    • None
    • Reviewed

    Description

      In HDFS-10723, kihwal suggested that

      it is not a good idea to hard-code or reuse the same port number in unit tests, because the Jenkins slave can run multiple jobs at the same time.

      I then looked through recent Jenkins builds for tests that failed for this reason.
      I found two failed tests: TestFileChecksum.testStripedFileChecksumWithMissedDataBlocks1 (https://builds.apache.org/job/PreCommit-HDFS-Build/16301/testReport/) and TestDecommissionWithStriped.testDecommissionWithURBlockForSameBlockGroup (https://builds.apache.org/job/PreCommit-HDFS-Build/16257/testReport/).

      The stack traces:

      java.net.BindException: Problem binding to [localhost:57241] java.net.BindException: Address already in use; For more details see:  http://wiki.apache.org/hadoop/BindException
      	at sun.nio.ch.Net.bind0(Native Method)
      	at sun.nio.ch.Net.bind(Net.java:433)
      	at sun.nio.ch.Net.bind(Net.java:425)
      	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
      	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
      	at org.apache.hadoop.ipc.Server.bind(Server.java:538)
      	at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:811)
      	at org.apache.hadoop.ipc.Server.<init>(Server.java:2611)
      	at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:958)
      	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:562)
      	at org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:537)
      	at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:800)
      	at org.apache.hadoop.hdfs.server.datanode.DataNode.initIpcServer(DataNode.java:953)
      	at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1361)
      	at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:488)
      	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2658)
      	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2546)
      	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2593)
      	at org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2259)
      	at org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2298)
      	at org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2278)
      	at org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:482)
      	at org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocks1(TestFileChecksum.java:182)
      
      java.net.BindException: Problem binding to [localhost:54191] java.net.BindException: Address already in use; For more details see:  http://wiki.apache.org/hadoop/BindException
      	at sun.nio.ch.Net.bind0(Native Method)
      	at sun.nio.ch.Net.bind(Net.java:433)
      	at sun.nio.ch.Net.bind(Net.java:425)
      	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
      	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
      	at org.apache.hadoop.ipc.Server.bind(Server.java:530)
      	at org.apache.hadoop.ipc.Server.bind(Server.java:519)
      	at org.apache.hadoop.hdfs.net.TcpPeerServer.<init>(TcpPeerServer.java:52)
      	at org.apache.hadoop.hdfs.server.datanode.DataNode.initDataXceiver(DataNode.java:1082)
      	at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1348)
      	at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:488)
      	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2658)
      	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2546)
      	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2593)
      	at org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2259)
      	at org.apache.hadoop.hdfs.TestDecommissionWithStriped.testDecommissionWithURBlockForSameBlockGroup(TestDecommissionWithStriped.java:255)
      

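      Both failures occur in MiniDFSCluster.restartDataNode, when the restarted DataNode tries to bind to its previous port. We can fix this by changing the value of the keepPort parameter from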

      cluster.restartDataNode(dnp, true);
      

      to

      cluster.restartDataNode(dnp, false);
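
The effect of the flag can be sketched with plain java.net sockets. This is only an illustration, not the MiniDFSCluster code: the class KeepPortSketch and its methods restart and tryRestart are hypothetical, and the sketch assumes that keepPort=false lets the restarted server bind to any free port (port 0) instead of its previous one. The second socket stands in for another Jenkins job that has taken the port in the meantime.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.ServerSocket;

public class KeepPortSketch {

    // Hypothetical stand-in for a server restart (not the MiniDFSCluster implementation).
    // keepPort == true requests the previous port again;
    // keepPort == false requests port 0, so the OS picks any free port.
    static ServerSocket restart(int previousPort, boolean keepPort) throws IOException {
        int requested = keepPort ? previousPort : 0;
        return new ServerSocket(requested, 50, InetAddress.getLoopbackAddress());
    }

    // Returns the port the restarted server bound to, or -1 if the bind
    // failed with a BindException (for example "Address already in use").
    static int tryRestart(int previousPort, boolean keepPort) throws IOException {
        try (ServerSocket restarted = restart(previousPort, keepPort)) {
            return restarted.getLocalPort();
        } catch (BindException e) {
            return -1;
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulates another job that holds the port the server used before its restart.
        try (ServerSocket otherJob = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            int previousPort = otherJob.getLocalPort();
            System.out.println("keepPort=true  -> " + tryRestart(previousPort, true));
            System.out.println("keepPort=false -> " + tryRestart(previousPort, false));
        }
    }
}
```

With the port held by another listener, the keepPort=true attempt is expected to fail to bind, while the keepPort=false attempt should get a different free port. A test that restarts with keepPort=false must not assume the DataNode keeps its old address afterwards.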
      

      Attachments

        1. HDFS-10730.001.patch
          2 kB
          Yiqun Lin
        2. HDFS-10730.002.patch
          2 kB
          Yiqun Lin

        Activity

          People

            linyiqun Yiqun Lin
            linyiqun Yiqun Lin
            Votes: 0
            Watchers: 4
