Hadoop Common / HADOOP-2713

Unit test fails on Windows: org.apache.hadoop.dfs.TestDatanodeDeath

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.16.0
    • Fix Version/s: 0.16.0
    • Component/s: None
    • Labels: None
    • Environment: Windows

    Description

      Unit test fails consistently on Windows with a timeout:

      Test: org.apache.hadoop.dfs.TestDatanodeDeath

      Here is a snippet of the console log:
      [junit] File simpletest.dat has 3 blocks: The 0 block has only 2 replicas but is expected to have 3 replicas.
      [junit] File simpletest.dat has 3 blocks: The 0 block has only 2 replicas but is expected to have 3 replicas.
      [junit] File simpletest.dat has 3 blocks: The 0 block has only 2 replicas but is expected to have 3 replicas.
      [junit] File simpletest.dat has 3 blocks: The 0 block has only 2 replicas but is expected to have 3 replicas.
      [junit] 2008-01-25 09:10:47,841 WARN fs.FSNamesystem (PendingReplicationBlocks.java:pendingReplicationCheck(209)) - PendingReplicationMonitor timed out block blk_2509851293741663991
      [junit] File simpletest.dat has 3 blocks: The 0 block has only 2 replicas but is expected to have 3 replicas.
      [junit] File simpletest.dat has 3 blocks: The 0 block has only 2 replicas but is expected to have 3 replicas.
      [junit] File simpletest.dat has 3 blocks: The 0 block has only 2 replicas but is expected to have 3 replicas.
      [junit] File simpletest.dat has 3 blocks: The 0 block has only 2 replicas but is expected to have 3 replicas.
      [junit] File simpletest.dat has 3 blocks: The 0 block has only 2 replicas but is expected to have 3 replicas.
      [junit] 2008-01-25 09:10:52,839 INFO dfs.StateChange (FSNamesystem.java:pendingTransfers(3249)) - BLOCK* NameSystem.pendingTransfer: ask 127.0.0.1:3773 to replicate blk_2509851293741663991 to datanode(s) 127.0.0.1:3767
      [junit] 2008-01-25 09:10:53,526 INFO dfs.DataNode (DataNode.java:transferBlocks(786)) - 127.0.0.1:3773 Starting thread to transfer block blk_2509851293741663991 to 127.0.0.1:3767
      [junit] 2008-01-25 09:10:53,526 INFO dfs.DataNode (DataNode.java:writeBlock(1035)) - Receiving block blk_2509851293741663991 from /127.0.0.1
      [junit] 2008-01-25 09:10:53,526 INFO dfs.DataNode (DataNode.java:writeBlock(1147)) - writeBlock blk_2509851293741663991 received exception java.io.IOException: Block blk_2509851293741663991 has already been started (though not completed), and thus cannot be created.
      [junit] 2008-01-25 09:10:53,526 ERROR dfs.DataNode (DataNode.java:run(948)) - 127.0.0.1:3767:DataXceiver: java.io.IOException: Block blk_2509851293741663991 has already been started (though not completed), and thus cannot be created.
      [junit] at org.apache.hadoop.dfs.FSDataset.writeToBlock(FSDataset.java:638)
      [junit] at org.apache.hadoop.dfs.DataNode$BlockReceiver.<init>(DataNode.java:1949)
      [junit] at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:1060)
      [junit] at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:925)
      [junit] at java.lang.Thread.run(Thread.java:595)

      [junit] 2008-01-25 09:10:53,526 WARN dfs.DataNode (DataNode.java:run(2366)) - 127.0.0.1:3773:Failed to transfer blk_2509851293741663991 to 127.0.0.1:3767 got java.net.SocketException: Software caused connection abort: socket write error
      [junit] at java.net.SocketOutputStream.socketWrite0(Native Method)
      [junit] at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
      [junit] at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
      [junit] at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
      [junit] at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
      [junit] at java.io.DataOutputStream.flush(DataOutputStream.java:106)
      [junit] at org.apache.hadoop.dfs.DataNode$BlockSender.sendBlock(DataNode.java:1621)
      [junit] at org.apache.hadoop.dfs.DataNode$DataTransfer.run(DataNode.java:2360)
      [junit] at java.lang.Thread.run(Thread.java:595)

      [junit] File simpletest.dat has 3 blocks: The 0 block has only 2 replicas but is expected to have 3 replicas.
      [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
      [junit] Test org.apache.hadoop.dfs.TestDatanodeDeath FAILED (timeout)
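
The repeated "has only 2 replicas but is expected to have 3 replicas" lines suggest the test polls the replica count until it reaches the expected value, and the run ends when the harness timeout fires. As a rough illustration only, here is a minimal sketch of that kind of bounded polling loop. The class name `ReplicationWait`, the method `waitForReplicas`, and all of its parameters are hypothetical and are not taken from the Hadoop test code.

```java
import java.util.function.IntSupplier;

// Hypothetical illustration, not the actual TestDatanodeDeath code.
class ReplicationWait {

    /**
     * Polls currentReplicas until it reports at least `expected` replicas
     * or until timeoutMillis has elapsed.
     * Returns true if the expected count was reached, false on timeout
     * or if the waiting thread was interrupted.
     */
    public static boolean waitForReplicas(IntSupplier currentReplicas, int expected,
                                          long timeoutMillis, long pollMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            int actual = currentReplicas.getAsInt();
            if (actual >= expected) {
                return true;
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            // Same shape as the message repeated in the console log.
            System.out.println("Block has only " + actual
                + " replicas but is expected to have " + expected + " replicas.");
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Replica count stuck at 2, as in the log: the wait gives up.
        boolean stuck = waitForReplicas(() -> 2, 3, 200, 50);
        System.out.println("stuck case reached target: " + stuck);

        // Third replica appears after a few polls: the wait succeeds.
        int[] polls = {0};
        boolean recovered = waitForReplicas(() -> ++polls[0] > 3 ? 3 : 2, 3, 2000, 10);
        System.out.println("recovering case reached target: " + recovered);
    }
}
```

With a loop of this shape, a re-replication attempt that keeps failing (as the "has already been started (though not completed)" exception in the log indicates) leaves the count at 2 until the deadline passes.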

      Attachments

        1. TestDatanodeDeath.patch (5 kB, Dhruba Borthakur)
        2. TestDatanodeDeath.patch (3 kB, Dhruba Borthakur)
        3. TestDatanodeDeath.patch (1 kB, Dhruba Borthakur)

          People

            Assignee: Dhruba Borthakur (dhruba)
            Reporter: Mukund Madhugiri (mukundm)