Hadoop HDFS / HDFS-4600

HDFS file append failing in multinode cluster


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 2.0.3-alpha
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

      NOTE: the following only happens in a fully distributed setup (core-site.xml and hdfs-site.xml are attached)

      Steps to reproduce:

      $ javac -cp /usr/lib/hadoop/client/\* X.java
      $ echo aaaaa > a.txt
      $ hadoop fs -ls /tmp/a.txt
      ls: `/tmp/a.txt': No such file or directory
      $ HADOOP_CLASSPATH=`pwd` hadoop X /tmp/a.txt
      13/03/13 16:05:14 WARN hdfs.DFSClient: DataStreamer Exception
      java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[10.10.37.16:50010, 10.80.134.126:50010], original=[10.10.37.16:50010, 10.80.134.126:50010]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:793)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:858)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:964)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:470)
      Exception in thread "main" java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[10.10.37.16:50010, 10.80.134.126:50010], original=[10.10.37.16:50010, 10.80.134.126:50010]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:793)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:858)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:964)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:470)
      13/03/13 16:05:14 ERROR hdfs.DFSClient: Failed to close file /tmp/a.txt
      java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[10.10.37.16:50010, 10.80.134.126:50010], original=[10.10.37.16:50010, 10.80.134.126:50010]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:793)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:858)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:964)
      	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:470)
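
      For reference, a minimal sketch of what the attached X.java plausibly does, assuming it first creates /tmp/a.txt from the local a.txt and then re-opens it for append (the actual attachment may differ in details; the local file name and the appended bytes here are illustrative):

      import java.io.FileInputStream;
      import java.io.IOException;

      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.fs.FSDataOutputStream;
      import org.apache.hadoop.fs.FileSystem;
      import org.apache.hadoop.fs.Path;
      import org.apache.hadoop.io.IOUtils;

      public class X {
        public static void main(String[] args) throws IOException {
          Path path = new Path(args[0]);            // e.g. /tmp/a.txt
          Configuration conf = new Configuration(); // picks up core-site.xml / hdfs-site.xml
          FileSystem fs = FileSystem.get(conf);

          // First pass: create the file in HDFS from the local a.txt.
          try (FSDataOutputStream out = fs.create(path);
               FileInputStream in = new FileInputStream("a.txt")) {
            IOUtils.copyBytes(in, out, conf, false);
          }

          // Second pass: re-open the same file for append. On the multinode
          // cluster above, this is where the DataStreamer fails while setting
          // up the write pipeline (setupPipelineForAppendOrRecovery).
          try (FSDataOutputStream out = fs.append(path)) {
            out.writeBytes("bbbbb\n");
          }
        }
      }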
      

      Given that the file actually does get created:

      $ hadoop fs -ls /tmp/a.txt
      Found 1 items
      -rw-r--r--   3 root hadoop          6 2013-03-13 16:05 /tmp/a.txt
      

this feels like a regression in append functionality.
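
      For what it's worth, the exception itself points at 'dfs.client.block.write.replace-datanode-on-failure.policy'. With only two datanodes in the pipeline (as in the trace above) there is no spare node for the client to swap in once it marks one as bad, so the DEFAULT policy aborts the append. A sketch of the usual client-side mitigation, which relaxes that check rather than fixing append itself (both keys are real HDFS client settings; NEVER trades pipeline durability for availability, so it only makes sense on very small clusters):

      // Configure the client before opening the stream, e.g. at the top of
      // X.java's main(), in place of the bare `new Configuration()`:
      Configuration conf = new Configuration();
      conf.setBoolean("dfs.client.block.write.replace-datanode-on-failure.enable", true);
      conf.set("dfs.client.block.write.replace-datanode-on-failure.policy", "NEVER");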

      Attachments

        1. X.java
          2 kB
          Roman Shaposhnik
        2. hdfs-site.xml
          2 kB
          Roman Shaposhnik
        3. core-site.xml
          2 kB
          Roman Shaposhnik


            People

              Assignee: Unassigned
              Reporter: Roman Shaposhnik (rvs)
              Votes: 0
              Watchers: 11
