[HADOOP-128] Failure to replicate dfs block kills client

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.1.1
    • Fix Version/s: 0.2.0
    • Component/s: None
    • Labels: None
    • Environment:
      ~200-node Linux cluster (kernel 2.6, Red Hat, 2 hyper-threaded CPUs)

    Description

      When the datanode gets an exception while replicating a block, logged as:

      060407 155835 13 DataXCeiver
      java.io.EOFException
      at java.io.DataInputStream.readFully(DataInputStream.java:178)
      at java.io.DataInputStream.readLong(DataInputStream.java:380)
      at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:462)
      at java.lang.Thread.run(Thread.java:595)

      The datanode then closes the client's connection, causing the client to get an IOException from:

      at java.io.DataInputStream.readFully(DataInputStream.java:178)
      at java.io.DataInputStream.readLong(DataInputStream.java:380)
      at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.internalClose(DFSClient.java:883)
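
      The net effect is that a failure while mirroring a block to a downstream
      datanode also kills the client's write. The attached
      datanode-mirroring.patch presumably reworks this path; purely as an
      illustration of the failure-isolation idea, here is a minimal sketch in
      Java (the class and member names are hypothetical, not the actual
      DataNode code):

          // Hypothetical sketch: a mirror failure is logged and dropped
          // instead of being propagated back to the writing client.
          import java.io.DataOutputStream;
          import java.io.IOException;

          class MirroredBlockWriter {
              private final DataOutputStream local;  // local block file
              private DataOutputStream mirror;       // downstream datanode, may be null

              MirroredBlockWriter(DataOutputStream local, DataOutputStream mirror) {
                  this.local = local;
                  this.mirror = mirror;
              }

              // Write a packet to the local copy, and best-effort to the mirror.
              void writePacket(byte[] buf, int off, int len) throws IOException {
                  local.write(buf, off, len);  // a local failure stays fatal
                  if (mirror != null) {
                      try {
                          mirror.write(buf, off, len);
                          mirror.flush();
                      } catch (IOException e) {
                          // e.g. the EOFException above on the mirror's side:
                          // drop the mirror and keep the client's stream open.
                          System.err.println("mirror write failed, dropping mirror: " + e);
                          mirror = null;
                      }
                  }
              }
          }

      The design point is that only a local write failure can fail the client;
      losing a mirror degrades replication but no longer closes the connection
      that produced the stack trace above.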

    Attachments

    1. datanode-mirroring.patch (31 kB, Owen O'Malley)
    2. datanode.no-ws-diff (10 kB, Owen O'Malley)
    3. conf.patch (0.6 kB, Owen O'Malley)

    People

    • Assignee: owen.omalley (Owen O'Malley)
    • Reporter: owen.omalley (Owen O'Malley)
    • Votes: 0
    • Watchers: 0
