HDFS-1158 (Hadoop HDFS)

HDFS-457 increases the chances of losing blocks

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.21.0
    • Fix Version/s: None
    • Component/s: datanode
    • Labels: None

      Description

      Whenever we restart a cluster, there is a chance of losing some blocks if more than three datanodes fail to come back up.
      HDFS-457 increases this chance by keeping datanodes running even when

      1. the /tmp disk goes read-only
      2. /disk0, which is used for storing the PID file, goes read-only
        and probably in other cases as well.

      In our environment, /tmp and /disk0 are on the same device.

      When we then try to restart such a datanode, it fails with either
      1)

      2010-05-15 05:45:45,575 WARN org.mortbay.log: tmpdir
      java.io.IOException: Read-only file system
              at java.io.UnixFileSystem.createFileExclusively(Native Method)
              at java.io.File.checkAndCreate(File.java:1704)
              at java.io.File.createTempFile(File.java:1792)
              at java.io.File.createTempFile(File.java:1828)
              at org.mortbay.jetty.webapp.WebAppContext.getTempDirectory(WebAppContext.java:745)
      

      or
      2)

      hadoop-daemon.sh: line 117: /disk/0/hadoop-datanode....com.out: Read-only file system
      hadoop-daemon.sh: line 118: /disk/0/hadoop-datanode.pid: Read-only file system
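
Both failures above come down to a directory that can no longer be written. As a minimal sketch (not part of Hadoop; the class name `WritableDirProbe` and its methods are hypothetical), a check run before restarting a datanode could probe each directory the same way the first trace fails, by attempting `File.createTempFile` in it:

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Hadoop class: reports directories that
// cannot be written to (for example because the file system is read-only).
final class WritableDirProbe {

    /** Returns true if a temporary file can be created and deleted in dir. */
    static boolean isWritable(File dir) {
        if (!dir.isDirectory()) {
            return false;
        }
        try {
            File probe = File.createTempFile("probe", ".tmp", dir);
            return probe.delete();
        } catch (IOException e) {
            // Covers errors such as "Read-only file system".
            return false;
        }
    }

    /** Returns the subset of dirs that failed the write probe. */
    static List<File> unwritable(List<File> dirs) {
        List<File> bad = new ArrayList<>();
        for (File dir : dirs) {
            if (!isWritable(dir)) {
                bad.add(dir);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        List<File> dirs = new ArrayList<>();
        for (String arg : args) {
            dirs.add(new File(arg));
        }
        List<File> bad = unwritable(dirs);
        for (File dir : bad) {
            System.err.println("not writable: " + dir);
        }
        // Non-zero exit status signals that a restart would likely fail.
        System.exit(bad.isEmpty() ? 0 : 1);
    }
}
```

Passing the temp, PID, and log directories as arguments would flag a node that is still running but could not be started again.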
      

      I can recover the missing blocks, but it takes some time.

      We are also losing track of block movements, since the log directory can go read-only as well while the datanode continues running.

      For the 0.21 release, can we revert HDFS-457 or make this behavior configurable?
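
To illustrate what "configurable" could mean here, the following is a rough sketch only. `VolumeFailurePolicy` and its threshold are hypothetical names, not an existing Hadoop class or setting, and the sketch assumes that a threshold of 0 approximates the behavior before HDFS-457 (stop on any failed disk):

```java
// Hypothetical policy object, not part of Hadoop: decides whether a
// datanode should stop instead of continuing with failed disks.
final class VolumeFailurePolicy {

    // Number of failed disks the operator is willing to tolerate (assumed knob).
    private final int toleratedFailures;

    VolumeFailurePolicy(int toleratedFailures) {
        if (toleratedFailures < 0) {
            throw new IllegalArgumentException("toleratedFailures must be >= 0");
        }
        this.toleratedFailures = toleratedFailures;
    }

    /** Returns true when the number of failed disks exceeds the tolerated count. */
    boolean shouldShutDown(int failedVolumes) {
        return failedVolumes > toleratedFailures;
    }
}
```

With a threshold of 0, a single read-only disk stops the datanode so the problem surfaces immediately. A larger value keeps the node serving blocks, at the cost described in this report.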

        Attachments

        1. rev-HDFS-457.patch
          15 kB
          Konstantin Shvachko

              People

              • Assignee: Unassigned
              • Reporter: Koji Noguchi (knoguchi)
              • Votes: 0
              • Watchers: 10
