Hadoop HDFS / HDFS-1158

HDFS-457 increases the chances of losing blocks

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.21.0
    • Fix Version/s: None
    • Component/s: datanode
    • Labels:
      None

      Description

      Whenever we restart a cluster, there's a chance of losing some blocks if more than three datanodes don't come up.
      HDFS-457 increases this chance by keeping the datanodes up even when

      1. /tmp disk goes read-only
      2. /disk0 that is used for storing PID goes read-only
        and probably more.

      In our environment, /tmp and /disk0 are from the same device.
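A read-only directory like the ones listed above can be detected by trying to create a temporary file in it. The sketch below is only an illustration and is not taken from Hadoop or from HDFS-457: the class `DirProbe` and its method `isWritable` are hypothetical names. It relies on `java.io.File.createTempFile`, the same JDK call that appears in the Jetty stack trace in this report, which throws `IOException` when the file system is read-only.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical helper (not part of Hadoop): checks whether a directory accepts new files. */
final class DirProbe {
    private DirProbe() {}

    /**
     * Returns true if a temporary file can be created in dir.
     * Any IOException (read-only file system, missing directory,
     * permission problem) is treated as "not writable".
     */
    static boolean isWritable(File dir) {
        try {
            File probe = File.createTempFile("probe", ".tmp", dir);
            // Best-effort cleanup; the result of delete() is deliberately ignored.
            probe.delete();
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Prints the probe result for each directory given on the command line. */
    public static void main(String[] args) {
        for (String path : args) {
            System.out.println(path + " writable: " + isWritable(new File(path)));
        }
    }
}
```

Run against the directories a datanode depends on (for example the temp directory and the PID directory), such a probe would report the condition before a restart is attempted.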

      Restarting such a datanode then fails with either
      1)

      2010-05-15 05:45:45,575 WARN org.mortbay.log: tmpdir
      java.io.IOException: Read-only file system
              at java.io.UnixFileSystem.createFileExclusively(Native Method)
              at java.io.File.checkAndCreate(File.java:1704)
              at java.io.File.createTempFile(File.java:1792)
              at java.io.File.createTempFile(File.java:1828)
              at org.mortbay.jetty.webapp.WebAppContext.getTempDirectory(WebAppContext.java:745)
      

      or
      2)

      hadoop-daemon.sh: line 117: /disk/0/hadoop-datanode....com.out: Read-only file system
      hadoop-daemon.sh: line 118: /disk/0/hadoop-datanode.pid: Read-only file system
      

      I can recover the missing blocks but it takes some time.

      Also, we lose track of block movements, because the log directory can go read-only as well while the datanode keeps running.

      For the 0.21 release, can we revert HDFS-457 or make it configurable?
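As a rough illustration of what "configurable" could mean here, the sketch below models a threshold on the number of failed directories. It is an assumption made for illustration, not the HDFS-457 code or any actual Hadoop setting: the class `VolumeFailurePolicy` and the parameter `toleratedFailures` are hypothetical names.

```java
/** Hypothetical policy (not Hadoop code): decides whether a datanode should stop. */
final class VolumeFailurePolicy {
    // Number of failed directories the node may have and still keep running.
    private final int toleratedFailures;

    VolumeFailurePolicy(int toleratedFailures) {
        if (toleratedFailures < 0) {
            throw new IllegalArgumentException("toleratedFailures must be >= 0");
        }
        this.toleratedFailures = toleratedFailures;
    }

    /** Returns true when the number of failed directories exceeds the tolerance. */
    boolean shouldShutDown(int failedDirs) {
        return failedDirs > toleratedFailures;
    }
}
```

Under this sketch, a tolerance of 0 stops the datanode on the first failed directory, which is roughly the behaviour a revert would restore, while a larger value keeps the node running through that many failures.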

      1. rev-HDFS-457.patch
        15 kB
        Konstantin Shvachko


            People

            • Assignee:
              Unassigned
            • Reporter:
              Koji Noguchi
            • Votes:
              0
            • Watchers:
              10
