Hadoop Common / HADOOP-2540

Empty blocks make fsck report corrupt, even when it isn't


Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.15.1
    • Fix Version/s: 0.15.3
    • Component/s: None
    • Labels: None

    Description

      If the name node crashes after blocks have been allocated but before their content has been uploaded, fsck will report the zero-sized files as corrupt upon restart:

      /user/rajive/rand0/_task_200712121358_0001_m_000808_0/part-00808: MISSING 1 blocks of total size 0 B

      ... even though all blocks are accounted for:

      Status: CORRUPT
      Total size: 2932802658847 B
      Total blocks: 26603 (avg. block size 110243305 B)
      Total dirs: 419
      Total files: 5031
      Over-replicated blocks: 197 (0.740518 %)
      Under-replicated blocks: 0 (0.0 %)
      Target replication factor: 3
      Real replication factor: 3.0074053

      The filesystem under path '/' is CORRUPT

      In UFS and related filesystems, such files would be moved into lost+found after an fsck and the filesystem would return to normal. It would be great if HDFS could do something similar. Perhaps once all of the nodes listed in the name node's 'includes' file have reported in, HDFS could automatically run an fsck and move these not-necessarily-broken files into something like lost+found.

      Files that are actually missing blocks, however, should not be touched.
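      For reference, HDFS fsck exposes options along these lines for operator-driven triage; a minimal sketch of how such files might be handled from the command line (the path `/` is just an example target):

      ```shell
      # Report filesystem health; zero-sized files whose blocks were
      # allocated but never written show up as MISSING/CORRUPT.
      hadoop fsck / -files -blocks

      # Move files with missing blocks into /lost+found for later
      # inspection (analogous to the UFS lost+found behavior above).
      hadoop fsck / -move

      # Alternatively, delete the corrupted entries outright.
      hadoop fsck / -delete
      ```

      Note that both -move and -delete act on any file fsck considers corrupt, so they do not by themselves distinguish the never-written empty blocks described here from files that genuinely lost data.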

      Attachments

        1. recoverLastBlock.patch
          7 kB
          Dhruba Borthakur
        2. recoverLastBlock2.patch
          8 kB
          Dhruba Borthakur


            People

              Assignee: Dhruba Borthakur (dhruba)
              Reporter: Allen Wittenauer (aw)
              Votes: 0
              Watchers: 0
