  1. Hadoop HDFS
  2. HDFS-5728

[Diskfull] Block recovery will fail if the metafile does not have crc for all chunks of the block

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.23.10, 2.2.0
    • Fix Version/s: 0.23.11, 2.3.0
    • Component/s: datanode
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      1. A client (an HBase regionserver) opened a stream to write its WAL to HDFS. This is not a one-time upload; data is written slowly over time.
      2. One of the DataNodes ran out of disk space (other data had filled up its disks).
      3. Unfortunately, the block was being written only to this DataNode in the cluster, so the client write also failed.
      4. After some time, disk space was freed and all processes were restarted.
      5. HMaster then tried to recover the file by calling recoverLease.
      At this point, recovery failed, reporting a file length mismatch.

      On inspection:
      Actual block file length: 62484480
      Calculated block length: 62455808

      This happened because the metafile had CRCs for only 62455808 bytes, so 62455808 was taken as the block length.

      Recovery kept failing no matter how many times it was retried.
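The mismatch above can be reproduced with simple arithmetic. The following is a minimal sketch, not the DataNode code: it assumes a 7-byte metafile header, one 4-byte CRC per 512-byte chunk, and a hypothetical metafile size of 487943 bytes chosen so that the CRCs cover exactly the 62455808 bytes reported above. The class and method names are illustrative only.

```java
// Sketch only: the constants below are assumed defaults, not values read from a real metafile.
final class MetaLengthSketch {
    static final int META_HEADER_BYTES = 7;     // assumed: 2-byte version + 1-byte checksum type + 4-byte bytesPerChecksum
    static final int CHECKSUM_BYTES = 4;        // assumed: one CRC32 value per chunk
    static final int BYTES_PER_CHECKSUM = 512;  // assumed chunk size

    /** Bytes of block data that are covered by CRCs in a metafile of the given size. */
    static long validLength(long blockFileLen, long metaFileLen) {
        if (metaFileLen < META_HEADER_BYTES) {
            return 0; // no complete header, so nothing can be verified
        }
        long chunks = (metaFileLen - META_HEADER_BYTES) / CHECKSUM_BYTES;
        // The last chunk may be partial, so never report more than the block file holds.
        return Math.min(blockFileLen, chunks * BYTES_PER_CHECKSUM);
    }

    /** Bytes at the end of the block file that have no CRC in the metafile. */
    static long unverifiedTail(long blockFileLen, long metaFileLen) {
        return Math.max(0, blockFileLen - validLength(blockFileLen, metaFileLen));
    }

    public static void main(String[] args) {
        long blockFileLen = 62484480L;           // actual block file length from the report
        long metaFileLen = 7 + 121984L * 4;      // hypothetical metafile size (487943 bytes)
        System.out.println("calculated block length: " + validLength(blockFileLen, metaFileLen));
        System.out.println("bytes without CRC:       " + unverifiedTail(blockFileLen, metaFileLen));
    }
}
```

Under these assumptions the block file carries 28672 bytes (56 chunks) beyond what the metafile can vouch for. One plausible way to make the two lengths agree is to treat only the checksummed prefix as valid and discard the unverified tail when the replica is loaded; whether that matches the attached patches should be checked against the patches themselves.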

      Attachments

        1. HDFS-5728.branch-0.23.patch
          6 kB
          Kihwal Lee
        2. HDFS-5728.patch
          6 kB
          Vinayakumar B
        3. HDFS-5728.patch
          6 kB
          Vinayakumar B
        4. HDFS-5728.patch
          9 kB
          Vinayakumar B

          People

            Assignee: Vinayakumar B
            Reporter: Vinayakumar B
            Votes: 0
            Watchers: 8
