Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-1103

Replica recovery doesn't distinguish between flushed-but-corrupted last chunk and unflushed last chunk

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Blocker
    • Resolution: Invalid
    • 0.21.0, 0.22.0, 0.23.0, 2.0.0-alpha
    • None
    • datanode
    • None

    Description

      When the DN creates a replica under recovery, it calls validateIntegrity, which truncates the last checksum chunk off of a replica if it is found to be invalid. Then when the block recovery process happens, this shortened block wins over a longer replica from another node where there was no corruption. Thus, if just one of the DNs has an invalid last checksum chunk, data that has been sync()ed to other datanodes can be lost.

      Attachments

        1. hdfs-1103-test.txt
          15 kB
          Todd Lipcon

        Issue Links

          Activity

            People

              Unassigned Unassigned
              tlipcon Todd Lipcon
              Votes:
              1 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: