Hadoop HDFS / HDFS-11617

Datanode should delete the block from rbw directory when it finds duplicate in finalized directory.


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.7.3
    • Fix Version/s: None
    • Component/s: datanode
    • Labels: None

    Description

      We recently had a power failure and hit HDFS-5042.
      There were missing blocks, but the datanode still had a copy of the block (and its meta file) in the rbw directory.
      I manually copied the block and meta file to the finalized directory and restarted the datanode.
      After the restart, however, the block was deleted from the finalized directory.
      I think the datanode tried to resolve the duplicate replicas and, in the process, deleted the replica in the finalized directory.
      In my opinion, when the datanode has to choose between an rbw replica and a finalized replica (assuming size and genstamp are the same), it should delete the rbw replica, not the finalized one.
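
      The proposed preference can be sketched roughly as below. This is only an illustration, not the datanode's actual code: the class, the Replica type, and the method name are all hypothetical. The rule that a higher genstamp or a larger size wins when those values differ is also an assumption added for completeness; this report only covers the case where both are equal.

```java
// Hypothetical sketch of the proposed tie-break; not Hadoop's real classes.
final class DuplicateReplicaResolver {

    // Only the two states discussed in this report.
    enum ReplicaState { FINALIZED, RBW }

    // Minimal stand-in for a replica found on disk (hypothetical type).
    static final class Replica {
        final long blockId;
        final long genStamp;
        final long numBytes;
        final ReplicaState state;

        Replica(long blockId, long genStamp, long numBytes, ReplicaState state) {
            this.blockId = blockId;
            this.genStamp = genStamp;
            this.numBytes = numBytes;
            this.state = state;
        }
    }

    // Returns the replica that should be deleted; the other one is kept.
    static Replica selectReplicaToDelete(Replica a, Replica b) {
        if (a.blockId != b.blockId) {
            throw new IllegalArgumentException("replicas belong to different blocks");
        }
        // Assumption: a replica with an older genstamp is stale, so delete it.
        if (a.genStamp != b.genStamp) {
            return a.genStamp < b.genStamp ? a : b;
        }
        // Assumption: with equal genstamps, the shorter replica is deleted.
        if (a.numBytes != b.numBytes) {
            return a.numBytes < b.numBytes ? a : b;
        }
        // Proposal from this report: same genstamp and size, so keep the
        // finalized replica and delete the rbw one.
        if (a.state != b.state) {
            return a.state == ReplicaState.RBW ? a : b;
        }
        // Fully identical replicas: the choice is arbitrary, delete the second.
        return b;
    }
}
```

      With this rule, a finalized and an rbw replica of the same block with equal genstamp and size always resolve to deleting the rbw copy, regardless of the order in which the two are discovered.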


          People

            Assignee: Unassigned
            Reporter: Rushabh Shah (shahrs87)
            Votes: 0
            Watchers: 6
