  1. Hadoop HDFS
  2. HDFS-11617

Datanode should delete the block from rbw directory when it finds duplicate in finalized directory.

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.7.3
    • Fix Version/s: None
    • Component/s: datanode
    • Labels: None
    • Target Version/s:

      Description

      We recently had a power failure and hit HDFS-5042.
      There were missing blocks, but the datanode still had a copy of the block (and its meta file) in the rbw directory.
      I manually copied the block and meta file to the finalized directory and restarted the datanode.
      After the restart, however, the block had been deleted from the finalized directory.
      I think the datanode tried to resolve the duplicate replicas and, in the process, deleted the replica from the finalized directory.
      In my opinion, if we have to choose between an rbw replica and a finalized replica (assuming the size and genstamp are the same), we should delete the rbw replica, not the finalized one.
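
      The proposed tie-break can be sketched roughly as follows. This is a minimal, self-contained illustration, not the datanode's actual code: `ReplicaChooser`, `Replica`, `State`, and `selectReplicaToDelete` are hypothetical names introduced here. The ordering on genstamp and size (drop the older or shorter replica) is an assumption added for completeness, since this issue only covers the case where both are equal.

```java
// Hypothetical sketch of the proposed duplicate resolution.
// These types are illustrative stand-ins, NOT Hadoop's own classes.
final class ReplicaChooser {

    /** Directory in which a replica was found on disk. */
    enum State { FINALIZED, RBW }

    /** Minimal stand-in for an on-disk replica of a block. */
    static final class Replica {
        final State state;
        final long genStamp;
        final long numBytes;

        Replica(State state, long genStamp, long numBytes) {
            this.state = state;
            this.genStamp = genStamp;
            this.numBytes = numBytes;
        }
    }

    /** Returns whichever of two duplicate replicas should be deleted. */
    static Replica selectReplicaToDelete(Replica first, Replica second) {
        if (first.genStamp != second.genStamp) {
            // Assumption: the replica with the lower genstamp is stale.
            return first.genStamp < second.genStamp ? first : second;
        }
        if (first.numBytes != second.numBytes) {
            // Assumption: the shorter replica is the less complete one.
            return first.numBytes < second.numBytes ? first : second;
        }
        if (first.state != second.state) {
            // Rule proposed in this issue: with equal genstamp and size,
            // delete the rbw replica and keep the finalized one.
            return first.state == State.RBW ? first : second;
        }
        // Same state, genstamp, and size: arbitrary choice, drop the second.
        return second;
    }

    public static void main(String[] args) {
        Replica finalized = new Replica(State.FINALIZED, 1001L, 4096L);
        Replica rbw = new Replica(State.RBW, 1001L, 4096L);
        // Argument order must not matter for the rbw/finalized case.
        System.out.println(selectReplicaToDelete(finalized, rbw).state); // prints RBW
        System.out.println(selectReplicaToDelete(rbw, finalized).state); // prints RBW
    }
}
```

      The point of the sketch is that the result does not depend on which replica the datanode happens to discover first, which is what appears to have gone wrong in the scenario described above.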

            People

            • Assignee: Unassigned
            • Reporter: shahrs87 Rushabh Shah
            • Votes: 0
            • Watchers: 6
