Description
Recently we had a power failure event and hit HDFS-5042.
There were missing blocks, but the datanode had a copy of the block (and its meta file) in the rbw directory.
I manually copied the block and meta file to the finalized directory and restarted the datanode.
After the restart, however, the block had been deleted from the finalized directory.
I think the datanode tried to resolve the duplicate replicas and, in the process, deleted the replica in the finalized directory.
In my opinion, if we have to choose between an rbw replica and a finalized replica (assuming size and genstamp are the same), we should delete the rbw replica, not the finalized replica.
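
To make the proposal concrete, here is a minimal sketch of the tie-break I have in mind. It is not the actual datanode code: the class, enum, and method names (DuplicateReplicaResolver, ReplicaState, selectReplicaToKeep) are hypothetical. Preferring the higher genstamp and then the longer length before looking at the state is also my assumption. The only rule proposed above is the last one, where a finalized replica wins over an rbw replica when genstamp and size are equal.

```java
// Hypothetical sketch only; names do not correspond to real HDFS classes.
class DuplicateReplicaResolver {

    enum ReplicaState { FINALIZED, RBW }

    static final class Replica {
        final long blockId;
        final long genStamp;
        final long numBytes;
        final ReplicaState state;

        Replica(long blockId, long genStamp, long numBytes, ReplicaState state) {
            this.blockId = blockId;
            this.genStamp = genStamp;
            this.numBytes = numBytes;
            this.state = state;
        }
    }

    /** Returns the replica to keep; the caller would delete the other one. */
    static Replica selectReplicaToKeep(Replica a, Replica b) {
        // Assumption: a newer generation stamp always wins.
        if (a.genStamp != b.genStamp) {
            return a.genStamp > b.genStamp ? a : b;
        }
        // Assumption: with equal genstamps, the longer replica wins.
        if (a.numBytes != b.numBytes) {
            return a.numBytes > b.numBytes ? a : b;
        }
        // Proposed rule: with equal genstamp and size, keep the finalized
        // replica and let the rbw replica be deleted.
        if (a.state != b.state) {
            return a.state == ReplicaState.FINALIZED ? a : b;
        }
        // Identical on every criterion; keep the first one.
        return a;
    }

    public static void main(String[] args) {
        Replica rbw = new Replica(1L, 100L, 4096L, ReplicaState.RBW);
        Replica finalized = new Replica(1L, 100L, 4096L, ReplicaState.FINALIZED);
        System.out.println(selectReplicaToKeep(rbw, finalized).state);
    }
}
```

With this ordering, the result does not depend on which directory is scanned first, so a block copied by hand into the finalized directory would survive a restart.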