Details
Type: Sub-task
Status: Open
Priority: Major
Resolution: Unresolved
Description
Currently, in the retry path, once a datanode goes down, the data from the last acknowledged length onward is retried on a new pipeline with a new set of datanodes. Secondly, once a block write fails midway, a new block is allocated for the remaining unacknowledged data.
In HDFS, when a datanode fails, a new datanode is recruited into the pipeline and the packets are written only to the replacement datanode. The same block continues to be written and no new block is allocated. As a result, the key/file metadata remains the same in HDFS, whereas in Ozone the extra block allocations may bloat the OM metadata.
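To make the metadata difference concrete, here is a rough toy model (not Ozone or HDFS code). The function names, the byte offsets, and the block size are all hypothetical and chosen only for illustration. It assumes each failure is identified by the acknowledged length at the time it occurs, and it only counts how many block entries the key would end up with under each strategy.

```python
def blocks_with_reallocation(total_len, block_size, failure_offsets):
    """Block lengths when every write failure closes the current block at the
    acknowledged length and a new block is allocated for the remaining data.
    (Toy model of the behavior described for the current retry path.)"""
    # Keep only failures that fall strictly inside the key, in order.
    cuts = sorted({o for o in failure_offsets if 0 < o < total_len})
    blocks = []
    start = 0
    for cut in cuts + [total_len]:
        # Fill blocks up to block_size, but never past the next failure point.
        while start < cut:
            length = min(block_size, cut - start)
            blocks.append(length)
            start += length
    return blocks


def blocks_with_pipeline_recovery(total_len, block_size, failure_offsets):
    """Block lengths when a failure only swaps a datanode in the pipeline and
    the same block keeps being written (toy model of the HDFS-style behavior).
    Failures do not affect block boundaries, so they are ignored here."""
    return blocks_with_reallocation(total_len, block_size, [])


if __name__ == "__main__":
    # Hypothetical key of 1000 bytes, 256-byte blocks, failures at 100 and 300.
    print(blocks_with_reallocation(1000, 256, [100, 300]))
    print(blocks_with_pipeline_recovery(1000, 256, [100, 300]))
```

In this model every failure that lands inside a block adds one extra, shorter block entry to the key, while the recovery-in-place variant keeps the block list independent of the number of failures.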
This Jira is to discuss what optimizations, if any, are needed in the Ozone retry path to improve performance.