Description
We need to garbage-collect deleted blocks from the Datanodes. There are two cases where we will have orphaned blocks. One is like classical HDFS, where someone deletes a key and we need to delete the corresponding blocks.
The other case is when someone overwrites a key. An overwrite can be treated as a delete followed by a new put, which means that the older blocks need to be GC-ed at some point in time.
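The two cases above can be sketched as follows. This is only an illustration under stated assumptions: `KeyStore`, `pending_deletion`, and the block IDs are hypothetical names, not the KSM API, and the real design may queue or batch deletions differently.

```python
from dataclasses import dataclass, field


@dataclass
class KeyStore:
    """Toy key-to-blocks metadata map (hypothetical, not the KSM API)."""
    keys: dict = field(default_factory=dict)              # key name -> list of block IDs
    pending_deletion: list = field(default_factory=list)  # block IDs awaiting GC

    def put(self, key, blocks):
        # An overwrite is treated as a delete of the old value plus a new put.
        if key in self.keys:
            self.delete(key)
        self.keys[key] = list(blocks)

    def delete(self, key):
        # Only the metadata is removed here; the blocks are queued for later GC.
        self.pending_deletion.extend(self.keys.pop(key))


store = KeyStore()
store.put("k1", ["b1", "b2"])
store.put("k1", ["b3"])   # overwrite orphans b1 and b2
store.delete("k1")        # delete orphans b3
print(store.pending_deletion)
```

Both paths end up in the same pending list, so a single background GC mechanism could serve deletes and overwrites alike.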
A couple of JIRAs have discussed this in one form or another, so this JIRA consolidates those discussions.
HDFS-11796 - needs this issue fixed for some tests to pass.
HDFS-11780 - changed the old overwrite behavior so that overwriting a key is not supported for the time being.
HDFS-11920 - runs into this issue again when a user tries to put an existing key.
HDFS-11781 - the delete key API in KSM only deletes the metadata and relies on GC for the Datanodes.
When we solve this issue, we should also consider two more aspects.
One, we support versioning in buckets, and tracking which blocks are really orphaned is something that KSM will do. So delete and overwrite need to decide at some point how to handle versioning of buckets.
Two, if a key exists in a closed container, then it is immutable, hence the strategy for removing the key might be more complex than just talking to an open container.
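To illustrate the versioning aspect, here is a minimal sketch of deciding which blocks are really orphaned. It rests on assumptions not stated in this JIRA: that each key version maps to a set of block IDs, that versions may share blocks, and that the bucket keeps an explicit set of retained versions. The function name `orphaned_blocks` is hypothetical.

```python
def orphaned_blocks(versions, retained):
    """Return blocks referenced only by versions that are no longer retained.

    versions: dict mapping version number -> set of block IDs (assumed layout)
    retained: set of version numbers the bucket still keeps
    """
    # Blocks still referenced by at least one retained version.
    live = set()
    for v in retained:
        live |= versions.get(v, set())

    # Blocks referenced by dropped versions.
    dead = set()
    for v, blocks in versions.items():
        if v not in retained:
            dead |= blocks

    # A block is orphaned only if no retained version references it.
    return dead - live


versions = {1: {"b1", "b2"}, 2: {"b2", "b3"}}
print(sorted(orphaned_blocks(versions, retained={2})))     # only version 2 kept
print(sorted(orphaned_blocks(versions, retained={1, 2})))  # versioned bucket keeps both
```

In this sketch, a bucket that retains every version never produces orphans on overwrite, while a bucket without versioning orphans every block that the newest version does not reference.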
cc : xyao, cheersyang, vagarychen, msingh, yuanbo, szetszwo, nandakumar131
Attachments
Issue Links
- incorporates
  - HDFS-12362 Ozone: write deleted block to RAFT log for consensus on datanodes (Open)
  - HDFS-12195 Ozone: DeleteKey-1: KSM replies delete key request asynchronously (Resolved)
  - HDFS-12196 Ozone: DeleteKey-2: Implement block deleting service to delete stale blocks at background (Resolved)
  - HDFS-12235 Ozone: DeleteKey-3: KSM SCM block deletion message and ACK interactions (Resolved)
  - HDFS-12282 Ozone: DeleteKey-4: Block delete between SCM and DN (Resolved)
  - HDFS-12283 Ozone: DeleteKey-5: Implement SCM DeletedBlockLog (Resolved)
  - HDFS-12370 Ozone: Implement TopN container choosing policy for BlockDeletionService (Resolved)
  - HDFS-12443 Ozone: Improve SCM block deletion throttling algorithm (Resolved)