Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
Datanodes in Ozone report negative container sizes for the usedBytes and block count metrics. This happens when the Ozone Manager (OM) sends duplicate block deletion requests to the Storage Container Manager (SCM): if processing of the original request is delayed, OM may mistakenly send it again. When the duplicate request reaches the datanode, the datanode attempts to delete blocks that have already been deleted but still decrements the metrics, which drives them below zero. The proposed solution is to change the deletion process in the datanode so that it tracks and ignores duplicate block deletion requests, ensuring the metrics are not updated a second time.
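The proposed fix can be illustrated with a minimal sketch. It is not Ozone's actual datanode code: the class `ContainerDeletionSketch`, its methods, and the use of a transaction ID to recognise duplicates are all hypothetical names and assumptions chosen for illustration. It shows two guards: a repeated request is skipped entirely, and the metrics are only decremented for blocks that were actually present.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of duplicate-safe block deletion; not Ozone's real classes. */
public class ContainerDeletionSketch {
    // blockId -> size in bytes for blocks currently stored in this container
    private final Map<Long, Long> blockSizes = new HashMap<>();
    // IDs of deletion requests already applied (assumed to identify a request uniquely)
    private final Set<Long> processedTxIds = new HashSet<>();
    private long usedBytes;
    private long blockCount;

    public void putBlock(long blockId, long sizeBytes) {
        Long previous = blockSizes.put(blockId, sizeBytes);
        if (previous == null) {
            blockCount++;
        } else {
            usedBytes -= previous; // overwrite: drop the old size first
        }
        usedBytes += sizeBytes;
    }

    /** Returns false, and changes nothing, if this request was already applied. */
    public boolean deleteBlocks(long txId, List<Long> blockIds) {
        if (!processedTxIds.add(txId)) {
            return false; // duplicate request: ignore it
        }
        for (long blockId : blockIds) {
            Long size = blockSizes.remove(blockId);
            if (size != null) { // only adjust metrics for blocks that existed
                usedBytes -= size;
                blockCount--;
            }
        }
        return true;
    }

    public long getUsedBytes() { return usedBytes; }

    public long getBlockCount() { return blockCount; }

    public static void main(String[] args) {
        ContainerDeletionSketch container = new ContainerDeletionSketch();
        container.putBlock(1L, 60L);
        container.putBlock(2L, 40L);
        System.out.println(container.deleteBlocks(7L, List.of(1L, 2L))); // first request
        System.out.println(container.deleteBlocks(7L, List.of(1L, 2L))); // duplicate
        System.out.println(container.getUsedBytes() + " " + container.getBlockCount());
    }
}
```

In this sketch the set of processed IDs grows without bound; a real implementation would need to persist and prune it, which is outside the scope of the example.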
Recon reported the following negative-sized containers:
sh-4.2$ ozone admin container list | jq '. | {state: .state, containerID: .containerID, usedBytes: .usedBytes}'
{
  "state": "DELETED",
  "containerID": 1,
  "usedBytes": -100000000
}
{
  "state": "DELETED",
  "containerID": 2,
  "usedBytes": -95420416
}
{
  "state": "DELETED",
  "containerID": 3,
  "usedBytes": -97517568
}