Details
-
Sub-task
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
None
-
None
Description
https://issues.apache.org/jira/browse/HDDS-8876 fixed the flakiness in testInstallIncrementalSnapshotWithFailure by commenting out the part that checks the metrics.
By uncommenting this part, the test has a 5% chance of failure. The metrics depend on picking up that
- there is a corruption on the candidate dir
- Ratis returns a result of SNAPSHOT_UNAVAILABLE
- Candidate dir is getting cleaned up
- The follower is forced to take another full snapshot
During the failed runs, the corruption isn't picked up and the Ratis InstallSnapshotResult is SUCCESS. Therefore, the snapshot installation isn't repeated.
Attachments
Issue Links
- links to