Details
Type: Sub-task
Status: Resolved
Priority: Critical
Resolution: Fixed
Description
EC reconstruction (through a Write Chunk operation) recreates the container as an open container with replica index 0 when StaleRecoveringContainerScrubbingService deletes the recovering container. The result is an invalid container with replica index 0. This could cause an SCM failure when the container is reported in a heartbeat. It could also leave a partially reconstructed container when a new block is written at the same time as the recovering container is being deleted.
Marking the recovering container as unhealthy, instead of deleting it, should fix the issue. A failure to delete the unhealthy container should also be handled, so that the Reconstruction Coordinator cleans up the stale container.
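
The proposed behaviour can be illustrated with a minimal sketch. This is not the Ozone code: the class RecoveringContainerSketch and every method in it (createRecovering, scrubStale, writeChunk, deleteIfUnhealthy, stateOf) are hypothetical names invented for illustration. It assumes two things: the scrubber only marks stale recovering containers as unhealthy, and the write path rejects a chunk when the container is missing or not recovering, rather than creating a new container implicitly.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the proposed fix; none of these names are Ozone APIs. */
public class RecoveringContainerSketch {

    enum State { RECOVERING, UNHEALTHY }

    static final class Container {
        final int replicaIndex;      // non-zero EC replica index set at creation
        final long expiresAtMillis;  // after this time the scrubber treats it as stale
        State state = State.RECOVERING;

        Container(int replicaIndex, long expiresAtMillis) {
            this.replicaIndex = replicaIndex;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    static final class ContainerSet {
        private final Map<Long, Container> containers = new HashMap<>();

        synchronized void createRecovering(long id, int replicaIndex, long expiresAtMillis) {
            containers.put(id, new Container(replicaIndex, expiresAtMillis));
        }

        /** Scrubber: mark stale recovering containers UNHEALTHY instead of deleting them. */
        synchronized int scrubStale(long nowMillis) {
            int marked = 0;
            for (Container c : containers.values()) {
                if (c.state == State.RECOVERING && c.expiresAtMillis <= nowMillis) {
                    c.state = State.UNHEALTHY;
                    marked++;
                }
            }
            return marked;
        }

        /** Write path: accept only if the container exists and is RECOVERING; never create one. */
        synchronized boolean writeChunk(long id) {
            Container c = containers.get(id);
            return c != null && c.state == State.RECOVERING;
        }

        /** Coordinator cleanup: remove the container only if it is UNHEALTHY. */
        synchronized boolean deleteIfUnhealthy(long id) {
            Container c = containers.get(id);
            if (c == null || c.state != State.UNHEALTHY) {
                return false; // caller is expected to handle and retry the failed delete
            }
            containers.remove(id);
            return true;
        }

        /** Returns the current state, or null if the container does not exist. */
        synchronized State stateOf(long id) {
            Container c = containers.get(id);
            return c == null ? null : c.state;
        }
    }

    public static void main(String[] args) {
        ContainerSet set = new ContainerSet();
        set.createRecovering(1L, 3, 100L);
        System.out.println("write before scrub: " + set.writeChunk(1L));
        set.scrubStale(200L);
        System.out.println("state after scrub: " + set.stateOf(1L));
        System.out.println("write after scrub: " + set.writeChunk(1L));
        System.out.println("cleanup: " + set.deleteIfUnhealthy(1L));
    }
}
```

In this sketch a late Write Chunk after the scrubber has run is rejected rather than recreating the container, and the unhealthy container remains visible until the coordinator removes it.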