Details
Bug
Status: Open
Low
Resolution: Unresolved
None
Low
Description
When a repair starts on a node, the information is written to system_distributed.repair_history. If the node running it is the parent (the one holding the repair session) and it dies, the entries for the repair that was in progress remain stuck in the "STARTED" state and are never updated.
To resolve this, the node should check on startup whether it was a parent before the crash/restart. If it was, and there are matching entries in the table (and in system_distributed.parent_repair_history as well), it should mark those entries as FAILED.
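The proposed startup check could look roughly like the sketch below. It is an in-memory illustration only, not the actual fix: the `RepairEntry` type, the `fail_orphaned_repairs` function, and the `coordinator` field used to identify the parent node are hypothetical names chosen for the example, and real code would read and update the rows in the tables named above.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class RepairEntry:
    """Hypothetical, simplified view of one repair history row."""
    id: str
    coordinator: str  # assumed: address of the node that held the repair session
    status: str       # "STARTED" or "FAILED" as named in this ticket


def fail_orphaned_repairs(entries: List[RepairEntry], local_node: str) -> List[RepairEntry]:
    """Return the entries with this node's stuck STARTED rows marked FAILED.

    A row counts as orphaned when it was coordinated by the node that is
    now starting up and is still in the STARTED state. Any repair this node
    was running cannot have survived the restart, so the row is stale.
    """
    return [
        replace(e, status="FAILED")
        if e.coordinator == local_node and e.status == "STARTED"
        else e
        for e in entries
    ]


# Usage: run once during startup, before new repairs are accepted.
history = [
    RepairEntry("r1", "10.0.0.1", "STARTED"),  # ours, stuck
    RepairEntry("r2", "10.0.0.2", "STARTED"),  # another node's, left alone
    RepairEntry("r3", "10.0.0.1", "FAILED"),   # ours, already finished
]
cleaned = fail_orphaned_repairs(history, local_node="10.0.0.1")
```

Entries coordinated by other nodes are left untouched, since those nodes may still be running their repairs.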