Details
-
Task
-
Status: Resolved
-
Normal
-
Resolution: Fixed
-
None
-
Operability
-
Normal
-
All
-
None
-
Description
Currently, when a node is being decommissioned and if any failure happens, then an exception is thrown back to the caller.
But Cassandra's decommission takes considerable time ranging from minutes to hours to days. There are various scenarios in that the caller may need to probe the status again:
- The caller times out
- It is not possible to keep the caller hanging for such a long time
And If the caller does not know what happened internally, then it cannot retry, etc., leading to other issues.
So, in this ticket, I am going to add a new nodetool/JMX command that can be invoked by the caller anytime, and it will return the correct status.
It might look like a smaller change, but when we need to operate Cassandra at scale in a large-scale fleet, then this becomes a bottleneck and require constant operator intervention.
Attachments
Issue Links
- relates to
-
CASSANDRA-18749 Expose bootstrap failure state via JMX
- Resolved
- links to