Details
Type: Improvement
Status: Resolved
Priority: Low
Resolution: Fixed
Description
We're missing metrics for repair, especially for errors. From what I have observed, a repair exception is caught by the UncaughtExceptionHandler set in CassandraDaemon and is counted under StorageMetrics.exceptions, together with every other uncaught exception. This is one example:
ERROR [AntiEntropyStage:1] 2017-03-27 18:17:08,385 CassandraDaemon.java:207 - Exception in thread Thread[AntiEntropyStage:1,5,main]
java.lang.RuntimeException: Parent repair session with id = 8c85d260-1319-11e7-82a2-25090a89015f has failed.
    at org.apache.cassandra.service.ActiveRepairService.getParentRepairSession(ActiveRepairService.java:377) ~[apache-cassandra-3.0.10.jar:3.0.10]
    at org.apache.cassandra.service.ActiveRepairService.removeParentRepairSession(ActiveRepairService.java:392) ~[apache-cassandra-3.0.10.jar:3.0.10]
    at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:172) ~[apache-cassandra-3.0.10.jar:3.0.10]
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) ~[apache-cassandra-3.0.10.jar:3.0.10]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_121]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_121]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_121]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
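To illustrate the kind of metric being asked for, here is a minimal sketch of an uncaught-exception handler that keeps a separate count for failures on repair threads. It is not Cassandra's code and not necessarily how the ticket was resolved: the class RepairExceptionCounter, its two counters, and the idea of recognising repair threads by the "AntiEntropyStage" name prefix (taken from the log above) are all assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not Cassandra's implementation.
class RepairExceptionCounter implements Thread.UncaughtExceptionHandler {
    // Stand-in for the general counter (StorageMetrics.exceptions in the report).
    final AtomicLong allExceptions = new AtomicLong();
    // Assumed repair-specific counter; the name is invented for this example.
    final AtomicLong repairExceptions = new AtomicLong();

    // Assumption: repair work runs on threads named "AntiEntropyStage:N",
    // as in the log line quoted in the description.
    static boolean isRepairThread(Thread t) {
        return t.getName().startsWith("AntiEntropyStage");
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
        // Every uncaught exception is still counted in the general metric.
        allExceptions.incrementAndGet();
        // Repair failures are additionally counted on their own.
        if (isRepairThread(t)) {
            repairExceptions.incrementAndGet();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        RepairExceptionCounter handler = new RepairExceptionCounter();
        Thread repair = new Thread(() -> {
            throw new RuntimeException("Parent repair session has failed.");
        }, "AntiEntropyStage:1");
        repair.setUncaughtExceptionHandler(handler);
        repair.start();
        repair.join(); // the handler has run by the time the thread terminates
        System.out.println("all=" + handler.allExceptions.get()
                + " repair=" + handler.repairExceptions.get());
    }
}
```

Matching on the thread name is only a convenient stand-in here; counting at the point where the repair session fails would be more precise, at the cost of touching the repair code paths.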