Details
- Type: Bug
- Status: Resolved
- Priority: Normal
- Resolution: Fixed
- Bug Category: Degradation
- Severity: Normal
- Complexity: Normal
- Discovered By: User Report
- Platform: All
- Impacts: None
Description
Hi
We have been hitting an issue during repairs (though it is more likely a compaction issue) since we upgraded from 3.11.1 to 3.11.10.
We are using Reaper, but the issue does not seem to come from it (according to adejanovski@hotmail.com). When the problem occurs, repairs driven by Reaper are blocked.
Reaper hangs with the message "All nodes are busy or have too many pending compactions for the remaining candidate segments.", and indeed one node has a large number of pending compaction tasks:
$ nodetool compactionstats
pending tasks: 95
- mt_metrics.metric_32: 95
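To find which node is the stuck one, I just check pending compactions on every node; a minimal sketch, assuming SSH access and placeholder hostnames (node1, node2, node3 are not our real topology):
$ for host in node1 node2 node3; do
    echo "== $host =="
    ssh "$host" nodetool compactionstats | grep "pending tasks"
  done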
Errors in the log are:
WARN [CompactionExecutor:12909] 2021-04-28 08:59:51,241 LeveledCompactionStrategy.java:144 - Could not acquire references for compacting SSTables [BigTableReader(path='/var/lib/cassandra/d ....
WARN [CompactionExecutor:12909] 2021-04-28 09:00:19,484 LeveledCompactionStrategy.java:144 - Could not acquire references for compacting SSTables [BigTableReader(path='/var/lib/cassandra/d ....
WARN [CompactionExecutor:12908] 2021-04-28 09:00:51,241 LeveledCompactionStrategy.java:144 - Could not acquire references for compacting SSTables [BigTableReader(path='/var/lib/cassandra/d ....
WARN [CompactionExecutor:12907] 2021-04-28 08:58:51,097 LeveledCompactionStrategy.java:144 - Could not acquire references for compacting SSTables [BigTableReader(path='/var/lib/cassandra/data/mt_metrics/metric_32-23300de089c311e882a61bd0fd209f48/md-350757-big-Data.db'), BigTableReader(path='/var/lib/cassandra/data/mt_metrics/metric_32-23300de089c311e882a61bd0fd209f48/md-350755-big-Data.db'), BigTableReader(path='/var/lib/cassandra/data/mt_metrics/metric_32-23300de089c311e882a61bd0fd209f48/md-350738-big-Data.db'), BigTableReader(path='/var/lib/cassandra/data/mt_metrics/metric_32-23300de089c311e882a61bd0fd209f48/md-350759-big-Data.db'), BigTableReader(path='/var/lib/cassandra/data/mt_metrics/metric_32-23300de089c311e882a61bd0fd209f48/md-350761-big-Data.db'), BigTableReader(path='/var/lib/cassandra/data/mt_metrics/metric_32-23300de089c311e882a61bd0fd209f48/md-350740-big-Data.db'), BigTableReader(path='/var/lib/cassandra/data/mt_metrics/metric_32-23300de089c311e882a61bd0fd209f48/md-350751-big-Data.db'), BigTableReader(path='/var/lib/cassandra/data/mt_metrics/ ....
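To get an idea of how often the warning fires, I count it per day in the logs (field 3 of each line is the date); a quick sketch, assuming the default system.log location, so adjust the path to your install:
$ grep "Could not acquire references for compacting SSTables" /var/log/cassandra/system.log \
    | awk '{print $3}' | sort | uniq -c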
The error has happened several times over the past few weeks, and so far it has always concerned LCS tables.
a.dejanoski pointed me to https://issues.apache.org/jira/browse/CASSANDRA-15242, but I have no trace of messages like "disk boundaries are out of date for keyspacename.tablename" or "Refreshing disk boundary cache for keyspacename.tablename".
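For the record, this is how I checked for those markers (same assumption about the default log path); neither message appears on the affected nodes:
$ grep -E "disk boundaries are out of date|Refreshing disk boundary cache" /var/log/cassandra/system.log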
The workaround is simple: just restart the node once it is identified (a sketch of the restart sequence is below). The pending compaction tasks then run through fine again.
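A minimal sketch of the restart sequence we use, assuming the node runs under a systemd service named cassandra (adjust to your packaging):
$ nodetool drain                      # flush memtables and stop accepting new writes on this node
$ sudo systemctl restart cassandra    # service name assumed
$ nodetool compactionstats            # pending tasks start draining again after the restart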
We see the issue on 2 of our clusters running 3.11.10.
Has anyone else run into this issue?
Attachments
Issue Links
- is fixed by
  - CASSANDRA-16552 Anticompaction appears to race with Compaction, preventing forward compaction progress after an incremental repair (Resolved)