Description
We started noticing a very large number of open file descriptors (100k+) per node in our cluster, growing at a rate of about 30 per minute. After turning off our JMX exporting server (https://github.com/prometheus/jmx_exporter), which gets queried every 30 seconds, the number of file descriptors remained static.
Digging a bit further, I ran a JMX dump tool over all the Cassandra metrics and tracked the number of file descriptors after each query, narrowing it down to a single metric that causes the file descriptor count to increase:
org.apache.cassandra.metrics:keyspace=tpsv1,name=SnapshotsSize,scope=events_by_engagement_id,type=Table
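Any JMX client that reads this gauge triggers the behavior; jmx-dump is just a convenience. For reference, a minimal Java sketch of the same read (assuming JMX authentication is disabled and the node is reachable on localhost:7199; "Value" is the attribute under which the metrics JMX reporter exposes gauge readings):

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class SnapshotsSizeQuery {
    public static void main(String[] args) throws Exception {
        // 7199 is Cassandra's default JMX port, as used in the repro below;
        // "localhost" is a placeholder for the node under test.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName name = new ObjectName(
                    "org.apache.cassandra.metrics:keyspace=tpsv1,"
                    + "name=SnapshotsSize,scope=events_by_engagement_id,type=Table");
            // Reading the gauge is what makes the server compute the
            // snapshot size, so each read exercises the suspect code path.
            Object value = mbs.getAttribute(name, "Value");
            System.out.println("SnapshotsSize = " + value);
        }
    }
}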
Running a query a few times against this metric shows the file descriptor count increasing after each query:
for _ in {0..3}; do
  java -jar jmx-dump-0.4.2-standalone.jar --port 7199 --dump org.apache.cassandra.metrics:keyspace=tpsv1,name=SnapshotsSize,scope=events_by_engagement_id,type=Table > /dev/null
  sudo lsof -p `pgrep -f CassandraDaemon` | fgrep "DIR" | awk '{a[$(NF)]+=1}END{for(k in a){print k, a[k]}}' | grep "events_by"
done

/data/cassandra/data/tpsv1/events_by_engagement_id-01d8f450a54911e6917ec93f8a91ec71 33176
/data/cassandra/data/tpsv1/events_by_engagement_id-01d8f450a54911e6917ec93f8a91ec71 33177
/data/cassandra/data/tpsv1/events_by_engagement_id-01d8f450a54911e6917ec93f8a91ec71 33178
/data/cassandra/data/tpsv1/events_by_engagement_id-01d8f450a54911e6917ec93f8a91ec71 33179

The count of open descriptors on the table's data directory climbs by exactly one after each read of the metric.
It should be noted that the file descriptors are open on the directory itself, not an actual file.
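An lsof DIR entry that grows by one per metric read is the classic signature of a directory listing that is opened while computing a size but never closed. The sketch below is illustrative only, not Cassandra's actual code: directorySizeLeaky leaks one descriptor on the directory per call, while directorySize releases it via try-with-resources, reproducing (and fixing) the same pattern seen in the repro above.

import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class DirectoryFdLeak {
    // Leaky pattern: newDirectoryStream opens a file descriptor on the
    // directory itself, and it stays open until the stream is closed.
    static long directorySizeLeaky(Path dir) throws IOException {
        long total = 0;
        DirectoryStream<Path> stream = Files.newDirectoryStream(dir); // fd opened on dir
        for (Path p : stream) {
            total += Files.size(p);
        }
        return total; // stream is never closed, so the directory fd leaks
    }

    // Correct pattern: try-with-resources closes the stream and releases the fd.
    static long directorySize(Path dir) throws IOException {
        long total = 0;
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                total += Files.size(p);
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (int i = 0; i < 4; i++) {
            directorySizeLeaky(dir); // one more DIR entry in lsof per call;
                                     // observe with: lsof -p <pid> | grep DIR
        }
    }
}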
Issue Links
- is related to CASSANDRA-11594: Too many open files on directories (Resolved)