Details

Type: Bug
Status: Resolved
Resolution: Information Provided
Priority: Normal
Severity: Normal
Environment:
Fedora 24 / Java 1.8.0_91 / Cassandra 3.0.9
Mac OS X 10.11.6 / Java 1.8.0_102 / Cassandra 3.0.9
Description
I have been running some tests on a monitoring system I work on, and Cassandra is consistently crashing with OutOfMemoryErrors, after which the JVM exits. This is happening in a dev environment with a single node created with ccm.
The monitoring server ingests 4,000 data points every 10 seconds. Every two hours a job runs that fetches all raw data from the past two hours. The raw data is compressed, written to another table, and then deleted. After 3 or 4 runs of the job, Cassandra crashes. Initially I thought the problem was in my application code, but I no longer believe that: I set up the same test environment with Cassandra 3.9, and it has been running for almost 48 hours without error, even though I actually increased the load on the 3.9 environment.
The schema for the raw data which is queried looks like:
CREATE TABLE hawkular_metrics.data (
    tenant_id text,
    type tinyint,
    metric text,
    dpart bigint,
    time timeuuid,
    n_value double,
    tags map<text, text>,
    PRIMARY KEY ((tenant_id, type, metric, dpart), time)
) WITH CLUSTERING ORDER BY (time DESC)
And the schema for the table that is written to:
CREATE TABLE hawkular_metrics.data_compressed (
    tenant_id text,
    type tinyint,
    metric text,
    dpart bigint,
    time timestamp,
    c_value blob,
    tags blob,
    ts_value blob,
    PRIMARY KEY ((tenant_id, type, metric, dpart), time)
) WITH CLUSTERING ORDER BY (time DESC)
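For reference, the two-hour job can be sketched roughly as follows against the schemas above. This is only an illustration of the read/write/delete pattern, not the actual job: the compress() helper, the partition-key arguments, and the omission of the time-range filter are all assumptions.

```java
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import java.nio.ByteBuffer;
import java.util.Date;

// Sketch of the two-hour compression job (hypothetical; compress() is a
// placeholder for the real compression step, and the time-range filter on
// the SELECT is omitted for brevity).
public class CompressionJobSketch {

    static void runJob(Session session, String tenantId, byte type,
                       String metric, long dpart) {
        // Fetch the raw points for this partition (the real job restricts
        // this to the past two hours).
        ResultSet raw = session.execute(
            "SELECT time, n_value FROM hawkular_metrics.data " +
            "WHERE tenant_id = ? AND type = ? AND metric = ? AND dpart = ?",
            tenantId, type, metric, dpart);

        ByteBuffer compressed = compress(raw); // hypothetical helper

        // Write the compressed block to the second table...
        session.execute(
            "INSERT INTO hawkular_metrics.data_compressed " +
            "(tenant_id, type, metric, dpart, time, c_value) " +
            "VALUES (?, ?, ?, ?, ?, ?)",
            tenantId, type, metric, dpart, new Date(), compressed);

        // ...then delete the raw data that was just compressed.
        session.execute(
            "DELETE FROM hawkular_metrics.data " +
            "WHERE tenant_id = ? AND type = ? AND metric = ? AND dpart = ?",
            tenantId, type, metric, dpart);
    }

    static ByteBuffer compress(ResultSet rs) {
        return ByteBuffer.allocate(0); // placeholder
    }
}
```

The SELECT here reads an entire partition, which is why the driver's page size (discussed below in the description) matters for heap usage on the server.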
I am using version 3.0.1 of the DataStax Java driver. Last night I changed the driver's page size from the default to 1000, and so far I have not seen any errors.
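For completeness, the page-size change amounts to configuring the driver's fetch size, either globally on the Cluster or per statement. A minimal sketch (contact point assumed to be the local ccm node):

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.QueryOptions;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public class PageSizeExample {
    public static void main(String[] args) {
        // Global default: page large result sets in chunks of 1000 rows
        // instead of the driver default, reducing how much data the server
        // materializes per page.
        QueryOptions queryOptions = new QueryOptions().setFetchSize(1000);

        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1") // assumed: single ccm node
                .withQueryOptions(queryOptions)
                .build();
        Session session = cluster.connect();

        // Alternatively, set the fetch size on an individual statement:
        SimpleStatement stmt = new SimpleStatement(
                "SELECT * FROM hawkular_metrics.data " +
                "WHERE tenant_id = ? AND type = ? AND metric = ? AND dpart = ?",
                "tenant", (byte) 0, "metric-1", 0L);
        stmt.setFetchSize(1000);
        session.execute(stmt);

        cluster.close();
    }
}
```

Both `QueryOptions.setFetchSize` and `Statement.setFetchSize` are available in driver 3.0.x; the per-statement setting overrides the global one.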
I have attached the log file. I was going to attach one of the heap dumps, but it looks like they are too big.