Details

Type: Bug
Status: Open
Priority: Normal
Resolution: Unresolved
Fix Version/s: 4.0.x, 5.x
Component/s: Cluster/Gossip, Consistency/Coordination
Labels: None

Bug Category: Degradation - Other Exception
Severity: Normal
Complexity: Challenging
Discovered By: User Report
Platform: All
Impacts: None

Description

In a cluster being upgraded from 3.0.x to 4.x, statements that were prepared before the upgrade and select only some of a table's static columns fail with an IllegalStateException when they are coordinated by a 4.x node. The failures persist until the upgrade completes.

Reproduction

Setup (before upgrade)

CREATE KEYSPACE ks1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor':3};
CREATE TABLE ks1.tbl1 (pk1 int,
ck2 int,
s3 int static,
s4 int static,
c5 int,
PRIMARY KEY (pk1, ck2));
INSERT INTO ks1.tbl1 (pk1, ck2, s3, s4, c5) VALUES (1, 2, 3, 4, 5);

Prepared Statement (prepare before upgrade)

SELECT c5, s3 FROM ks1.tbl1 WHERE pk1 = ? AND ck2 = ?;

Exception on 3.0.x nodes (when executing prepared statement after upgrade)

java.lang.IllegalStateException: [s3, s4] is not a subset of [s3]
at org.apache.cassandra.db.Columns$Serializer.encodeBitmap(Columns.java:566)
at org.apache.cassandra.db.Columns$Serializer.serializeSubset(Columns.java:498)
at org.apache.cassandra.db.rows.UnfilteredSerializer.serializeRowBody(UnfilteredSerializer.java:235)
at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:209)
at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:141)
at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:129)
at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:140)
at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:95)
at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:80)
at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:308)
at org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:191)
at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:181)
at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:177)
at org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:48)
at org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:335)
at org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:91)
at org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:77)
at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:93)
at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:44)
at org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:433)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:119)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:834)

Exception on 4.0.x nodes (when executing prepared statement after upgrade)

java.lang.IllegalStateException: [ColumnDefinition{name=s3, type=org.apache.cassandra.db.marshal.IntType, kind=STATIC, position=-1},
ColumnDefinition{name=s4, type=org.apache.cassandra.db.marshal.IntType, kind=STATIC, position=-1}] is not a subset of [s3]
at org.apache.cassandra.db.Columns$Serializer.encodeBitmap(Columns.java:555)
at org.apache.cassandra.db.Columns$Serializer.serializeSubset(Columns.java:487)
at org.apache.cassandra.db.rows.UnfilteredSerializer.serializeRowBody(UnfilteredSerializer.java:216)
at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:190)
at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:121)
at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:109)
at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:140)
at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:94)
at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:79)
at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:326)
at org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:186)
at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:179)
at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:175)
at org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:75)
at org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:499)
at org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:91)
at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.runUnsafe(AbstractLocalAwareExecutorService.java:194)
at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.runUnsafe(AbstractLocalAwareExecutorService.java:137)
at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:167)
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:122)
at java.lang.Thread.run(Thread.java:748)

The root cause is that CASSANDRA-16686 changed ColumnFilter to be built and deserialized according to the versions the coordinating node believes are running in the cluster. That knowledge is always incorrect when statements are re-prepared on startup, and it may remain incorrect while nodes are still reaching their final version.

Sequence of events:

Prepared statements are persisted in system.prepared_statements to be re-prepared on future startup.

When the 4.x node starts up after upgrade, in org.apache.cassandra.service.CassandraDaemon#setup it calls QueryProcessor.instance.preloadPreparedStatements before the Gossiper is started by a call to StorageService.instance.initServer() later in setup.

As part of preparing statements, when possible a ColumnFilterFactory is created that returns a ColumnFilter built at the time the query is prepared.

After the changes from CASSANDRA-16686, the ColumnFilter builder constructs different column filter variants depending on the lowest version reported in gossip, which it obtains from org.apache.cassandra.gms.Gossiper#upgradeFromVersionMemoized. If this check runs before the Gossiper is enabled, it falls back to SystemKeyspace.CURRENT_VERSION, so the builder creates a column filter as if the cluster were fully upgraded.
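The version-dependent choice described above can be modelled in a few lines. This is only a sketch: the class, the method names, and the "major version 4 or later" threshold are hypothetical simplifications for illustration and are not Cassandra's actual code. Only the filter variant name comes from this report.

```java
import java.util.Optional;

/** Simplified, hypothetical model of version-dependent filter selection; not Cassandra's code. */
final class FilterVariantSketch {
    enum FetchType { ALL_COLUMNS, ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS }

    /**
     * Hypothetical stand-in for the memoized "upgrade from" version.
     * An empty Optional models gossip not having started yet.
     */
    static int effectiveMajorVersion(Optional<Integer> lowestGossipedMajor, int currentMajor) {
        // With no peer information available, the node's own version is assumed.
        return lowestGossipedMajor.orElse(currentMajor);
    }

    static FetchType chooseFetchType(int effectiveMajor) {
        // Assumption for this sketch: only a cluster believed to be entirely on 4.x
        // gets the narrower static-column fetch.
        return effectiveMajor >= 4
             ? FetchType.ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS
             : FetchType.ALL_COLUMNS;
    }

    public static void main(String[] args) {
        int current = 4;
        // Re-prepared at startup, before gossip: built as if the cluster were fully upgraded.
        System.out.println(chooseFetchType(effectiveMajorVersion(Optional.empty(), current)));
        // Prepared once gossip has reported a 3.x peer.
        System.out.println(chooseFetchType(effectiveMajorVersion(Optional.of(3), current)));
    }
}
```

In this model the same statement yields two different filters depending only on when it was prepared, which is the mismatch the report describes.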

For the query above, the ColumnFilter creates an ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS filter.

The participating 3.0.x nodes do not understand the new flag and create the equivalent of a WildCardColumnFilter. The participating 4.x nodes do understand the new flag, but because they know about other 3.0 nodes their deserializer takes the lower-than-3.4 path and also creates a WildCardColumnFilter.
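The two replica-side paths described above end in the same result. The following sketch models that decision with two booleans; the class, enum, and parameter names are hypothetical and the logic is a deliberate simplification, not the real deserializer.

```java
/** Hypothetical model of how a replica interprets the coordinator's filter; not Cassandra's code. */
final class FilterDeserializeSketch {
    enum Filter { WILDCARD, ALL_REGULARS_AND_QUERIED_STATICS }

    /**
     * @param understandsFlag  false models a 3.0.x replica, which does not know the new flag
     * @param knowsOfOldPeers  true models a 4.x replica that still sees pre-3.4 nodes in the cluster
     */
    static Filter deserialize(boolean understandsFlag, boolean knowsOfOldPeers) {
        if (!understandsFlag) {
            return Filter.WILDCARD; // 3.0.x replica: the flag is ignored
        }
        if (knowsOfOldPeers) {
            return Filter.WILDCARD; // 4.x replica taking the lower-than-3.4 compatibility path
        }
        return Filter.ALL_REGULARS_AND_QUERIED_STATICS; // cluster believed fully upgraded
    }

    public static void main(String[] args) {
        System.out.println("3.0.x replica:            " + deserialize(false, true));
        System.out.println("4.x replica, mixed:       " + deserialize(true, true));
        System.out.println("4.x replica, all upgraded: " + deserialize(true, false));
    }
}
```

In a mixed cluster both kinds of replica therefore read with a wildcard filter, while the coordinator expects the narrower set of static columns.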

The fetchedColumns sent by the ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS filter contain only the queried static columns. The pre-3.4 sstable iterator, however, returns all regular and static columns, so serializing the response fails with an IllegalStateException because the returned columns are not a subset of the fetched ones.
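The failing check can be illustrated with a minimal subset encoder. This is a sketch only: the class name and the bit convention are assumptions made for illustration and do not reproduce Columns$Serializer.encodeBitmap. It assumes column names are unique within each list.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified subset encoder; not Cassandra's Columns$Serializer. */
final class SubsetCheckSketch {
    /**
     * Sets bit i when superset column i is present in columns.
     * Throws if columns contains anything that is not in superset.
     */
    static long encodeBitmap(List<String> columns, List<String> superset) {
        long bitmap = 0L;
        int matched = 0;
        for (int i = 0; i < superset.size(); i++) {
            if (columns.contains(superset.get(i))) {
                bitmap |= 1L << i;
                matched++;
            }
        }
        // Any unmatched column means the row carries data the header never declared.
        if (matched != columns.size()) {
            throw new IllegalStateException(columns + " is not a subset of " + superset);
        }
        return bitmap;
    }

    public static void main(String[] args) {
        List<String> fetchedStatics = Arrays.asList("s3");        // what the coordinator asked for
        List<String> returnedStatics = Arrays.asList("s3", "s4"); // what the wildcard read produced
        try {
            encodeBitmap(returnedStatics, fetchedStatics);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // [s3, s4] is not a subset of [s3]
        }
    }
}
```

With the table from the reproduction, the row holds both s3 and s4 while the declared superset holds only s3, which produces the same message as the first stack trace.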

The IllegalStateException stops occurring once every node in the cluster believes the cluster is fully upgraded to the current version; from then on the filters behave as the originally prepared query intended.

Proposed fix

In discussion with ifesdjeen, he suggested that one way to resolve this is to deprecate (or simply remove) the ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS filter so that it is no longer built, and to always select all static columns.
This would leave just WildCardColumnFilter and SelectionColumnFilter with ALL_COLUMNS or ONLY_QUERIED_COLUMNS.

This is a potential performance regression for unusual schemas with very large numbers of static columns, but seems unlikely in practice.
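The effect of the proposal can be sketched by comparing the fetched static columns with what a wildcard read returns. The class, method names, and the boolean switch below are hypothetical and only illustrate the idea; they are not part of any patch.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of always fetching all static columns; not an actual patch. */
final class FetchAllStaticsSketch {
    /** Static columns the coordinator declares as fetched. */
    static List<String> fetchedStatics(List<String> tableStatics,
                                       List<String> queriedStatics,
                                       boolean alwaysFetchAllStatics) {
        return alwaysFetchAllStatics ? tableStatics : queriedStatics;
    }

    /** A response serializes only if everything the replica returned was declared as fetched. */
    static boolean responseSerializable(List<String> returnedByReplica, List<String> fetched) {
        return fetched.containsAll(returnedByReplica);
    }

    public static void main(String[] args) {
        List<String> tableStatics = Arrays.asList("s3", "s4");
        List<String> queriedStatics = Arrays.asList("s3");
        List<String> returned = tableStatics; // a wildcard read returns every static column

        System.out.println("current behaviour ok:  "
            + responseSerializable(returned, fetchedStatics(tableStatics, queriedStatics, false)));
        System.out.println("proposed behaviour ok: "
            + responseSerializable(returned, fetchedStatics(tableStatics, queriedStatics, true)));
    }
}
```

The cost noted above shows up here as the larger fetched list: every static column in the table is declared and read, whether or not the query selected it.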

/cc: blerer

Attachments

Issue Links

is duplicated by

CASSANDRA-19751 IllegalStateException when query on table having static columns during the Cassandra cluster upgrade from 3.11.4 to 4.0.11

Resolved

IllegalStateException with prepared queries selecting static columns in mixed 3.0.x/4.x clusters
