CASSANDRA-15529

AbstractLocalAwareExecutorService.java exceptions after upgrade from 2.1.16 to 3.11.4


Details

    • Bug
    • Status: Open
    • Urgent
    • Resolution: Unresolved
    • None
    • Cluster/Schema
    • Correctness - Recoverable Corruption / Loss
    • Critical
    • Challenging
    • User Report
    • OpenJDK, Linux

    Description

      Hello Team, 

      We have a cluster running Cassandra 3.11.4.

      The following is the schema of the tables used in our system.

      cqlsh> desc KEYSPACE "SAL"
        
        CREATE KEYSPACE "SAL" WITH replication = {'class': 'NetworkTopologyStrategy', 'DC_EAST': '3', 'DC_WEST': '3'}  AND durable_writes = true;
        
        CREATE TABLE "SAL".sal_purge (
            key text,
            column1 text,
            column2 text,
            value text,
            PRIMARY KEY (key, column1, column2)
        ) WITH COMPACT STORAGE
            AND CLUSTERING ORDER BY (column1 ASC, column2 ASC)
            AND bloom_filter_fp_chance = 0.1
            AND caching = '{"keys":"NONE", "rows_per_partition":"NONE"}'
            AND comment = 'Holds items to be removed as [shardid][salid][timestamp]. The table records SALIDs to be deleted along with their deletion times (which may be modified)'
            AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
            AND compression = {'chunk_length_kb': '64', 'sstable_compression': 'org.apache.cassandra.io.compress.SnappyCompressor'}
            AND dclocal_read_repair_chance = 0.0
            AND default_time_to_live = 0
            AND gc_grace_seconds = 864000
            AND max_index_interval = 2048
            AND memtable_flush_period_in_ms = 0
            AND min_index_interval = 128
            AND read_repair_chance = 0.1
            AND speculative_retry = '99.0PERCENTILE';
        
        CREATE TABLE "SAL".sal_ref (
            key text,
            column1 text,
            column2 text,
            value text,
            PRIMARY KEY (key, column1, column2)
        ) WITH COMPACT STORAGE
            AND CLUSTERING ORDER BY (column1 ASC, column2 ASC)
            AND bloom_filter_fp_chance = 0.025
            AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
            AND comment = 'Holds owner references to content as [salid][lcid/opid]'
            AND compaction = {'sstable_size_in_mb': '180', 'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
            AND compression = {'chunk_length_kb': '64', 'sstable_compression': 'org.apache.cassandra.io.compress.SnappyCompressor'}
            AND dclocal_read_repair_chance = 0.0
            AND default_time_to_live = 0
            AND gc_grace_seconds = 864000
            AND max_index_interval = 2048
            AND memtable_flush_period_in_ms = 0
            AND min_index_interval = 128
            AND read_repair_chance = 0.0
            AND speculative_retry = '99.0PERCENTILE';
      
      

      Things to note:

      1. column2 is always passed a null value during insertion.
      2. column2 is part of the primary key.
      3. Range selects and range deletes are done through our app.

      Initially the cluster was on Cassandra version 2.1.16 and was recently upgraded to 3.11.4. Since the upgrade, we see nodes going down, and they log the exceptions below during startup and even after the node is up. This one node is causing the whole cluster to behave improperly.

      WARN [Native-Transport-Requests-47] 2020-01-29 13:49:05,190 AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread Thread[Native-Transport-Requests-47,5,main]: {}
      java.lang.RuntimeException: java.lang.IllegalStateException: UnfilteredRowIterator for SAL.sal_purge has an open RT bound as its last item
          at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2588) ~[apache-cassandra-3.11.4.jar:3.11.4]
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0-internal]
          at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) [apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134) [apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.concurrent.SEPExecutor.maybeExecuteImmediately(SEPExecutor.java:194) [apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.service.AbstractReadExecutor.makeRequests(AbstractReadExecutor.java:117) [apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.service.AbstractReadExecutor.makeDataRequests(AbstractReadExecutor.java:85) [apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.service.AbstractReadExecutor$SpeculatingReadExecutor.executeAsync(AbstractReadExecutor.java:271) [apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.service.StorageProxy$SinglePartitionReadLifecycle.doInitialQueries(StorageProxy.java:1778) [apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1731) [apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1671) [apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1586) [apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.db.SinglePartitionReadCommand$Group.execute(SinglePartitionReadCommand.java:1209) [apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:315) [apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:285) [apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:117) [apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:225) [apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:532) [apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:509) [apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:146) [apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:566) [apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410) [apache-cassandra-3.11.4.jar:3.11.4]
          at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.44.Final.jar:4.0.44.Final]
          at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.0.44.Final.jar:4.0.44.Final]
          at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) [netty-all-4.0.44.Final.jar:4.0.44.Final]
          at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348) [netty-all-4.0.44.Final.jar:4.0.44.Final]
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0-internal]
          at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) [apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:114) [apache-cassandra-3.11.4.jar:3.11.4]
          at java.lang.Thread.run(Thread.java:748) [na:1.8.0-internal]
      Caused by: java.lang.IllegalStateException: UnfilteredRowIterator for SAL.sal_purge has an open RT bound as its last item
          at org.apache.cassandra.db.transform.RTBoundCloser$RowsTransformation.moreContents(RTBoundCloser.java:109) ~[apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.db.transform.RTBoundCloser$RowsTransformation.moreContents(RTBoundCloser.java:63) ~[apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.db.transform.BaseIterator.tryGetMoreContents(BaseIterator.java:121) ~[apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.db.transform.BaseIterator.hasMoreContents(BaseIterator.java:111) ~[apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:159) ~[apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:136) ~[apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:92) ~[apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:79) ~[apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:308) ~[apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:187) ~[apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:180) ~[apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:176) ~[apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:76) ~[apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:353) ~[apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1876) ~[apache-cassandra-3.11.4.jar:3.11.4]
          at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2584) ~[apache-cassandra-3.11.4.jar:3.11.4]
          ... 29 common frames omitted
          Suppressed: java.lang.IllegalStateException: PROCESSED UnfilteredRowIterator for SAL.sal_purge has an illegal RT bounds sequence: expected all RTs to be closed, but the last one is open
              at org.apache.cassandra.db.transform.RTBoundValidator$RowsTransformation.ise(RTBoundValidator.java:120) ~[apache-cassandra-3.11.4.jar:3.11.4]
              at org.apache.cassandra.db.transform.RTBoundValidator$RowsTransformation.onPartitionClose(RTBoundValidator.java:113) ~[apache-cassandra-3.11.4.jar:3.11.4]
              at org.apache.cassandra.db.transform.BaseRows.runOnClose(BaseRows.java:91) ~[apache-cassandra-3.11.4.jar:3.11.4]
              at org.apache.cassandra.db.transform.BaseIterator.close(BaseIterator.java:86) ~[apache-cassandra-3.11.4.jar:3.11.4]
              at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:309) ~[apache-cassandra-3.11.4.jar:3.11.4]
              ... 36 common frames omitted
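
For readers unfamiliar with the message, the sketch below illustrates the kind of invariant the two exceptions describe: within a partition, every range tombstone (RT) that is opened has to be closed before the iterator ends. This is a simplified Python illustration only, not Cassandra's code. The marker kinds (`OPEN`, `CLOSE`, `ROW`) and the function `check_rt_bounds` are hypothetical names made up for this example, and it ignores the combined close-and-open boundary markers that Cassandra also uses.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical marker kinds for this sketch; these are not Cassandra class names.
OPEN, CLOSE, ROW = "open", "close", "row"


def check_rt_bounds(items: Iterable[Tuple[str, str]]) -> None:
    """Raise ValueError if range tombstone bounds are not properly paired.

    Each item is a (kind, clustering) pair in clustering order. An OPEN must
    be matched by a CLOSE before the next OPEN, and nothing may remain open
    once the sequence ends.
    """
    open_at: Optional[str] = None
    for kind, clustering in items:
        if kind == OPEN:
            if open_at is not None:
                raise ValueError(
                    f"RT opened at {clustering!r} while {open_at!r} is still open"
                )
            open_at = clustering
        elif kind == CLOSE:
            if open_at is None:
                raise ValueError(
                    f"RT closed at {clustering!r} without a matching open"
                )
            open_at = None
        # ROW items do not change the open/closed state.
    if open_at is not None:
        raise ValueError(
            f"open RT bound as its last item (opened at {open_at!r})"
        )


# A well-formed sequence passes silently.
check_rt_bounds([(OPEN, "a"), (ROW, "b"), (CLOSE, "c")])

# A sequence that ends on an open bound is rejected, as in the log above.
try:
    check_rt_bounds([(ROW, "a"), (OPEN, "b")])
except ValueError as err:
    print(err)
```

In this simplified model, the failure in the log corresponds to the second case: the data read for SAL.sal_purge ends with an opening bound that never gets a matching close.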
      

      Attachments

        1. sstable_dump.txt
          4 kB
          Pooja Nair

          People

            Assignee: Unassigned
            Reporter: Pooja Nair