  1. Cassandra
  2. CASSANDRA-11525

StaticTokenTreeBuilder should respect possibility of duplicate tokens


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Normal
    • Resolution: Fixed
    • Fix Version/s: 3.5
    • Component/s: Feature/SASI
    • Labels: None
    • Environment: Cassandra 3.5-SNAPSHOT
    • Severity: Normal

    Description

      Bug reproduced in Cassandra 3.5-SNAPSHOT (after the OOM fix).

      create table if not exists test.resource_bench ( 
       dsr_id uuid,
       rel_seq bigint,
       seq bigint,
       dsp_code varchar,
       model_code varchar,
       media_code varchar,
       transfer_code varchar,
       commercial_offer_code varchar,
       territory_code varchar,
       period_end_month_int int,
       authorized_societies_txt text,
       rel_type text,
       status text,
       dsp_release_code text,
       title text,
       contributors_name list<text>,
       unic_work text,
       paying_net_qty bigint,
      PRIMARY KEY ((dsr_id, rel_seq), seq)
      ) WITH CLUSTERING ORDER BY (seq ASC); 
      
      CREATE CUSTOM INDEX resource_period_end_month_int_idx ON test.resource_bench (period_end_month_int) USING 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {'mode': 'PREFIX'};
      

      So the index is a DENSE numerical index.

      When running the query SELECT dsp_code, unic_work, paying_net_qty FROM test.resource_bench WHERE period_end_month_int = 201401 with server-side paging, I ran into this stack trace:

      WARN  [SharedPool-Worker-1] 2016-04-06 00:00:30,825 AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread Thread[SharedPool-Worker-1,5,main]: {}
      java.lang.ArrayIndexOutOfBoundsException: -55
      	at org.apache.cassandra.db.ClusteringPrefix$Serializer.deserialize(ClusteringPrefix.java:268) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
      	at org.apache.cassandra.db.Serializers$2.deserialize(Serializers.java:128) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
      	at org.apache.cassandra.db.Serializers$2.deserialize(Serializers.java:120) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
      	at org.apache.cassandra.io.sstable.IndexHelper$IndexInfo$Serializer.deserialize(IndexHelper.java:148) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
      	at org.apache.cassandra.db.RowIndexEntry$Serializer.deserialize(RowIndexEntry.java:218) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
      	at org.apache.cassandra.io.sstable.format.SSTableReader.keyAt(SSTableReader.java:1823) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
      	at org.apache.cassandra.index.sasi.SSTableIndex$DecoratedKeyFetcher.apply(SSTableIndex.java:168) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
      	at org.apache.cassandra.index.sasi.SSTableIndex$DecoratedKeyFetcher.apply(SSTableIndex.java:155) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
      	at org.apache.cassandra.index.sasi.disk.TokenTree$KeyIterator.computeNext(TokenTree.java:518) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
      	at org.apache.cassandra.index.sasi.disk.TokenTree$KeyIterator.computeNext(TokenTree.java:504) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
      	at org.apache.cassandra.index.sasi.utils.AbstractIterator.tryToComputeNext(AbstractIterator.java:116) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
      	at org.apache.cassandra.index.sasi.utils.AbstractIterator.hasNext(AbstractIterator.java:110) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
      	at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:374) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
      	at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:186) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
      	at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:155) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
      	at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
      	at org.apache.cassandra.index.sasi.plan.QueryPlan$ResultIterator.computeNext(QueryPlan.java:106) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
      	at org.apache.cassandra.index.sasi.plan.QueryPlan$ResultIterator.computeNext(QueryPlan.java:71) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
      	at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
      	at org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:72) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
      	at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:289) ~[apache-cassandra-3.5-SNAPSHOT.jar:3.5-SNAPSHOT]
      

      There are two possible root causes:

      1. The index is corrupted
      2. The raw SSTable is corrupted

      To rule out scenario 1, I dropped and rebuilt the index many times, but the exception was still there. I then modified the method SSTableReader.keyAt(long indexPosition) to log the impacted partition:

                  try
                  {
                      if (isKeyCacheSetup())
                          cacheKey(key, rowIndexEntrySerializer.deserialize(in));
                  } catch (IndexOutOfBoundsException ex)
                  {
                      logger.error(String.format(
                      "Error when reading index entry for token '%s' at indexPosition %s ",
                      key.getToken().getTokenValue(), indexPosition));
                  }
      

      Below is the log output after the code modification:

      system_ns3038406.ip-5-39-72.eu.log:ERROR [SharedPool-Worker-1] 2016-04-07 17:08:28,843 SSTableReader.java:1830 - Error when reading index entry for token '-7005474773654630139' at indexPosition 2147457128
      system_ns3038406.ip-5-39-72.eu.log:ERROR [SharedPool-Worker-1] 2016-04-07 17:08:28,917 SSTableReader.java:1830 - Error when reading index entry for token '-5016711186446865616' at indexPosition 2147458268
      system_ns3038406.ip-5-39-72.eu.log:ERROR [SharedPool-Worker-1] 2016-04-07 17:08:28,918 SSTableReader.java:1830 - Error when reading index entry for token '1027994831942941747' at indexPosition 2147459218
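
      To collect the affected tokens and index positions from a larger log, the error lines above could be parsed with something like the following. This is only a minimal sketch: the class IndexErrorLogParser and its Entry type are hypothetical helpers, not part of Cassandra, and the regular expression assumes the exact message format of the logging patch shown earlier.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Cassandra): extracts the token and the
// index position from the custom error lines written by the patched keyAt().
final class IndexErrorLogParser
{
    // Assumes the message format used in the String.format call of the patch.
    private static final Pattern ENTRY = Pattern.compile(
        "Error when reading index entry for token '(-?\\d+)' at indexPosition (\\d+)");

    static final class Entry
    {
        final long token;
        final long indexPosition;

        Entry(long token, long indexPosition)
        {
            this.token = token;
            this.indexPosition = indexPosition;
        }
    }

    // Returns one Entry per matching line; non-matching lines are skipped.
    static List<Entry> parse(List<String> lines)
    {
        List<Entry> entries = new ArrayList<>();
        for (String line : lines)
        {
            Matcher m = ENTRY.matcher(line);
            if (m.find())
                entries.add(new Entry(Long.parseLong(m.group(1)), Long.parseLong(m.group(2))));
        }
        return entries;
    }

    public static void main(String[] args)
    {
        List<String> sample = Arrays.asList(
            "ERROR [SharedPool-Worker-1] 2016-04-07 17:08:28,843 SSTableReader.java:1830 - "
            + "Error when reading index entry for token '-7005474773654630139' at indexPosition 2147457128");

        for (Entry e : parse(sample))
            System.out.println("token=" + e.token + " indexPosition=" + e.indexPosition);
    }
}
```

      The token is read as a long because the tokens in this report are signed 64-bit values.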
      

      I double-checked the original C* data using cqlsh, but there appears to be no data for those tokens:

      SELECT dsr_id,rel_seq FROM resource_bench WHERE token(dsr_id,rel_seq)=-7005474773654630139;
      
       dsr_id | rel_seq
      --------+---------
      
      (0 rows)
       SELECT dsr_id,rel_seq FROM resource_bench WHERE token(dsr_id,rel_seq)=-5016711186446865616;
      
       dsr_id | rel_seq
      --------+---------
      
      (0 rows)
      SELECT dsr_id,rel_seq FROM resource_bench WHERE token(dsr_id,rel_seq)=1027994831942941747;
      
       dsr_id | rel_seq
      --------+---------
      
      (0 rows)
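
      With more than a handful of suspect tokens, the cqlsh checks above could be generated rather than typed by hand. The sketch below is an assumption-laden illustration: TokenLookupQueries is a hypothetical helper, not part of Cassandra, and it only builds the query strings in the same shape as the ones run above. It does not execute them.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Cassandra): builds the token-lookup
// statements used above so they can be pasted into cqlsh.
final class TokenLookupQueries
{
    // Builds: SELECT <pk cols> FROM <table> WHERE token(<pk cols>)=<token>;
    static String lookup(String table, List<String> partitionKeyColumns, long token)
    {
        String cols = String.join(",", partitionKeyColumns);
        return "SELECT " + cols + " FROM " + table + " WHERE token(" + cols + ")=" + token + ";";
    }

    public static void main(String[] args)
    {
        // Tokens taken from the error log in this report.
        long[] tokens = { -7005474773654630139L, -5016711186446865616L, 1027994831942941747L };
        List<String> partitionKey = Arrays.asList("dsr_id", "rel_seq");

        for (long token : tokens)
            System.out.println(lookup("resource_bench", partitionKey, token));
    }
}
```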
      

      /cc xedin beobal

          People

            jwest Jordan West
            doanduyhai DuyHai Doan
            Jordan West
            Pavel Yaskevich