Details
Bug
Status: Resolved
Low
Resolution: Fixed
None
Low
Description
Apparently, when Cassandra starts up, any SASI index that does not index a value in every live SSTable gets rebuilt. The offending code can be found in the constructor of SASIIndex.
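The behavior described above can be modelled with a minimal sketch. This is not the actual SASIIndex constructor: the class name, method name and parameters below are hypothetical, and the sketch assumes that no index file is written for an SSTable holding no value for the indexed column, so a startup check of the form "no index file for this SSTable means rebuild" fires on every restart.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model of the startup check; none of these
// names come from the Cassandra code base.
public class RebuildSketch {

    // Returns the SSTables for which a rebuild would be scheduled under the
    // assumed rule: an SSTable without an index file on disk gets rebuilt.
    static Set<String> sstablesToRebuild(List<String> liveSSTables,
                                         Set<String> sstablesWithIndexFile) {
        Set<String> toRebuild = new LinkedHashSet<>();
        for (String sstable : liveSSTables) {
            if (!sstablesWithIndexFile.contains(sstable)) {
                toRebuild.add(sstable);
            }
        }
        return toRebuild;
    }

    public static void main(String[] args) {
        List<String> live = List.of("mc-1-big");

        // Column b has a value in mc-1-big, so an index file exists for it.
        System.out.println("b: " + sstablesToRebuild(live, Set.of("mc-1-big")));

        // Column c has no value in mc-1-big, so (by assumption) no index file
        // was ever written and the check selects the SSTable again each time.
        System.out.println("c: " + sstablesToRebuild(live, Set.of()));
    }
}
```

In this model the rebuild for column c can never converge, because rebuilding an SSTable that holds no value for c still leaves no index file behind.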
You can easily reproduce it:
CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true;

CREATE TABLE test.test (
    a text PRIMARY KEY,
    b text,
    c text
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

CREATE CUSTOM INDEX test_b_idx ON test.test (b) USING 'org.apache.cassandra.index.sasi.SASIIndex';
CREATE CUSTOM INDEX test_c_idx ON test.test (c) USING 'org.apache.cassandra.index.sasi.SASIIndex';

INSERT INTO test.test (a, b) VALUES ('a', 'b');
Log (I added additional traces):
INFO [main] 2016-11-28 15:32:21,191 ColumnFamilyStore.java:406 - Initializing test.test
DEBUG [SSTableBatchOpen:1] 2016-11-28 15:32:21,192 SSTableReader.java:505 - Opening /mnt/ssd/tmp/data/data/test/test-229e6380b57711e68407158fde22e121/mc-1-big (0.034KiB)
DEBUG [main] 2016-11-28 15:32:21,194 SASIIndex.java:118 - index: org.apache.cassandra.schema.IndexMetadata@2f661b1a[id=6b00489b-7010-396e-9348-9f32f5167f88,name=test_b_idx,kind=CUSTOM,options={class_name=org.apache.cassandra.index.sasi.SASIIndex, target=b}], base CFS(Keyspace='test', ColumnFamily='test'), tracker org.apache.cassandra.db.lifecycle.Tracker@15900b83
INFO [main] 2016-11-28 15:32:21,194 DataTracker.java:152 - SSTableIndex.open(column: b, minTerm: value, maxTerm: value, minKey: key, maxKey: key, sstable: BigTableReader(path='/mnt/ssd/tmp/data/data/test/test-229e6380b57711e68407158fde22e121/mc-1-big-Data.db'))
DEBUG [main] 2016-11-28 15:32:21,195 SASIIndex.java:129 - Rebuilding SASI Indexes: {}
DEBUG [main] 2016-11-28 15:32:21,195 ColumnFamilyStore.java:895 - Enqueuing flush of IndexInfo: 0.386KiB (0%) on-heap, 0.000KiB (0%) off-heap
DEBUG [PerDiskMemtableFlushWriter_0:1] 2016-11-28 15:32:21,204 Memtable.java:465 - Writing Memtable-IndexInfo@748981977(0.054KiB serialized bytes, 1 ops, 0%/0% of on/off-heap limit), flushed range = (min(-9223372036854775808), max(9223372036854775807)]
DEBUG [PerDiskMemtableFlushWriter_0:1] 2016-11-28 15:32:21,204 Memtable.java:494 - Completed flushing /mnt/ssd/tmp/data/data/system/IndexInfo-9f5c6374d48532299a0a5094af9ad1e3/mc-4256-big-Data.db (0.035KiB) for commitlog position CommitLogPosition(segmentId=1480343535479, position=15652)
DEBUG [MemtableFlushWriter:1] 2016-11-28 15:32:21,224 ColumnFamilyStore.java:1200 - Flushed to [BigTableReader(path='/mnt/ssd/tmp/data/data/system/IndexInfo-9f5c6374d48532299a0a5094af9ad1e3/mc-4256-big-Data.db')] (1 sstables, 4.838KiB), biggest 4.838KiB, smallest 4.838KiB
DEBUG [main] 2016-11-28 15:32:21,224 SASIIndex.java:118 - index: org.apache.cassandra.schema.IndexMetadata@12f3d291[id=45fcb286-b87a-3d18-a04b-b899a9880c91,name=test_c_idx,kind=CUSTOM,options={class_name=org.apache.cassandra.index.sasi.SASIIndex, target=c}], base CFS(Keyspace='test', ColumnFamily='test'), tracker org.apache.cassandra.db.lifecycle.Tracker@15900b83
DEBUG [main] 2016-11-28 15:32:21,224 SASIIndex.java:121 - to rebuild: index: BigTableReader(path='/mnt/ssd/tmp/data/data/test/test-229e6380b57711e68407158fde22e121/mc-1-big-Data.db'), sstable: org.apache.cassandra.index.sasi.conf.ColumnIndex@6cbb6b0e
DEBUG [main] 2016-11-28 15:32:21,224 SASIIndex.java:129 - Rebuilding SASI Indexes: {BigTableReader(path='/mnt/ssd/tmp/data/data/test/test-229e6380b57711e68407158fde22e121/mc-1-big-Data.db')={c=org.apache.cassandra.index.sasi.conf.ColumnIndex@6cbb6b0e}}
DEBUG [main] 2016-11-28 15:32:21,225 ColumnFamilyStore.java:895 - Enqueuing flush of IndexInfo: 0.386KiB (0%) on-heap, 0.000KiB (0%) off-heap
DEBUG [PerDiskMemtableFlushWriter_0:2] 2016-11-28 15:32:21,235 Memtable.java:465 - Writing Memtable-IndexInfo@951411443(0.054KiB serialized bytes, 1 ops, 0%/0% of on/off-heap limit), flushed range = (min(-9223372036854775808), max(9223372036854775807)]
DEBUG [PerDiskMemtableFlushWriter_0:2] 2016-11-28 15:32:21,235 Memtable.java:494 - Completed flushing /mnt/ssd/tmp/data/data/system/IndexInfo-9f5c6374d48532299a0a5094af9ad1e3/mc-4257-big-Data.db (0.035KiB) for commitlog position CommitLogPosition(segmentId=1480343535479, position=15720)
DEBUG [MemtableFlushWriter:2] 2016-11-28 15:32:21,254 ColumnFamilyStore.java:1200 - Flushed to [BigTableReader(path='/mnt/ssd/tmp/data/data/system/IndexInfo-9f5c6374d48532299a0a5094af9ad1e3/mc-4257-big-Data.db')] (1 sstables, 4.836KiB), biggest 4.836KiB, smallest 4.836KiB
I think a better behavior would be to ask users to explicitly rebuild indexes if they remove the index files. That is fine as long as we correctly handle the case of new indexes.