Details
-
Improvement
-
Status: Open
-
Normal
-
Resolution: Unresolved
-
None
Description
If large amount on entries exists in table, querying on partitioning key and (low cardinality) secondary index is significantly slower than just on the partitioning key, even with limit 1. It seems to do a full scan on both indexes and combine and filter this result.
In the example below, there is roughly 400M records on a given node - as you can see the two queries yield the same result, with the refining secondary key condition increasing the execution time roughly 35x. It seems to me that plain filtering of the pk query result should be more efficient especially with low cardinality secondary indexes (but it might be an assumption based on my modeling approach).
[cqlsh 4.1.1 | Cassandra 2.0.14 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
CREATE TABLE fil ( nm text, dir boolean, id timeuuid, attr map<text, text>, cntt text, del boolean, exp timestamp, gid text, perm int, seg map<timeuuid, bigint>, size bigint, uid text, PRIMARY KEY ((nm), dir, id) ) WITH CLUSTERING ORDER BY (dir ASC, id DESC) AND bloom_filter_fp_chance=0.010000 AND caching='KEYS_ONLY' AND comment='table used to store file information' AND dclocal_read_repair_chance=0.100000 AND gc_grace_seconds=864000 AND index_interval=128 AND read_repair_chance=0.000000 AND replicate_on_write='true' AND populate_io_cache_on_flush='false' AND default_time_to_live=0 AND speculative_retry='99.0PERCENTILE' AND memtable_flush_period_in_ms=0 AND compaction={'class': 'SizeTieredCompactionStrategy'} AND compression={'chunk_length_kb': '512', 'crc_check_chance': '0.5', 'sstable_compression': 'SnappyCompressor'}; CREATE INDEX fil_del_idx ON fil (del); cqlsh:pronto> select id, nm, attr, seg, exp, dir, uid, gid, perm, cntt from fil where nm='/dir_108/aePzC_62755108' and dir=false and del=false limit 1; id | nm | attr | seg | exp | dir | uid | gid | perm | cntt --------------------------------------+-------------------------+------+-----------------------------------------------+--------------------------+-------+-----------+------------+------+-------------------------- e22d17f3-7f77-11e5-8313-0522d849d63b | /dir_108/aePzC_62755108 | null | {e23d6ba0-7f77-11e5-8313-0522d849d63b: 22528} | 2016-10-29 19:34:17-0700 | False | test_user | test_group | 1911 | application/octet-stream (1 rows) Tracing session: e0287850-88cd-11e5-a7d5-2360c9ef4b33 activity | timestamp | source | source_elapsed ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------+--------------+---------------- execute_cql3_query | 15:42:31,643 | 10.65.230.23 | 0 Parsing select id, nm, attr, seg, exp, dir, uid, gid, perm, cntt from fil where nm='/dir_108/aePzC_62755108' and dir=false and del=false limit 1; | 15:42:31,643 | 10.65.230.23 | 105 Preparing statement | 15:42:31,644 | 10.65.230.23 | 332 Determining replicas to query | 15:42:31,644 | 10.65.230.23 | 607 Candidate index mean cardinalities are CompositesIndexOnRegular{columnDefs=[ColumnDefinition{name=64656c, validator=org.apache.cassandra.db.marshal.BooleanType, type=REGULAR, componentIndex=2, indexName=fil_del_idx, indexType=COMPOSITES}]}:43707902. Scanning with fil.fil_del_idx. | 15:42:31,644 | 10.65.230.23 | 1031 Executing indexed scan for [/dir_108/aePzC_62755108, /dir_108/aePzC_62755108] | 15:42:31,644 | 10.65.230.23 | 1077 Candidate index mean cardinalities are CompositesIndexOnRegular{columnDefs=[ColumnDefinition{name=64656c, validator=org.apache.cassandra.db.marshal.BooleanType, type=REGULAR, componentIndex=2, indexName=fil_del_idx, indexType=COMPOSITES}]}:43707902. Scanning with fil.fil_del_idx. | 15:42:31,644 | 10.65.230.23 | 1157 Executing single-partition query on fil.fil_del_idx | 15:42:31,644 | 10.65.230.23 | 1221 Acquiring sstable references | 15:42:31,644 | 10.65.230.23 | 1239 Merging memtable tombstones | 15:42:31,645 | 10.65.230.23 | 1323 Partition index with 1675 entries found for sstable 3675 | 15:42:31,654 | 10.65.230.23 | 10309 Seeking to partition indexed section in data file | 15:42:31,654 | 10.65.230.23 | 10325 Partition index with 2132 entries found for sstable 3582 | 15:42:31,657 | 10.65.230.23 | 13605 Seeking to partition indexed section in data file | 15:42:31,657 | 10.65.230.23 | 13619 Partition index with 1800 entries found for sstable 3565 | 15:42:31,661 | 10.65.230.23 | 17657 Seeking to partition indexed section in data file | 15:42:31,661 | 10.65.230.23 | 17698 Partition index with 28922 entries found for sstable 3553 | 15:42:31,723 | 10.65.230.23 | 79557 Seeking to partition indexed section in data file | 15:42:31,723 | 10.65.230.23 | 79612 Partition index with 32092 entries found for sstable 3359 | 15:42:31,779 | 10.65.230.23 | 135423 Seeking to partition indexed section in data file | 15:42:31,779 | 10.65.230.23 | 135484 Partition index with 32148 entries found for sstable 3097 | 15:42:31,858 | 10.65.230.23 | 214790 Seeking to partition indexed section in data file | 15:42:31,858 | 10.65.230.23 | 214856 Partition index with 127470 entries found for sstable 2811 | 15:42:32,040 | 10.65.230.23 | 397123 Seeking to partition indexed section in data file | 15:42:32,040 | 10.65.230.23 | 397196 Partition index with 136232 entries found for sstable 1769 | 15:42:32,286 | 10.65.230.23 | 642508 Seeking to partition indexed section in data file | 15:42:32,286 | 10.65.230.23 | 642577 Partition index with 15131 entries found for sstable 938 | 15:42:32,320 | 10.65.230.23 | 676574 Seeking to partition indexed section in data file | 15:42:32,320 | 10.65.230.23 | 676613 Skipped 0/9 non-slice-intersecting sstables, included 0 due to tombstones | 15:42:32,320 | 10.65.230.23 | 676646 Merging data from memtables and 9 sstables | 15:42:32,320 | 10.65.230.23 | 676660 Read 1 live and 0 tombstoned cells | 15:42:32,351 | 10.65.230.23 | 707674 Executing single-partition query on fil | 15:42:32,351 | 10.65.230.23 | 707862 Acquiring sstable references | 15:42:32,351 | 10.65.230.23 | 707872 Merging memtable tombstones | 15:42:32,351 | 10.65.230.23 | 707893 Bloom filter allows skipping sstable 3391 | 15:42:32,351 | 10.65.230.23 | 707973 Bloom filter allows skipping sstable 3387 | 15:42:32,351 | 10.65.230.23 | 707992 Partition index with 0 entries found for sstable 3385 | 15:42:32,351 | 10.65.230.23 | 708071 Seeking to partition indexed section in data file | 15:42:32,351 | 10.65.230.23 | 708079 Skipped 10/13 non-slice-intersecting sstables, included 0 due to tombstones | 15:42:32,352 | 10.65.230.23 | 709028 Merging data from memtables and 1 sstables | 15:42:32,352 | 10.65.230.23 | 709045 Read 1 live and 0 tombstoned cells | 15:42:32,352 | 10.65.230.23 | 709090 Scanned 1 rows and matched 1 | 15:42:32,352 | 10.65.230.23 | 709211 Request complete | 15:42:32,352 | 10.65.230.23 | 709492 cqlsh:pronto> select id, nm, attr, seg, exp, dir, uid, gid, perm, cntt from fil where nm='/dir_108/aePzC_62755108' and dir=false limit 1; id | nm | attr | seg | exp | dir | uid | gid | perm | cntt --------------------------------------+-------------------------+------+-----------------------------------------------+--------------------------+-------+-----------+------------+------+-------------------------- e22d17f3-7f77-11e5-8313-0522d849d63b | /dir_108/aePzC_62755108 | null | {e23d6ba0-7f77-11e5-8313-0522d849d63b: 22528} | 2016-10-29 19:34:17-0700 | False | test_user | test_group | 1911 | application/octet-stream (1 rows) Tracing session: e69202b0-88cd-11e5-a7d5-2360c9ef4b33 activity | timestamp | source | source_elapsed -------------------------------------------------------------------------------------------------------------------------------------+--------------+--------------+---------------- execute_cql3_query | 15:42:42,382 | 10.65.230.23 | 0 Parsing select id, nm, attr, seg, exp, dir, uid, gid, perm, cntt from fil where nm='/dir_108/aePzC_62755108' and dir=false limit 1; | 15:42:42,382 | 10.65.230.23 | 112 Preparing statement | 15:42:42,382 | 10.65.230.23 | 333 Executing single-partition query on fil | 15:42:42,383 | 10.65.230.23 | 940 Acquiring sstable references | 15:42:42,383 | 10.65.230.23 | 966 Merging memtable tombstones | 15:42:42,383 | 10.65.230.23 | 1000 Bloom filter allows skipping sstable 3471 | 15:42:42,383 | 10.65.230.23 | 1121 Bloom filter allows skipping sstable 3432 | 15:42:42,383 | 10.65.230.23 | 1170 Bloom filter allows skipping sstable 3391 | 15:42:42,383 | 10.65.230.23 | 1199 Bloom filter allows skipping sstable 3387 | 15:42:42,383 | 10.65.230.23 | 1227 Bloom filter allows skipping sstable 3369 | 15:42:42,383 | 10.65.230.23 | 1317 Bloom filter allows skipping sstable 3364 | 15:42:42,383 | 10.65.230.23 | 1353 Bloom filter allows skipping sstable 3358 | 15:42:42,383 | 10.65.230.23 | 1387 Key cache hit for sstable 3385 | 15:42:42,383 | 10.65.230.23 | 1432 Seeking to partition indexed section in data file | 15:42:42,383 | 10.65.230.23 | 1443 Message received from /10.65.230.23 | 15:42:42,386 | 10.65.230.20 | 37 Sending message to /10.65.230.20 | 15:42:42,388 | 10.65.230.23 | 5886 Bloom filter allows skipping sstable 2984 | 15:42:42,390 | 10.65.230.23 | 7679 Bloom filter allows skipping sstable 2535 | 15:42:42,390 | 10.65.230.23 | 7737 Bloom filter allows skipping sstable 2452 | 15:42:42,390 | 10.65.230.23 | 7780 Bloom filter allows skipping sstable 2058 | 15:42:42,390 | 10.65.230.23 | 7824 Executing single-partition query on fil | 15:42:42,392 | 10.65.230.20 | 6571 Acquiring sstable references | 15:42:42,392 | 10.65.230.20 | 6884 Merging memtable tombstones | 15:42:42,392 | 10.65.230.20 | 6931 Bloom filter allows skipping sstable 1695 | 15:42:42,398 | 10.65.230.23 | 16437 Skipped 0/13 non-slice-intersecting sstables, included 0 due to tombstones | 15:42:42,398 | 10.65.230.23 | 16460 Merging data from memtables and 1 sstables | 15:42:42,398 | 10.65.230.23 | 16472 Read 1 live and 0 tombstoned cells | 15:42:42,398 | 10.65.230.23 | 16530 Bloom filter allows skipping sstable 3415 | 15:42:42,404 | 10.65.230.20 | 18138 Bloom filter allows skipping sstable 3393 | 15:42:42,404 | 10.65.230.20 | 18186 Partition index with 0 entries found for sstable 3359 | 15:42:42,420 | 10.65.230.20 | 34457 Seeking to partition indexed section in data file | 15:42:42,420 | 10.65.230.20 | 34544 Skipped 0/3 non-slice-intersecting sstables, included 0 due to tombstones | 15:42:42,444 | 10.65.230.20 | 58228 Merging data from memtables and 1 sstables | 15:42:42,444 | 10.65.230.20 | 58374 Read 1 live and 0 tombstoned cells | 15:42:42,444 | 10.65.230.20 | 58570 Enqueuing response to /10.65.230.23 | 15:42:42,468 | 10.65.230.20 | 82602 Sending message to /10.65.230.23 | 15:42:42,468 | 10.65.230.20 | 82775 Message received from /10.65.230.20 | 15:42:42,472 | 10.65.230.23 | null Processing response from /10.65.230.20 | 15:42:42,473 | 10.65.230.23 | null Request complete | 15:42:42,402 | 10.65.230.23 | 20725
also this doesn't manifest much on new entries (close to top of the secondary index?) and gets worse with older entries.