Uploaded image for project: 'Apache Cassandra'
  1. Apache Cassandra
  2. CASSANDRA-8180

Optimize disk seek using min/max column name meta data when the LIMIT clause is used

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Low
    • Resolution: Fixed
    • 3.4
    • None
    • Cassandra 2.0.10

    Description

      I was working on an example of sensor data table (timeseries) and face a use case where C* does not optimize read on disk.

      cqlsh:test> CREATE TABLE test(id int, col int, val text, PRIMARY KEY(id,col)) WITH CLUSTERING ORDER BY (col DESC);
      cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 10, '10');
      ...
      >nodetool flush test test
      ...
      cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 20, '20');
      ...
      >nodetool flush test test
      ...
      cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 30, '30');
      ...
      >nodetool flush test test
      

      After that, I activate request tracing:

      cqlsh:test> SELECT * FROM test WHERE id=1 LIMIT 1;
       activity                                                                  | timestamp    | source    | source_elapsed
      ---------------------------------------------------------------------------+--------------+-----------+----------------
                                                              execute_cql3_query | 23:48:46,498 | 127.0.0.1 |              0
                                  Parsing SELECT * FROM test WHERE id=1 LIMIT 1; | 23:48:46,498 | 127.0.0.1 |             74
                                                             Preparing statement | 23:48:46,499 | 127.0.0.1 |            253
                                        Executing single-partition query on test | 23:48:46,499 | 127.0.0.1 |            930
                                                    Acquiring sstable references | 23:48:46,499 | 127.0.0.1 |            943
                                                     Merging memtable tombstones | 23:48:46,499 | 127.0.0.1 |           1032
                                                     Key cache hit for sstable 3 | 23:48:46,500 | 127.0.0.1 |           1160
                                     Seeking to partition beginning in data file | 23:48:46,500 | 127.0.0.1 |           1173
                                                     Key cache hit for sstable 2 | 23:48:46,500 | 127.0.0.1 |           1889
                                     Seeking to partition beginning in data file | 23:48:46,500 | 127.0.0.1 |           1901
                                                     Key cache hit for sstable 1 | 23:48:46,501 | 127.0.0.1 |           2373
                                     Seeking to partition beginning in data file | 23:48:46,501 | 127.0.0.1 |           2384
       Skipped 0/3 non-slice-intersecting sstables, included 0 due to tombstones | 23:48:46,501 | 127.0.0.1 |           2768
                                      Merging data from memtables and 3 sstables | 23:48:46,501 | 127.0.0.1 |           2784
                                              Read 2 live and 0 tombstoned cells | 23:48:46,501 | 127.0.0.1 |           2976
                                                                Request complete | 23:48:46,501 | 127.0.0.1 |           3551
      

      We can clearly see that C* hits 3 SSTables on disk instead of just one, although it has the min/max column meta data to decide which SSTable contains the most recent data.

      Funny enough, if we add a clause on the clustering column to the select, this time C* optimizes the read path:

      cqlsh:test> SELECT * FROM test WHERE id=1 AND col > 25 LIMIT 1;
       activity                                                                  | timestamp    | source    | source_elapsed
      ---------------------------------------------------------------------------+--------------+-----------+----------------
                                                              execute_cql3_query | 23:52:31,888 | 127.0.0.1 |              0
                     Parsing SELECT * FROM test WHERE id=1 AND col > 25 LIMIT 1; | 23:52:31,888 | 127.0.0.1 |             60
                                                             Preparing statement | 23:52:31,888 | 127.0.0.1 |            277
                                        Executing single-partition query on test | 23:52:31,889 | 127.0.0.1 |            961
                                                    Acquiring sstable references | 23:52:31,889 | 127.0.0.1 |            971
                                                     Merging memtable tombstones | 23:52:31,889 | 127.0.0.1 |           1020
                                                     Key cache hit for sstable 3 | 23:52:31,889 | 127.0.0.1 |           1108
                                     Seeking to partition beginning in data file | 23:52:31,889 | 127.0.0.1 |           1117
       Skipped 2/3 non-slice-intersecting sstables, included 0 due to tombstones | 23:52:31,889 | 127.0.0.1 |           1611
                                      Merging data from memtables and 1 sstables | 23:52:31,890 | 127.0.0.1 |           1624
                                              Read 1 live and 0 tombstoned cells | 23:52:31,890 | 127.0.0.1 |           1700
                                                                Request complete | 23:52:31,890 | 127.0.0.1 |           2140
      

      Attachments

        1. 8180_001.yaml
          0.7 kB
          Stefania Alborghetti
        2. 8180_002.yaml
          0.7 kB
          Stefania Alborghetti

        Issue Links

          Activity

            People

              stefania Stefania Alborghetti
              doanduyhai DuyHai Doan
              Stefania Alborghetti
              Branimir Lambov
              Votes:
              7 Vote for this issue
              Watchers:
              21 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: