Cassandra / CASSANDRA-4536

Ability for CQL3 to list partition keys

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Fix Version/s: 2.0.1
    • Component/s: API
    • Labels:

      Description

      It can be useful to know the set of in-use partition keys (storage engine row keys). One example given to me was an application whose data was modeled as a few tens of thousands of wide rows, and which needed to present these rows to the user sorted on information in the partition key. The partition count is small enough to do the sort client-side in memory, which is what the app did with the Thrift API: a range slice with an empty columns list.

      This was a problem when migrating to CQL3. SELECT mykey FROM mytable returns the key once for every logical row, which makes the result set too large for this to be a reasonable approach, even with paging.

      One way to add support would be to allow DISTINCT in the special case of SELECT DISTINCT mykey FROM mytable.
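      To make the difference concrete, here is a minimal in-memory sketch of the two query shapes. It is only a model of the semantics described above, not Cassandra code: the class DistinctKeysSketch, the LogicalRow type, and both method names are hypothetical and exist purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical model: a table is a list of logical CQL3 rows,
// each identified by a partition key plus a clustering value.
final class DistinctKeysSketch {

    static final class LogicalRow {
        final String partitionKey;
        final String clustering;

        LogicalRow(String partitionKey, String clustering) {
            this.partitionKey = partitionKey;
            this.clustering = clustering;
        }
    }

    // Models SELECT mykey FROM mytable: one result per logical row,
    // so a wide partition contributes its key many times.
    static List<String> selectKey(List<LogicalRow> table) {
        List<String> out = new ArrayList<>();
        for (LogicalRow row : table) {
            out.add(row.partitionKey);
        }
        return out;
    }

    // Models the proposed SELECT DISTINCT mykey FROM mytable:
    // one result per partition, in first-seen order.
    static List<String> selectDistinctKey(List<LogicalRow> table) {
        return new ArrayList<>(new LinkedHashSet<>(selectKey(table)));
    }

    public static void main(String[] args) {
        List<LogicalRow> table = Arrays.asList(
                new LogicalRow("sensor-a", "t1"),
                new LogicalRow("sensor-a", "t2"),
                new LogicalRow("sensor-b", "t1"));
        System.out.println("plain select:    " + selectKey(table));
        System.out.println("distinct select: " + selectDistinctKey(table));
    }
}
```

      With tens of thousands of partitions that each hold many logical rows, only the second shape yields a result set small enough to sort client-side.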

      1. 4536.txt (13 kB, Aleksey Yeschenko)
      2. cassandra-4536_1.1.0.patch (8 kB, dan jatnieks)
      3. cassandra-4536_1.2.2.patch (107 kB, dan jatnieks)
      4. cassandra-4536_1.2.5.patch (8 kB, dan jatnieks)

        Activity

        Sylvain Lebresne added a comment -

        Let me note that in CQL3 a row that has no live columns doesn't exist, so we can't really implement this with a range slice having an empty columns list. Instead we should do a range slice with a full-row slice predicate with a count of 1, to make sure we do have a live column before including the partition key. The downside is that this won't 'just read the index file'.

        In the longer run, it should be possible to optimize that further, if we consider it worth it, by adding one bit of information per key in the sstable index saying 'is there at least one live column for that key in that sstable'. We could even add that bit per key without increasing the on-disk index size by using the first bit of the key position: since we store it as a signed long, the first bit is unused.
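
        The bit-per-key idea in this comment can be sketched as follows. This is a hedged illustration of packing a flag into the sign bit of a non-negative long, not the sstable index format: the class LiveBitPacking, its method names, and the choice that a set bit means 'has a live column' are all assumptions made for the example.

```java
// Hypothetical sketch: store a "has at least one live column" flag in the
// otherwise unused sign bit of a non-negative 64-bit key position.
final class LiveBitPacking {

    // Top (sign) bit of a long; equal to Long.MIN_VALUE.
    private static final long LIVE_FLAG = 1L << 63;

    // Lower 63 bits; equal to Long.MAX_VALUE.
    private static final long POSITION_MASK = ~LIVE_FLAG;

    // Combines a position and the flag into a single long.
    static long pack(long position, boolean hasLiveColumn) {
        if (position < 0) {
            throw new IllegalArgumentException("position must be non-negative");
        }
        return hasLiveColumn ? (position | LIVE_FLAG) : position;
    }

    // Recovers the original position by clearing the flag bit.
    static long position(long packed) {
        return packed & POSITION_MASK;
    }

    // Reads the flag bit.
    static boolean hasLiveColumn(long packed) {
        return (packed & LIVE_FLAG) != 0;
    }

    public static void main(String[] args) {
        long packed = pack(123456789L, true);
        System.out.println(position(packed) + " live=" + hasLiveColumn(packed));
    }
}
```

        Because the flag occupies a bit the position never uses, the packed value is still 8 bytes, which is why the on-disk index would not grow.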

        dan jatnieks added a comment -

        Attaching patch file (cassandra-4536_1.1.0.patch) that I worked on based on version 1.1.

        I'll post a 1.2 version as soon as I get it completed.

        dan jatnieks added a comment -

        Attaching patch file (cassandra-4536_1.2.2.patch).

        dan jatnieks added a comment -

        Uploaded patch file (cassandra-4536_1.2.5.patch) based on the 1.2 branch.

        Sylvain Lebresne added a comment -

        The main missing part, I think, is handling composite partition keys: we should allow DISTINCT only if it's on all the (CQL3) partition key columns, but there may be more than one.

        Also, in SelectStatement.process(), it seems we only allow CQL3 tables. Why not just move the isDistinct block at the beginning to include all cases?

        Other minor remarks/nits:

        • In the parser, I'd rather just have K_DISTINCT optional in front of normal selectClause and do validation later in SelectStatement, rather than having a special selectDistinctClause (partly to keep the parser simpler, but also because we can return better error messages that way). We'd need to support distinct on multiple columns for composite partition keys anyway.
        • In makeFilter(): there's a ColumnSlice.ALL_COLUMNS_ARRAY to shorten that further. Also, we don't care about reversed, so since reversed slices are slightly slower, let's never reverse.
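
        The validation rule requested in this review (DISTINCT is allowed only when the selection covers all partition key columns, and nothing else) can be sketched as below. This is an assumption-laden illustration, not the code that was committed to SelectStatement: the class DistinctValidation, the method validateDistinct, and the error message wording are hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a post-parse check for SELECT DISTINCT.
final class DistinctValidation {

    // Throws if the selected columns are not exactly the partition key columns.
    // Column order in the selection is not checked here.
    static void validateDistinct(List<String> selected, List<String> partitionKeyColumns) {
        Set<String> selectedSet = new HashSet<>(selected);
        Set<String> keySet = new HashSet<>(partitionKeyColumns);

        // Reject any selected column that is not part of the partition key.
        for (String column : selected) {
            if (!keySet.contains(column)) {
                throw new IllegalArgumentException(
                        "SELECT DISTINCT may only request partition key columns (not " + column + ")");
            }
        }
        // With a composite partition key, every component must be selected.
        for (String column : partitionKeyColumns) {
            if (!selectedSet.contains(column)) {
                throw new IllegalArgumentException(
                        "SELECT DISTINCT must request all partition key columns (missing " + column + ")");
            }
        }
    }
}
```

        Doing this check after parsing, rather than in a dedicated grammar rule, is what allows error messages that name the offending or missing column.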
        Aleksey Yeschenko added a comment -

        dtest at https://github.com/riptano/cassandra-dtest/commit/58a8f8e968c5e905cd8f1c72f570f51658c58223

        Sylvain Lebresne added a comment -

        lgtm, +1

        (I've created CASSANDRA-5912 as a follow up if we want to get fancy and optimize that further. Probably not a priority though).

        Aleksey Yeschenko added a comment -

        Committed, thanks.


          People

          • Assignee:
            Aleksey Yeschenko
            Reporter:
            Jonathan Ellis
            Reviewer:
            Sylvain Lebresne
          • Votes:
            7
            Watchers:
            12

            Dates

            • Created:
              Updated:
              Resolved:
