Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Fix Version/s: 1.2.2
    • Component/s: API
    • Labels:
      None

      Description

      Currently, CQL 3.0 doesn't allow creating an index on a dynamic CF (with COMPACT STORAGE). However, the goal of this ticket is not to support the composite case (CASSANDRA-3680 will tackle that).

      I think the changes needed to support this are only on the CQL side and cover two areas:

      • Finding a syntax for it
      • Currently, the CQL 3 code considers that a CF with any column_metadata defined is a non-compact CF. Basically, the problem is that we currently use column_metadata both for defining a name for a column in the static case and for storing indexing information. Ideally, we would separate those two kinds of information, i.e. we could add a new map valueAliases (ByteBuffer -> AbstractType) to CFMetadata (only used by static CFs) and keep column_metadata for indexing purposes only. However, that may be problematic for backward compatibility (with thrift in particular), so instead we can probably just add a new boolean isStaticColumnName to ColumnDefinition.
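
The second option above (a boolean on ColumnDefinition) can be sketched as follows. This is only an illustration under stated assumptions: ColumnDefinitionSketch and CfMetadataSketch are hypothetical stand-ins, not Cassandra's actual classes, and the two classification methods are invented here to contrast the current rule with the proposed one.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for ColumnDefinition; not Cassandra's real class.
final class ColumnDefinitionSketch {
    final ByteBuffer name;
    final boolean indexed;            // indexing information carried by the entry
    final boolean isStaticColumnName; // proposed flag from the ticket

    ColumnDefinitionSketch(String name, boolean indexed, boolean isStaticColumnName) {
        this.name = ByteBuffer.wrap(name.getBytes(StandardCharsets.UTF_8));
        this.indexed = indexed;
        this.isStaticColumnName = isStaticColumnName;
    }
}

// Hypothetical stand-in for the CF metadata holding column_metadata.
final class CfMetadataSketch {
    final List<ColumnDefinitionSketch> columnMetadata = new ArrayList<>();

    // Rule described in the ticket: any column_metadata entry makes the CF non-compact.
    boolean isNonCompactUnderCurrentRule() {
        return !columnMetadata.isEmpty();
    }

    // Assumed rule with the flag: only entries that name a static column count,
    // so an index-only entry no longer changes how the CF is classified.
    boolean isNonCompactWithFlag() {
        for (ColumnDefinitionSketch def : columnMetadata) {
            if (def.isStaticColumnName) {
                return true;
            }
        }
        return false;
    }
}
```

With a single index-only entry in columnMetadata, the first method reports the CF as non-compact while the second does not, which is the distinction the flag is meant to preserve.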

        Activity

        Sylvain Lebresne added a comment -

        For the syntax, let's consider a typical dynamic CF in CQL 3.0:

        CREATE TABLE timeline (
          key int,
          time timestamp,
          event text,
          PRIMARY KEY (key, time)
        ) WITH COMPACT STORAGE
        

        then a syntax to declare an index could look like:

        CREATE INDEX index_name ON timeline WHERE time = 0;
        

        Alternatively, we could have it be:

        CREATE INDEX index_name ON timeline(0);
        

        but I feel that will be less intuitive.

        There are obviously many possible variations. It may be worth keeping CASSANDRA-3680 in mind for this too.

        Jonathan Ellis added a comment -

        I'm confused by this example. First, because in this case time is a "simple" column name so it's part of the row's "column index." I don't see what additional indexes would buy us. While I could see creating indexes on subordinate parts of a composite column name, I imagine that's what CASSANDRA-3680 has in mind. So in my mind the "index wide rows" part that's distinct from "index composite columns" is, "index the values part of wide row columns," i.e., event in this example.

        The other reason I'm confused is because the general case is people want to index everything in a column; while partial indexes (http://www.postgresql.org/docs/9.1/static/indexes-partial.html) can certainly be useful, it's more of an "advanced" feature (MySQL doesn't support it at all).

        Sylvain Lebresne added a comment -

        I'll refer to the following comment on CASSANDRA-3761:
        https://issues.apache.org/jira/browse/CASSANDRA-3761?focusedCommentId=13191984&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13191984

        I fully agree my example is not particularly clear in that you can wonder why you would do that indexing. I really only picked it to illustrate the proposed syntax.

        So in the comment above, Thorsten gives two examples of using a secondary index (the ones we have right now) for a wide row. As he himself admits, it abuses the column name a bit: it basically creates a column with a fixed name in a wide row just for the purpose of re-using the secondary index mechanism to find rows based on some criteria. Of course, we could decide to disallow this on purpose, but I don't see a very good reason to do that, since it's not a lot of work, at least someone cares about it, and both thrift and CQL 2.0 allow it.

        Jonathan Ellis added a comment -

        I see. That makes sense, although I think it may be confusing that the user would need to specify values for exactly each part of the PK declaration.

        Sylvain Lebresne added a comment -

        "I think it may be confusing that the user would need to specify values for exactly each part of the PK declaration"

        I suppose that's more a problem for CASSANDRA-3680. But actually it may be worth dealing with this at the same time we deal with CASSANDRA-3680. I initially wanted to separate the two issues because for non-composite wide rows we wouldn't really need to change anything in secondary indexes to support this. But now I'm a bit afraid that we'll do something here that is inconsistent with CASSANDRA-3680.

        Sylvain Lebresne added a comment -

        I'm going to close this one because I think CASSANDRA-5125 pretty much gives us what we want here for CQL3.


          People

          • Assignee:
            Sylvain Lebresne
            Reporter:
            Sylvain Lebresne
          • Votes:
            3
            Watchers:
            8
