Cassandra / CASSANDRA-4242

Name of parameters should be available in CqlPreparedResult

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Fix Version/s: 1.1.1
    • Component/s: Core
    • Labels:
      None

      Description

      On the client side, it would be useful to have the parameter names in CqlPreparedResult. This would allow parameters to be mapped by name instead of by index.

      struct CqlNameType {
          1: required binary key,
          2: required string type
      }
      
      struct CqlPreparedResult {
          1: required i32 itemId,
          2: required i32 count,
          3: optional list<string> variable_types,
          4: optional list<CqlNameType> name_types
      }
      
      1. 4242_2.txt
        13 kB
        Pierre Chalamet
      2. 4242_3.txt
        15 kB
        Pierre Chalamet
      3. 4242_4.txt
        17 kB
        Pierre Chalamet
      4. 4242.txt
        146 kB
        Pierre Chalamet

        Activity

        Pierre Chalamet added a comment -

        using a map instead of 2 lists

        Pierre Chalamet added a comment -

        moved from two lists to one map.

        Pierre Chalamet added a comment -

        use list instead of map since the order is important

        Jonathan Ellis added a comment -

        Patch fails to apply to latest trunk. Can you rebase? (Not necessary to include gen-java changes.)

        Jonathan Ellis added a comment -

        Sorry, I didn't realize you'd posted the core of the change in the issue description...

        I'd prefer a simple Map of column name to column type. With CQL3 we don't need all the default name/value baggage:

        diff --git a/interface/cassandra.thrift b/interface/cassandra.thrift
        index 148f277..89850d2 100644
        --- a/interface/cassandra.thrift
        +++ b/interface/cassandra.thrift
        @@ -491,13 +491,15 @@ struct CqlResult {
             1: required CqlResultType type,
             2: optional list<CqlRow> rows,
             3: optional i32 num,
        -    4: optional CqlMetadata schema
        +    4: optional CqlMetadata schema, # CQL2
        +    5: optional map<string,string> column_types # CQL3
         }
        
         struct CqlPreparedResult {
             1: required i32 itemId,
             2: required i32 count,
        -    3: optional list<string> variable_types
        +    3: optional list<string> variable_types, # CQL2
        +    4: optional map<string,string> column_types # CQL3
         }
        
        Pierre Chalamet added a comment -

        The problem is the signature of the existing call:

        CqlResult execute_prepared_cql_query(1:required i32 itemId, 2:required list<binary> values)

        If a new operation taking a map of parameters were available, why not. But with the current incarnation, I do not see how we can feed parameters from a map (unordered) to a list of values (ordered). We would need something like:

        CqlResult execute_prepared_cql_query_with_named_params(1:required i32 itemId, 2:required map<binary, binary> values)

        By the way, I can't find CqlResult.column_types in trunk. Where can I get it?

        Also, I would prefer a binary type for column names instead of string.

        Jonathan Ellis added a comment -

        "I do not see how we can feed parameters from a map (unordered) to a list of values"

        But why would we need to? I thought the goal here is introspection of what a prepared statement returns. Executing the statement works fine.

        "can't find CqlResult.column_types in trunk"

        Because that's a change I'm proposing.

        "I would prefer a binary type for column names instead of string."

        Column names are always strings in CQL3. See http://www.datastax.com/dev/blog/schema-in-cassandra-1-1 for a quick summary and CASSANDRA-2474 for the gory details.

        Pierre Chalamet added a comment - - edited

        "But why would we need to? I thought the goal here is introspection of what a prepared statement returns. Executing the statement works fine."

        I thought the sequence was:
        1/ prepare_cql_query
        2/ execute_prepared_cql_query using parameters as specified by 1/

        I do not just want to know the names of the parameters in CqlPreparedResult; I need this to bind parameters from various data sources (which expose unordered named values).
        The order of the values is then really important from the point of view of query execution.

        Am I missing something?

        "Column names are always strings in CQL3. See http://www.datastax.com/dev/blog/schema-in-cassandra-1-1 for a quick summary and CASSANDRA-2474 for the gory details."

        Sorry, I should have read the spec first.

        Pierre Chalamet added a comment -

        Patch rebased on trunk.

        Pierre Chalamet added a comment -

        Forget about my comment on hex-encoded strings. I should read what I write twice.

        Jonathan Ellis added a comment -

        "I need this to bind parameters from various data sources (which expose unordered named values)"

        Well, yes, it's definitely designed around data sources that give you ordered values instead. This fits at least JDBC, Python DBAPI, and Ruby DBI. What are you using that can't work with this paradigm?

        Pierre Chalamet added a comment -

        "Well, yes, it's definitely designed around data sources that give you ordered values instead. This fits at least JDBC, Python DBAPI, and Ruby DBI. What are you using that can't work with this paradigm?"

        There are basically 2 use cases:

        • micro-ORM style (mapping an object to and from a query)
        • feeding from an external data source (like a rowset, which usually supports ordered and named columns).

        The trunk version of cassandra-sharp (my Cassandra .NET client) already supports a micro-ORM interface (with this patch). It is still under development, but it will probably look like this:

        [Schema("TestKeyspace", Comment = "People table", Name = "People")]
        public class PeopleSchema
        {
            [Index(Name = "birthyear")]
            public int Birthyear;

            [Key(Name = "firstname")]
            public string FirstName;

            [Column(Name = "lastname")]
            public string LastName;
        }

        cluster.Create<PeopleSchema>();

        cluster.Execute("insert into People (firstname, lastname, birthyear) values (?, ?, ?)",
                        new {firstname = "pierre", lastname = "chalamet", birthyear = 1973});
        

        I do not want this framework to be user driven (i.e. the user provides the parameters and knows the nitty-gritty details of the query). Instead, I want it to be query driven for parameter feeding (i.e. the query determines what is necessary to complete the execution). This way, the user can change the query without changing the parameter order: the input is still a plain old .NET object exposing unordered properties. You can refactor the query or the .NET object as you want, and it should still work.
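
A minimal sketch of this query-driven binding, written in Python for brevity rather than in the .NET client's language. It assumes the prepared result exposes the parameter names in marker order as a list called variable_names. The PreparedResult class, the bind_by_name helper, and the type names are hypothetical stand-ins, not the Thrift-generated API, and a real client would still have to serialize each value to binary before calling execute_prepared_cql_query.

```python
from dataclasses import dataclass
from typing import Any, List, Mapping, Optional


@dataclass
class PreparedResult:
    """Hypothetical stand-in for CqlPreparedResult with two parallel lists."""
    item_id: int
    count: int
    variable_types: Optional[List[str]] = None
    variable_names: Optional[List[str]] = None


def bind_by_name(prepared: PreparedResult, params: Mapping[str, Any]) -> List[Any]:
    """Order unordered named values to match the markers of the prepared query."""
    if prepared.variable_names is None:
        raise ValueError("prepared result carries no variable names")
    missing = [name for name in prepared.variable_names if name not in params]
    if missing:
        raise KeyError("no value for parameter(s): " + ", ".join(missing))
    # Extra entries in params are ignored; only what the query asks for is bound.
    return [params[name] for name in prepared.variable_names]


# Assumed result of preparing:
#   insert into People (firstname, lastname, birthyear) values (?, ?, ?)
prepared = PreparedResult(
    item_id=1,
    count=3,
    variable_types=["text", "text", "int"],  # placeholder type names
    variable_names=["firstname", "lastname", "birthyear"],
)

# The caller supplies values in any order, like properties of a plain object.
person = {"birthyear": 1973, "lastname": "chalamet", "firstname": "pierre"}
values = bind_by_name(prepared, person)
print(values)  # ['pierre', 'chalamet', 1973]
```

If the column order in the query changes, only variable_names changes; the caller's object stays the same.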

        The other interface I'd like to set up in the micro-ORM area is something like the one in GigaSpaces (query template with a plain old object). That interface is really good for getting results. For extended queries, it is also necessary to be able to map parameters.

        The second use case is less obvious, but suppose I need to transfer data between a database and C*.
        I would select on one side and insert on the other, using something like a data reader in .NET.
        For that, I still do not want to rely on the order of the columns in the rowset. I would prefer to discover the structure and bind parameters accordingly using the data reader metadata (basically column names).
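
The transfer scenario can be sketched in the same way. The example below is an assumption-laden illustration in Python: an in-memory SQLite table plays the role of the source database, sqlite3.Row plays the role of the data reader metadata, and variable_names is a hypothetical list of the names a server might report for the insert statement. Nothing here talks to Cassandra.

```python
import sqlite3

# Hypothetical names reported for:
#   insert into People (firstname, lastname, birthyear) values (?, ?, ?)
variable_names = ["firstname", "lastname", "birthyear"]

# Source side: note the column order differs from the order of the markers.
source = sqlite3.connect(":memory:")
source.row_factory = sqlite3.Row  # rows expose column names through keys()
source.execute("create table people (birthyear integer, lastname text, firstname text)")
source.execute("insert into people values (1973, 'chalamet', 'pierre')")


def rows_as_bind_lists(cursor, names):
    """Yield one ordered value list per source row, matched by column name."""
    for row in cursor:
        available = set(row.keys())
        missing = [name for name in names if name not in available]
        if missing:
            raise KeyError("source row lacks column(s): " + ", ".join(missing))
        yield [row[name] for name in names]


cursor = source.execute("select * from people")
batches = list(rows_as_bind_lists(cursor, variable_names))
print(batches)  # [['pierre', 'chalamet', 1973]]
```

Each inner list is in marker order and could then be serialized and passed as the values argument of an execute call.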

        I do believe this is good functionality on the client side. It allows more ways to interact with C*, and safer ones, without much cost.
        Compared to the cost of the I/O, retrieving the column names in CqlPreparedResult and mapping them client side is really cheap.

        I've also read CASSANDRA-2474; quite interesting, thanks for the link. But I really would prefer to have binary column names instead of strings. That's why I did not change this in the new patch. I'm not really clear on this and need to think more about it, but as is, I have the feeling that C* is moving away from sliced query on binary column names (a strong feature in my opinion). It might make sense when considering Hadoop, Hive and co... I might be the only guy using C* alone anyway.
        Moreover, CqlMetadata exposes column names as binary - I would also prefer something symmetrical (weak argument anyway).

        Pierre Chalamet added a comment -

        Ensure compatibility with CQL3 clients.

        Sylvain Lebresne added a comment -

        I can see the point of returning the names in the CqlPreparedResult. However, despite Jonathan's suggestion, I actually preferred the original one (except for the binary->string), i.e.:

        struct CqlPreparedResult {
             1: required i32 itemId,
             2: required i32 count,
             3: optional list<string> variable_types
        +    4: optional list<string> variable_names
        }
        

        The reason is mainly that I don't see the point in breaking CQL3 clients. I agree CqlMetadata is overkill for CQL3, but imo CASSANDRA-2478 is where we should bother fixing this.

        Also, the variable_types in CqlPreparedResult is actually not a CQL2 thing, but a CQL3 thing exclusively, so deprecating it in, say, 1.1.1 when it was introduced in 1.1.0 again feels a bit harsh on the client writers for no good reason.
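
Because variable_names would be an additional optional field, a client can keep working against a server that only returns variable_types. The Python sketch below illustrates that fallback under stated assumptions: describe_variables is a hypothetical helper, the type names are placeholders, and the two lists are assumed to be parallel (same length, same order as the markers).

```python
from typing import List, Optional, Tuple


def describe_variables(
    variable_types: Optional[List[str]],
    variable_names: Optional[List[str]] = None,
) -> List[Tuple[Optional[str], str]]:
    """Pair each bind marker's name (if the server sent one) with its type."""
    types = variable_types or []
    if variable_names is None:
        # Server without the new field: positional binding only.
        return [(None, type_name) for type_name in types]
    if len(variable_names) != len(types):
        raise ValueError("variable_names and variable_types differ in length")
    return list(zip(variable_names, types))


# Server that only returns types (field 3).
positional = describe_variables(["text", "int"])

# Server that also returns names (field 4).
named = describe_variables(["text", "int"], ["firstname", "birthyear"])

print(positional)  # [(None, 'text'), (None, 'int')]
print(named)       # [('firstname', 'text'), ('birthyear', 'int')]
```

Keeping two ordered lists rather than a map preserves the marker order, which is what execute_prepared_cql_query needs for its list of values.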

        "I have the feeling that C* is moving away from sliced query on binary column names (a strong feature in my opinion)"

        It is not. The difference is that a column name in CQL3 does not necessarily map to an internal column name. So CQL3 column names are strings, and this is without any loss of generality.

        "Moreover, CqlMetadata exposes column names as binary"

        Yes, but that part is due to the difference between CQL2 and CQL3. Concretely, CQL3 will never return anything other than UTF-8 bytes (which is why Jonathan suggested replacing CqlMetadata by a map<string, string> for CQL3, but see my argument above). In any case, given that the variable names will be CQL3 only (we cannot, even if we wanted to, do it in CQL2), I don't think there is a point in returning binary.

        Jonathan Ellis added a comment -

        "I don't see the point in breaking CQL3 clients"

        Fair enough.

        Pierre Chalamet added a comment -

        Reverting to 2 lists in CqlPreparedResult
        Column names are String now

        Sylvain Lebresne added a comment -

        +1, committed, thanks.


          People

          • Assignee:
            Pierre Chalamet
            Reporter:
            Pierre Chalamet
            Reviewer:
            Sylvain Lebresne
          • Votes:
            0
            Watchers:
            5
