Cassandra / CASSANDRA-4210

Support for variadic parameters list for "in clause" in prepared cql query

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Fix Version/s: 2.0.1
    • Component/s: Core
    • Labels:
      None
    • Environment:

      prepared cql queries

      Description

      This query

      select * from Town where key in (?)
      

      only allows one parameter for '?'.

      This means querying for 'Paris' and 'London' can't be executed in one step with this prepared statement.

      Current workarounds are:

      • either execute the prepared query 2 times with 'Paris' then 'London'
      • or prepare a new query select * from Town where key in (?, ?) and bind the 2 parameters
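The second workaround can be generalized by generating one marker per value before preparing. Here is a minimal sketch in Python, assuming a hypothetical helper `in_query` (not part of any driver) that only builds the CQL text. A real client would still have to prepare the resulting string, and would probably want to cache one prepared statement per distinct value count:

```python
def in_query(table: str, column: str, n_values: int) -> str:
    """Build a CQL SELECT with one '?' marker per value in the IN clause.

    Hypothetical helper for illustration only; it produces text and
    does not talk to Cassandra.
    """
    if n_values < 1:
        raise ValueError("IN clause needs at least one value")
    markers = ", ".join("?" for _ in range(n_values))
    return f"select * from {table} where {column} in ({markers})"


towns = ["Paris", "London"]
query = in_query("Town", "key", len(towns))
print(query)
```

The drawback is the one the ticket describes: every distinct number of values needs its own prepared statement.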

      Support for a variadic parameter list in the IN clause could improve performance:

      • single hop to get the data
      • parallel fetching on the server side
      1. 4210.txt
        47 kB
        Sylvain Lebresne

          Activity

          Sylvain Lebresne added a comment -

          Committed, thanks

          Aleksey Yeschenko added a comment -

          The code LGTM. Confirmed to work for UPDATE/SELECT, and I don't see any obvious regressions (neither do dtests). +1

          Sylvain Lebresne added a comment -

          Attaching patch for this.

          Sylvain Lebresne added a comment -

          "just tells that the parameter is of type 'key', not a set of type 'key'."

          Yes, because that's as if you were writing

          select * from Town where key in (2)
          

          but with a marker.

          Which means that if we do support that, it'll probably have to use the syntax:

          select * from Town where key in ?
          

          for compatibility reasons.
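The difference between the two forms can be illustrated by the shape of the bound values. This is only a sketch in Python, with no driver involved: `count_markers` is a hypothetical helper, and how the list is actually encoded on the wire is up to the patch and the drivers, which this does not show.

```python
def count_markers(cql: str) -> int:
    """Count '?' bind markers in a CQL string (naive: ignores string literals)."""
    return cql.count("?")


towns = ["Paris", "London"]

# Existing syntax: each marker inside the parentheses takes exactly one value.
fixed = "select * from Town where key in (?, ?)"
fixed_params = list(towns)      # two markers, two scalar values

# Proposed syntax: a single marker stands for the whole list of values.
variadic = "select * from Town where key in ?"
variadic_params = [towns]       # one marker, one list-valued parameter

assert count_markers(fixed) == len(fixed_params)
assert count_markers(variadic) == len(variadic_params)
```

With the second form the same prepared statement can serve any number of keys, which is what the ticket asks for.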

          But I'd like to note that "multiget" queries are not necessarily more efficient, because the server doesn't do anything smart about them except parallelize the individual queries (that's really all it does). For that reason, doing the parallelization client side has the benefit that you can start processing answers as they come, instead of waiting for the full result set. So even though you do "waste" a little more network bandwidth between the client and server, I suspect that in a lot of use cases you may actually get better throughput by parallelizing client side.
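Client-side parallelization along these lines could look roughly like the Python sketch below. `fetch_town` is a hypothetical stand-in for executing the single-key prepared query through whatever driver is in use; here it just returns a fake row so the example runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_town(key: str) -> dict:
    """Hypothetical stand-in for executing the single-key prepared query."""
    return {"key": key}


def fetch_all(keys):
    """Issue one query per key in parallel and yield rows as each completes."""
    with ThreadPoolExecutor(max_workers=max(1, len(keys))) as pool:
        futures = [pool.submit(fetch_town, k) for k in keys]
        for future in as_completed(futures):
            yield future.result()


for row in fetch_all(["Paris", "London"]):
    print(row)  # order depends on which query finishes first
```

Because rows are yielded as each future completes, the caller can start processing the first answer without waiting for the slowest key.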

          Anyway, that digression aside, I have no problem adding variadic IN (modulo the minor syntax detail above), if only because when IN is used on the clustering key (versus the partition key), it does always improve performance.

          Pierre Chalamet added a comment -

          This is still a problem when trying to bind an IN parameter for a prepared statement, even in 1.2.4. From what I've seen, the column spec returned after preparing

          select * from Town where key in (?)
          

          just tells that the parameter is of type 'key', not a set of type 'key'.

          It would be really nice for binary protocol drivers to know that they can bind a set of values to such a parameter (and I'm pretty sure this information is known when the statement is prepared).


            People

            • Assignee:
              Sylvain Lebresne
              Reporter:
              Pierre Chalamet
              Reviewer:
              Aleksey Yeschenko
            • Votes:
              5
              Watchers:
              13
