Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-1235

Push down LIMIT in unfiltered Cassandra queries

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.0
    • Fix Version/s: 1.8.0
    • Component/s: cassandra
    • Labels:
      None

      Activity

      Hide
      michaelmior Michael Mior added a comment -

      PR on GitHub. If someone could take a look before I merge, that would be great!

      Show
      michaelmior Michael Mior added a comment - PR on GitHub . If someone could take a look before I merge, that would be great!
      Hide
      julianhyde Julian Hyde added a comment -

      I'll review.

      Show
      julianhyde Julian Hyde added a comment - I'll review.
      Hide
      julianhyde Julian Hyde added a comment -

      It's a bit surprising to see a rule depend on EnumerableLimit, because it is a physical operator. I know there is no logical "pure" limit, only Sort, which may contain a limit. I suspect you could still find the right plan if you operated on a logical sort, pushing down the limit only if the sort key is empty.

      I saw that you have a predicate that always returns true. You can use Predicates.<CassandraSort>alwaysTrue().

      Show
      julianhyde Julian Hyde added a comment - It's a bit surprising to see a rule depend on EnumerableLimit , because it is a physical operator. I know there is no logical "pure" limit, only Sort, which may contain a limit. I suspect you could still find the right plan if you operated on a logical sort, pushing down the limit only if the sort key is empty. I saw that you have a predicate that always returns true. You can use Predicates.<CassandraSort>alwaysTrue() .
      Hide
      julianhyde Julian Hyde added a comment -

      I think you could easily support LIMIT+OFFSET and empty sort key, e.g. select * from "users" limit 3 offset 2, pushing "limit 5" down to cassandra if cassandra doesn't support offset. It's worth adding that test case either way.

      Similarly, I don't know whether cassandra supports sort+LIMIT (i.e. sort with LIMIT and non-empty sort key) but you should add the test case whether or not you push sort+limit down, just to make sure that nothing bad happens.

      Show
      julianhyde Julian Hyde added a comment - I think you could easily support LIMIT+OFFSET and empty sort key, e.g. select * from "users" limit 3 offset 2 , pushing "limit 5" down to cassandra if cassandra doesn't support offset. It's worth adding that test case either way. Similarly, I don't know whether cassandra supports sort+LIMIT (i.e. sort with LIMIT and non-empty sort key) but you should add the test case whether or not you push sort+limit down, just to make sure that nothing bad happens.
      Hide
      michaelmior Michael Mior added a comment -

      Thanks for the review! I agree that it makes sense to still push limit+offset down. Cassandra does support sort+limit which is the main reason that limit is currently handled by CassandraSort. I'll take a closer look at how I might be able to rewrite the rules to only deal with Sort. I didn't realize that the logical sort could be used for limit with an empty key.

      Show
      michaelmior Michael Mior added a comment - Thanks for the review! I agree that it makes sense to still push limit+offset down. Cassandra does support sort+limit which is the main reason that limit is currently handled by CassandraSort . I'll take a closer look at how I might be able to rewrite the rules to only deal with Sort . I didn't realize that the logical sort could be used for limit with an empty key.
      Hide
      michaelmior Michael Mior added a comment -

      Unfortunately I don't see a good way to get rid of rules involving EnumerableLimit. The problem is that to push down sort + limit there needs to be an instance of CassandraFilter. However, at the point where this rule has been applied, it seems like EnumerableFilterRule has already been applied. Also, I'm going to create a separate JIRA to deal with pushing down offset.

      Show
      michaelmior Michael Mior added a comment - Unfortunately I don't see a good way to get rid of rules involving EnumerableLimit . The problem is that to push down sort + limit there needs to be an instance of CassandraFilter . However, at the point where this rule has been applied, it seems like EnumerableFilterRule has already been applied. Also, I'm going to create a separate JIRA to deal with pushing down offset.
      Hide
      michaelmior Michael Mior added a comment -

      To clarify, in the currently committed version of the adapter, pushing down sort + limit does not work because CassandraSortRule only ever seems to see a Sort without a fetch.

      Show
      michaelmior Michael Mior added a comment - To clarify, in the currently committed version of the adapter, pushing down sort + limit does not work because CassandraSortRule only ever seems to see a Sort without a fetch.
      Hide
      michaelmior Michael Mior added a comment -

      Okay, take two of the PR. It removes the nasty combine rule for sorts but still depends on EnumerableLimit. There's now a separate CassandraLimit operator with a rule to create it from CassandraLimit. I also added some extra test cases and things seem to be working. If you have any suggestions on how to avoid touching EnumerableLimit I'd love to hear it but I've done a lot of playing around with this and I think it might be the best approach.

      Show
      michaelmior Michael Mior added a comment - Okay, take two of the PR. It removes the nasty combine rule for sorts but still depends on EnumerableLimit . There's now a separate CassandraLimit operator with a rule to create it from CassandraLimit . I also added some extra test cases and things seem to be working. If you have any suggestions on how to avoid touching EnumerableLimit I'd love to hear it but I've done a lot of playing around with this and I think it might be the best approach.
      Hide
      julianhyde Julian Hyde added a comment -

      Michael, I don't fully understand the issue, but I trust you. Feel free to commit when you're ready.

      If there are changes that we can make to the RelNode taxonomy long term, I'm open to suggestions.

      Show
      julianhyde Julian Hyde added a comment - Michael, I don't fully understand the issue, but I trust you. Feel free to commit when you're ready. If there are changes that we can make to the RelNode taxonomy long term, I'm open to suggestions.
      Hide
      michaelmior Michael Mior added a comment -

      Sounds good. I don't think this is ideal, but it's well-tested and it works and I think not too ugly so I'll commit for now and think about how this can be improved in the longer term. Thanks!

      Show
      michaelmior Michael Mior added a comment - Sounds good. I don't think this is ideal, but it's well-tested and it works and I think not too ugly so I'll commit for now and think about how this can be improved in the longer term. Thanks!
      Hide
      michaelmior Michael Mior added a comment -

      Fixed in 4c89dce.

      Show
      michaelmior Michael Mior added a comment - Fixed in 4c89dce .
      Hide
      julianhyde Julian Hyde added a comment -

      Resolved in release 1.8.0 (2016-06-13).

      Show
      julianhyde Julian Hyde added a comment - Resolved in release 1.8.0 (2016-06-13).

        People

        • Assignee:
          michaelmior Michael Mior
          Reporter:
          michaelmior Michael Mior
        • Votes:
          0 Vote for this issue
          Watchers:
          2 Start watching this issue

          Dates

          • Created:
            Updated:
            Resolved:

            Development