Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.7.0
    • Component/s: None
    • Labels: None

      Description

      I've started work on an adapter for Apache Cassandra.

      There's still a fair bit of work to do, but you can successfully issue a fairly broad class of queries with filtering, sorting, and projections pushed down to Cassandra in many cases.

      Progress can be tracked on GitHub. Below is a brief list of things that still need to be done. I'm hoping this can be useful to others, so it would be good to get a sense of what would be considered complete for a future release.

      To do

      • New tests for test suite (and update calcite-test-dataset to support Cassandra)
      • Allow for partial application of filter predicates (since Cassandra's query language is so limited, this will avoid the case where only trivial predicates can be pushed down)
      • Allow for partial sorting (for the same reason as above)
      • Proper quoting of identifiers
      • Fix projections to avoid projecting unnecessary columns in some circumstances
      • Proper cost modelling
      • Exploit native aggregation
      • Documentation
      • Correct literal formatting (e.g. dates, timestamps, etc.)
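
      As a rough illustration of the "partial application of filter predicates" item above, here is a minimal sketch of the idea: split an AND-ed filter into the conjuncts the store can evaluate and a residual that is applied afterwards. This is not the adapter's code. The `Predicate` type, the function names, and the pushdown rule (only equality on every partition key column is pushed) are simplifying assumptions made for the example; Cassandra's real restrictions are broader.

```python
import operator
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Predicate:
    """One conjunct of a WHERE clause (hypothetical type for this sketch)."""
    column: str
    op: str        # one of "=", "<", ">"
    value: object


# Operators the residual filter knows how to evaluate locally.
_OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def split_predicates(
    conjuncts: List[Predicate], partition_keys: Set[str]
) -> Tuple[List[Predicate], List[Predicate]]:
    """Split AND-ed predicates into (pushed, residual).

    Assumed rule: equality predicates on partition key columns are pushed,
    but only when every partition key column is covered. Otherwise nothing
    is pushed and the whole filter stays in the residual.
    """
    candidates = [
        p for p in conjuncts if p.op == "=" and p.column in partition_keys
    ]
    if {p.column for p in candidates} != partition_keys:
        return [], list(conjuncts)
    residual = [p for p in conjuncts if p not in candidates]
    return candidates, residual


def apply_residual(
    rows: List[Dict[str, object]], residual: List[Predicate]
) -> List[Dict[str, object]]:
    """Filter rows returned by the store with the predicates not pushed down."""
    return [
        row
        for row in rows
        if all(_OPS[p.op](row[p.column], p.value) for p in residual)
    ]
```

      With a filter such as `user_id = 42 AND age > 30` and `user_id` as the only partition key, the first conjunct would be pushed and the second applied to the returned rows. Partial sorting could follow the same pattern: push the sort keys the store supports and finish the ordering afterwards.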

          Activity

          julianhyde Julian Hyde added a comment -

          Michael Mior, Fantastic! I have been hoping that someone would do a Cassandra adapter.

          What can I do to help?

          I think that adding to calcite-test-dataset is key; it will allow us to start running regular tests, and will allow people to experiment with the adapter. Vladimir Sitnikov may be able to help with that.

          julianhyde Julian Hyde added a comment -

          Ah, I see you've already made a couple of PRs to Vladimir Sitnikov. https://github.com/vlsi/calcite-test-dataset/pull/3

          michaelmior Michael Mior added a comment -

          Julian Hyde Yes, getting those other modules up to date just happens to make it easier to get the Cassandra Puppet module I was using to work. I don't have tests ready yet, but hopefully I'll get the chance to do that in the next week or two.

          michaelmior Michael Mior added a comment -

          In terms of help, I wouldn't mind some code review of what I have so far to see if the approach I'm taking makes sense.

          In the meantime, I think I'll focus on getting a good set of tests going, since I agree it would make it easier for others to start playing with it.

          julianhyde Julian Hyde added a comment -

          Makes sense. If you have a data set and a few tests, you should submit a pull request. I think this can come into the main line fairly soon. It doesn't have to be perfect. People can start using it and maybe even improving it.

          michaelmior Michael Mior added a comment -

          Pull request created. Eager to get some feedback. Hopefully others will find this useful.

          julianhyde Julian Hyde added a comment -

          I'm on vacation until Monday but I'll take a look as soon as I can. I'll ask others to do the same.

          michaelmior Michael Mior added a comment -

          Thanks Julian Hyde!

          julianhyde Julian Hyde added a comment -

          I'm reviewing now. It's looking very, very good. I've been able to load the VM from your modified calcite-test-dataset and run the integration test. I'm running the full suite now – I think I see some javadoc errors, but I'll follow up when the suite finishes.

          Before we commit to Apache master, you'll need to get your changes to calcite-test-dataset accepted. Can you create a pull request for https://github.com/michaelmior/calcite-test-dataset and work with Vladimir Sitnikov? I haven't reviewed calcite-test-dataset thoroughly, but it would be useful if the README.md had an 'Accessing Cassandra in the VM' section similar to https://github.com/vlsi/calcite-test-dataset#accessing-mysql-in-the-vm.

          michaelmior Michael Mior added a comment -

          Will do. I realized I haven't made sure that the Splunk provisioning is still working so I'd like to do that first to avoid breaking anything. I'll update once I have a PR ready for the test VM.

          michaelmior Michael Mior added a comment -

          Pull request here.

          julianhyde Julian Hyde added a comment -

          I managed to get the integration tests to run by rebasing to latest master, and by fixing a couple of javadoc errors; see https://github.com/julianhyde/calcite/tree/1080-cassandra, and pull those fixes into your branch. mvn javadoc:javadoc needs to run without any errors appearing in the output (warnings are OK).

          There are errors when running the failsafe-test-postgresql integration test; try running JdbcTest with -Dcalcite.test.db=postgresql and you will get about 48 failures. I think it's a problem with postgres in the VM.

          michaelmior Michael Mior added a comment -

          Pulled in your change. Thanks for the note about checking for correct Javadocs. As for the tests, I just happened to be checking the same thing and I saw the same errors. However, I see the same failures when I use the current master of calcite-test-dataset as well. I'll switch back to the Calcite master branch and see if the errors are caused by any of the changes I made to Calcite.

          michaelmior Michael Mior added a comment -

          As an aside, what's the full command line to just run JdbcTest? I'm not too familiar with Maven so I've just been running the whole test suite.

          julianhyde Julian Hyde added a comment -

          Run cd core; mvn -Dtest=JdbcTest test. You can add -Pit etc. if you like.

          michaelmior Michael Mior added a comment -

          Great, thanks

          julianhyde Julian Hyde added a comment -

          Now that the changes to calcite-test-dataset are in, we can proceed with this PR. Can you please squash your commits into a single commit, and edit site/_docs/adapter.md if you haven't already? Then I'll run a final test and I think this can go in.

          michaelmior Michael Mior added a comment -

          Edited the adapter documentation and squashed the commits. PR is updated on GitHub.

          julianhyde Julian Hyde added a comment -

          Fixed in http://git-wip-us.apache.org/repos/asf/calcite/commit/91887366. Thank you for this excellent contribution, Michael Mior!

          michaelmior Michael Mior added a comment -

          Thanks for the quick merge Julian Hyde! At some point in the future I'll probably open a few new issues to track some of the TODOs I mentioned here.

          julianhyde Julian Hyde added a comment -

          Resolved in release 1.7.0 (2016-03-22).


            People

            • Assignee: Julian Hyde (julianhyde)
            • Reporter: Michael Mior (michaelmior)
            • Votes: 0
            • Watchers: 2
