  1. Calcite
  2. CALCITE-2169

Conduct a comparative performance study of the framework


    • Type: Task
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: core
    • Labels:
    • Environment:

      Use the Calcite benchmark, and run it in the benchmark environment.


      Design and implement a study of the Calcite framework using the benchmark that is to be developed for CALCITE-2168 (Implement a General Purpose Benchmark for Calcite). Run a comparative analysis of the performance of the Calcite optimizer, of query performance with and without Calcite optimization, and of Calcite against standalone databases or other frameworks.

      Some ideas and targets for the study:

      • Planning and execution time with queries that span across multiple systems (e.g. Postgres and Cassandra, Postgres and Pig, Pig and Cassandra).
      • For TPC-DS, study the plan produced by Calcite vs. existing RDBMS optimizers (e.g. Postgres, MySQL). This could even be useful as a
        feature used in conjunction with the lattice framework: an estimate of the time savings could help decide which queries to eventually build lattices for.
      • Optimizer runtime for complex queries (we could also compare with the runtime of executing the optimized query directly):
          • Calcite-optimized query
          • Unoptimized query with the optimizer of the backend disabled
          • Unoptimized query with the optimizer of the backend enabled
      • Comparison with other federated query processing engines such as Spark SQL, PrestoDB, and maybe KSQL[1] and InfluxDB
      • Grunion, which uses Calcite to optimize Spark queries [2]
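
The measurements listed above could be collected with a small timing harness along the following lines. This is only a sketch under stated assumptions, not part of the Calcite benchmark: the function names (`time_phase`, `compare_variants`) and the placeholder workloads (`fake_plan`, `fake_execute`) are hypothetical, and in a real study each callable would be replaced by code that plans or executes a query against the system under test. Planning and execution are timed separately, and the median over several runs is reported to dampen outliers.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_phase(fn: Callable[[], object], repeats: int = 5) -> List[float]:
    """Run fn `repeats` times and return the wall-clock seconds of each run."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return samples


def compare_variants(
    variants: Dict[str, Dict[str, Callable[[], object]]],
    repeats: int = 5,
) -> Dict[str, Dict[str, float]]:
    """Return the median time in seconds of every phase of every variant."""
    report = {}
    for name, phases in variants.items():
        report[name] = {
            phase: statistics.median(time_phase(fn, repeats))
            for phase, fn in phases.items()
        }
    return report


# Placeholder workloads (assumptions): stand-ins for "plan a query" and
# "execute a query". They do no real query processing.
def fake_plan() -> None:
    sum(range(10_000))


def fake_execute() -> None:
    sum(range(100_000))


if __name__ == "__main__":
    # The three variants named in the list above, all wired to placeholders.
    variants = {
        "calcite_optimized": {"plan": fake_plan, "execute": fake_execute},
        "unoptimized_backend_optimizer_off": {"plan": fake_plan, "execute": fake_execute},
        "unoptimized_backend_optimizer_on": {"plan": fake_plan, "execute": fake_execute},
    }
    for name, medians in compare_variants(variants).items():
        print(name, medians)
```

Keeping each phase as a separate callable means the same harness can report optimizer runtime alone, execution time alone, or both, for any number of backends.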

      [1] https://github.com/confluentinc/ksql
      [2] https://www.datascience.com/blog/grunion-data-science-tools-query-optimizer-apache-spark




            • Assignee:
              ebegoli Edmon Begoli
            • Votes:
              3
            • Watchers:
              6


              • Created:

                Time Tracking

                Original Estimate - 2,016h
                Remaining Estimate - 2,016h
                Time Spent - Not Specified