Affects Version/s: None
Fix Version/s: None
Use the Calcite benchmark and run it in the benchmark environment.
Design and implement a study of the Calcite framework using the benchmark to be developed under CALCITE-2168 (Implement a General Purpose Benchmark for Calcite). The study should run a comparative analysis of the performance of the Calcite optimizer itself, of queries executed with and without Calcite optimization, and of both against standalone databases and other frameworks.
Some ideas and targets for the study:
- Planning and execution time for queries that span multiple systems (e.g. Postgres and Cassandra, Postgres and Pig, Pig and Cassandra).
- For TPC-DS, study the plans produced by Calcite vs. existing RDBMS optimizers (e.g. Postgres, MySQL). This would be interesting even as a feature to use in conjunction with the lattice framework: an estimate of the time savings could help decide which queries to eventually build lattices for.
- Optimizer runtime for complex queries (we could also compare with the runtime of executing the optimized query directly)
  - Calcite-optimized query
  - Unoptimized query with the optimizer of the backend disabled
  - Unoptimized query with the optimizer of the backend enabled
- Comparison with other federated query processing engines such as Spark SQL, PrestoDB, and maybe KSQL and InfluxDB
- Use Calcite to optimize Spark queries
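
As a rough illustration of how the planning-time and execution-time measurements above could be collected per configuration, here is a minimal sketch. It is not part of Calcite or of the CALCITE-2168 benchmark: the names `time_phase` and `compare`, the configuration labels, and the placeholder workloads are all hypothetical, and a real study would replace the placeholder callables with calls into the planner and the backend under test.

```python
import statistics
import time
from typing import Callable, Dict, Tuple

Phase = Callable[[], object]


def time_phase(fn: Phase, repeats: int = 5, warmup: int = 1) -> float:
    """Return the median wall-clock time in seconds of fn() over `repeats` runs.

    `warmup` extra runs are executed first and discarded, so one-off costs
    such as cache population do not skew the samples.
    """
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare(
    configs: Dict[str, Tuple[Phase, Phase]], repeats: int = 5
) -> Dict[str, Dict[str, float]]:
    """Time the planning and execution phases of each configuration separately.

    `configs` maps a label to a (plan_fn, execute_fn) pair of callables.
    """
    results = {}
    for label, (plan_fn, execute_fn) in configs.items():
        results[label] = {
            "plan_s": time_phase(plan_fn, repeats),
            "exec_s": time_phase(execute_fn, repeats),
        }
    return results


if __name__ == "__main__":
    # Placeholder workloads (assumptions): stand-ins for real planner and
    # backend calls, used only so that the sketch runs on its own.
    configs = {
        "calcite-optimized": (lambda: sum(range(20_000)), lambda: sum(range(50_000))),
        "backend-optimizer-off": (lambda: None, lambda: sum(range(200_000))),
        "backend-optimizer-on": (lambda: None, lambda: sum(range(80_000))),
    }
    for label, timing in compare(configs).items():
        print(f"{label}: plan={timing['plan_s']:.6f}s exec={timing['exec_s']:.6f}s")
```

Reporting the median of several runs after a warm-up is one simple way to reduce noise; the actual benchmark may well choose a different methodology.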