Description
We currently use TPCDS v1.4 (https://github.com/apache/spark/commits/master/sql/core/src/test/resources/tpcds), although the latest version is v2.7 (http://www.tpc.org/tpc_documents_current_versions/current_specifications.asp). I found that some queries differ between v1.4 and v2.7 (e.g., q4, q5, q6, ...) and that some queries appear to be new in v2.7 (e.g., q10a, ...). I think it makes sense to update the queries for a more accurate evaluation.
Raw generated queries from TPCDS v2.7 query templates:
https://github.com/maropu/spark_tpcds_v2.7.0/tree/master/generated
TPCDS v2.7 queries modified to pass TPCDSQuerySuite (e.g., replacing unsupported syntax, such as "+ 14 days" -> "interval 14 days"):
https://github.com/apache/spark/compare/master...maropu:TPCDSV2_7
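The kind of syntax replacement mentioned above could be sketched as a small text rewrite. This is only an illustration in Python, not the procedure used to produce the modified queries: the function name `rewrite_day_arithmetic` and the sample predicate are hypothetical, and it assumes the only pattern to fix is a signed integer followed by the word "days". The arithmetic operator is kept, so "+ 14 days" becomes "+ interval 14 days".

```python
import re

# Matches a sign, an integer, and the word "days", e.g. "+ 14 days".
# An expression that already reads "+ interval 14 days" does not match,
# because "interval" sits between the sign and the number.
_DAY_ARITHMETIC = re.compile(r"([+-])\s*(\d+)\s+days\b", re.IGNORECASE)


def rewrite_day_arithmetic(sql: str) -> str:
    """Rewrite '+ N days' / '- N days' as '+ interval N days' / '- interval N days'."""
    return _DAY_ARITHMETIC.sub(r"\1 interval \2 days", sql)


if __name__ == "__main__":
    # Hypothetical predicate in the style of the generated queries.
    sample = (
        "d_date between cast('2000-08-23' as date) "
        "and (cast('2000-08-23' as date) + 14 days)"
    )
    print(rewrite_day_arithmetic(sample))
```

The rewrite is idempotent, so applying it to an already modified query leaves it unchanged. Other unsupported constructs would need their own rules.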
Issue Links
- is related to SPARK-24111 Add TPCDS v2.7 (latest) queries in TPCDSQueryBenchmark (Resolved)
- links to