SPARK-23167

Update TPCDS queries from v1.4 to v2.7 (latest)

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.2.1
    • Fix Version/s: 2.4.0
    • Component/s: SQL
    • Labels: None

    Description

      We currently use TPCDS v1.4 (https://github.com/apache/spark/commits/master/sql/core/src/test/resources/tpcds), though the latest version is v2.7 (http://www.tpc.org/tpc_documents_current_versions/current_specifications.asp). I found that some queries differ between v1.4 and v2.7 (e.g., q4, q5, q6, ...) and that some queries are new in v2.7 (e.g., q10a, ...). I think it makes sense to update the queries for a more accurate evaluation.

      Raw generated queries from TPCDS v2.7 query templates:
      https://github.com/maropu/spark_tpcds_v2.7.0/tree/master/generated

      TPCDS v2.7 queries modified to pass TPCDSQuerySuite (e.g., replacing unsupported syntax, such as rewriting "+ 14 days" as "interval 14 days"):
      https://github.com/apache/spark/compare/master...maropu:TPCDSV2_7
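
      As a rough illustration of the kind of syntax replacement mentioned above, the sketch below rewrites date arithmetic of the form "<expr> + N days" into "<expr> + interval N days". The helper name rewrite_day_arithmetic, the regular expression, and the sample predicate are all assumptions made for this example; they are not taken from the linked branches, and the actual modifications there may have been made by hand and may cover other syntax as well.

```python
import re

# Hypothetical helper for illustration only; not part of Spark or the TPC-DS toolkit.
# Matches "+ N days" or "- N days" where N is an integer literal. Text that already
# reads "+ interval N days" does not match, because digits must follow the operator.
_DAY_ARITHMETIC = re.compile(r"([+-])\s*(\d+)\s+days\b", re.IGNORECASE)


def rewrite_day_arithmetic(sql: str) -> str:
    """Rewrite '<expr> + N days' as '<expr> + interval N days'."""
    return _DAY_ARITHMETIC.sub(r"\1 interval \2 days", sql)


# Illustrative predicate; the date literal is an assumption for this example.
raw = (
    "d_date between cast('2000-08-23' as date) "
    "and (cast('2000-08-23' as date) + 14 days)"
)
print(rewrite_day_arithmetic(raw))
```

      Applying the function twice gives the same result as applying it once, so it could be run over a directory of generated queries without double-rewriting lines that were already fixed.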

       


            People

              Assignee: maropu Takeshi Yamamuro
              Reporter: maropu Takeshi Yamamuro
              Votes: 0
              Watchers: 3
