Calcite / CALCITE-4997

Keep APPROX_COUNT_DISTINCT in some SqlDialects

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Resolved
    • Affects Version/s: 1.29.0
    • Fix Version/s: 1.30.0
    • Component/s: core

    Description

      Summary: Some engines (Hive, Spark, BigQuery, Oracle, Snowflake) support the APPROX_COUNT_DISTINCT function, while others do not. We can therefore use SqlDialect#supportsApproxCountDistinct to control whether APPROX_COUNT_DISTINCT is kept when generating SQL for a dialect. (For Presto, the equivalent function is APPROX_DISTINCT.)


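      Problem: Before the fix, for every SqlDialect, the query

      SELECT APPROX_COUNT_DISTINCT(product_id)
      FROM foodmart.product

      is rewritten to

      SELECT COUNT(DISTINCT product_id)
      FROM foodmart.product

      An exact COUNT(DISTINCT ...) is typically much more expensive than the approximate function, so this rewrite can make many tasks run too slowly on engines that do support APPROX_COUNT_DISTINCT.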
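
The behavior requested here can be illustrated with a minimal, self-contained Java sketch. It is not Calcite's implementation: the `Dialect` enum, its fields, and `renderApproxCountDistinct` are hypothetical names invented for illustration, and `GENERIC` stands in for any engine without an approximate distinct-count function. Only the idea of a per-dialect `supportsApproxCountDistinct` flag, and the Presto name APPROX_DISTINCT, come from the issue text.

```java
public class ApproxCountDistinctSketch {

    /** Hypothetical stand-in for a SQL dialect; this is NOT Calcite's SqlDialect class. */
    enum Dialect {
        HIVE(true, "APPROX_COUNT_DISTINCT"),
        PRESTO(true, "APPROX_DISTINCT"),   // Presto's name for the same function
        GENERIC(false, null);              // assumed: an engine with no approximate function

        final boolean supportsApproxCountDistinct;
        final String approxFunctionName;

        Dialect(boolean supportsApproxCountDistinct, String approxFunctionName) {
            this.supportsApproxCountDistinct = supportsApproxCountDistinct;
            this.approxFunctionName = approxFunctionName;
        }
    }

    /**
     * Renders an approximate distinct count for the given dialect.
     * Keeps the approximate function when the dialect supports it,
     * and falls back to the exact COUNT(DISTINCT ...) otherwise.
     */
    static String renderApproxCountDistinct(Dialect dialect, String column) {
        if (dialect.supportsApproxCountDistinct) {
            return dialect.approxFunctionName + "(" + column + ")";
        }
        return "COUNT(DISTINCT " + column + ")";
    }

    public static void main(String[] args) {
        // Print the generated query for each hypothetical dialect.
        for (Dialect d : Dialect.values()) {
            System.out.println(d + ": SELECT "
                + renderApproxCountDistinct(d, "product_id")
                + " FROM foodmart.product");
        }
    }
}
```

The point of the flag is that the fallback to the exact form happens only where it is needed, rather than for all dialects as described in the problem statement above.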

            People

              Assignee: Unassigned
              Reporter: Jiajun Xie (jiajunbernoulli)
              Votes: 0
              Watchers: 6

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 50m