  1. Spark
  2. SPARK-21975 Histogram support in cost-based optimizer
  3. SPARK-17997

Aggregation function for counting distinct values for multiple intervals

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Version: 2.3.0
    • Component: SQL

    Description

      This sub-task computes the number of distinct values (ndv) for each bin of an equi-height histogram. A bin consists of two endpoints, which define an interval of values, and the ndv within that interval. To compute histogram statistics, once the endpoints are known, we need an aggregate function that counts the distinct values in each interval.
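
      The per-interval counting described above can be sketched in a few lines of Python. This is only an illustration, not the Spark implementation. The function name `count_distinct_per_interval` is hypothetical. The interval convention is an assumption: the first interval is closed on both ends and the rest are left-open, right-closed. The sketch counts exactly with sets, whereas a production aggregate would more likely use an approximate distinct-count sketch to bound memory.

```python
from bisect import bisect_left


def count_distinct_per_interval(values, endpoints):
    """Return the exact ndv for each interval defined by sorted endpoints.

    Assumed convention (not stated in the issue): for endpoints
    e0 <= e1 <= ... <= en, the intervals are [e0, e1], (e1, e2], ..., (e(n-1), en].
    Values outside [e0, en] and None values are ignored.
    """
    if len(endpoints) < 2:
        raise ValueError("need at least two endpoints")

    # One set of seen values per interval (exact counting, hypothetical choice).
    seen = [set() for _ in range(len(endpoints) - 1)]

    for v in values:
        if v is None or v < endpoints[0] or v > endpoints[-1]:
            continue
        # bisect_left returns the first index i with endpoints[i] >= v,
        # so v falls in interval i - 1; v == e0 maps to interval 0.
        i = bisect_left(endpoints, v)
        seen[max(i - 1, 0)].add(v)

    return [len(s) for s in seen]


values = [0, 1, 1, 2, 3, 3, 4, 5, 6, 6, 7, 10, 11]
endpoints = [1, 3, 6, 10]
ndvs = count_distinct_per_interval(values, endpoints)
print(ndvs)
```

      With these inputs the three intervals are [1, 3], (3, 6], and (6, 10]. A value equal to an interior endpoint is counted only in the interval it closes.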

            People

              Assignee: ZenWzh Zhenhua Wang
              Reporter: ZenWzh Zhenhua Wang