Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-16026 Cost-based Optimizer Framework
  3. SPARK-20678

Ndv for columns not in filter condition should also be updated

    XMLWordPrintableJSON

Details

    • Sub-task
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 2.2.0
    • 2.2.0
    • SQL
    • None

    Description

      In filter estimation, we update column stats for those columns in filter condition. However, if the number of rows decreases after the filter (i.e. the overall selectivity is less than 1), we need to update (scale down) the number of distinct values (NDV) for all columns, no matter they are in filter conditions or not.

      Attachments

        Activity

          People

            ZenWzh Zhenhua Wang
            ZenWzh Zhenhua Wang
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: