Spark / SPARK-16026 Cost-based Optimizer Framework / SPARK-19350

Cardinality estimation of Limit and Sample

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.2.0
    • Component/s: SQL
    • Labels: None

    Description

      Currently, LocalLimit, GlobalLimit, and Sample propagate the row count and column stats of their child unchanged, which is incorrect.
      We can compute the correct rowCount in Statistics for Limit and Sample whether or not CBO is enabled. Column stats should not be propagated, because the distribution of column values after a Limit or Sample is unknown.
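
      The intended behavior can be sketched in plain Scala as follows. This is a minimal illustration under assumptions, not Spark's implementation: `Stats`, `limitStats`, and `sampleStats` are hypothetical names, column stats are modeled as opaque strings, and the sample estimate simply scales the child's row count by the sampling fraction and rounds up.

```scala
// Hypothetical, simplified stand-in for a plan node's statistics.
// This is NOT Spark's Statistics class.
final case class Stats(
    rowCount: BigInt,
    columnStats: Map[String, String] = Map.empty)

object LimitSampleEstimation {

  // Limit: the output has at most `limit` rows and never more than the child.
  // Column stats are dropped because the distribution after a limit is unknown.
  def limitStats(child: Stats, limit: Int): Stats = {
    require(limit >= 0, "limit must be non-negative")
    Stats(rowCount = child.rowCount.min(BigInt(limit)))
  }

  // Sample: scale the child's row count by the sampling fraction (assumed
  // estimate), rounding up. Column stats are dropped for the same reason.
  def sampleStats(child: Stats, fraction: Double): Stats = {
    require(fraction >= 0.0, "fraction must be non-negative")
    val scaled = BigDecimal(child.rowCount) * BigDecimal(fraction)
    Stats(rowCount = scaled.setScale(0, BigDecimal.RoundingMode.CEILING).toBigInt)
  }
}

object LimitSampleEstimationDemo extends App {
  val child = Stats(BigInt(1000), Map("id" -> "distinctCount=1000"))
  println(LimitSampleEstimation.limitStats(child, 10))
  println(LimitSampleEstimation.limitStats(child, 5000))
  println(LimitSampleEstimation.sampleStats(child, 0.1))
}
```

      In this sketch neither estimate depends on column-level statistics, which is consistent with the row count being available whether or not CBO is enabled.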

          People

            Assignee: Zhenhua Wang (ZenWzh)
            Reporter: Zhenhua Wang (ZenWzh)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved: