  1. Spark
  2. SPARK-38852 Better Data Source V2 operator pushdown framework
  3. SPARK-38560

If `Sum`, `Count`, or `Any` is accompanied by distinct, partial aggregate push-down cannot be done.

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.0
    • Fix Version/s: 3.3.0
    • Component/s: SQL
    • Labels: None

    Description

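      Spark could partially push down sum(distinct col) and count(distinct col) when the data source has multiple partitions, and Spark would then sum the per-partition values again.

      A value that occurs in more than one partition is then counted once per partition, so the result may be incorrect.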
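
      To illustrate why summing per-partition distinct aggregates can give a wrong answer, here is a minimal sketch in plain Python. It only simulates the arithmetic and is not Spark code; the `partitions` data and the helper names `count_distinct` and `sum_distinct` are hypothetical and chosen for illustration.

```python
# Hypothetical data: one column split across two data source partitions.
# The value 2 appears in both partitions.
partitions = [[1, 2, 2], [2, 3]]


def count_distinct(values):
    """Number of distinct values, like count(distinct col)."""
    return len(set(values))


def sum_distinct(values):
    """Sum of distinct values, like sum(distinct col)."""
    return sum(set(values))


# Reference result: aggregate over the whole column at once.
all_rows = [v for part in partitions for v in part]
correct_count = count_distinct(all_rows)
correct_sum = sum_distinct(all_rows)

# Simulated partial push-down: each partition aggregates on its own,
# then the partial results are summed again.
pushed_count = sum(count_distinct(part) for part in partitions)
pushed_sum = sum(sum_distinct(part) for part in partitions)

print("count(distinct):", correct_count, "vs pushed down:", pushed_count)
print("sum(distinct):  ", correct_sum, "vs pushed down:", pushed_sum)
```

      Because the value 2 is distinct within each partition, it contributes to both partial results, so the re-summed totals exceed the reference results.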

          People

            Assignee: Jiaan Geng (beliefer)
            Reporter: Jiaan Geng (beliefer)
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved: