Spark / SPARK-32761

Planner error when aggregating multiple distinct constant columns


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.0.2, 3.1.0
    • Component/s: SQL
    • Labels:
      None

      Description

      The query SELECT COUNT(DISTINCT 2), COUNT(DISTINCT 2, 3) triggers this bug: both aggregates are DISTINCT and take only constant (foldable) arguments.

      The problematic code is:

       

      val distinctAggGroups = aggExpressions.filter(_.isDistinct).groupBy { e =>
        val unfoldableChildren = e.aggregateFunction.children.filter(!_.foldable).toSet
        if (unfoldableChildren.nonEmpty) {
          // Only expand the unfoldable children
          unfoldableChildren
        } else {
          // If all of aggregateFunction's children are foldable,
          // we must expand at least one of them (here we take the first child).
          // Otherwise we get the wrong result: for example,
          // count(distinct 1) would be rewritten to count(1) by the rewrite function.
          // In general, the distinct aggregateFunction should not run
          // the foldable TypeCheck for the first child.
          e.aggregateFunction.children.take(1).toSet
        }
      }
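
To make the grouping behaviour concrete, here is a minimal sketch that replays the key computation from the snippet above on the failing query. It does not use Spark: Expr, Lit, Col, DistinctAgg and DistinctGroupingSketch are hypothetical stand-ins for the Catalyst classes, introduced only for illustration.

```scala
// Hypothetical, simplified stand-ins for Catalyst expressions.
// They are NOT Spark classes; they only model the `foldable` flag.
sealed trait Expr { def foldable: Boolean }
final case class Lit(value: Int) extends Expr { val foldable = true }
final case class Col(name: String) extends Expr { val foldable = false }

// A DISTINCT aggregate, reduced to a label and its argument list.
final case class DistinctAgg(name: String, children: Seq[Expr])

object DistinctGroupingSketch {
  // Same key computation as the groupBy in the snippet above.
  def groupKey(agg: DistinctAgg): Set[Expr] = {
    val unfoldableChildren = agg.children.filter(!_.foldable).toSet
    if (unfoldableChildren.nonEmpty) unfoldableChildren
    else agg.children.take(1).toSet // all children foldable: keep the first
  }

  def distinctAggGroups(aggs: Seq[DistinctAgg]): Map[Set[Expr], Seq[DistinctAgg]] =
    aggs.groupBy(groupKey)

  def main(args: Array[String]): Unit = {
    // SELECT COUNT(DISTINCT 2), COUNT(DISTINCT 2, 3)
    val aggs = Seq(
      DistinctAgg("COUNT(DISTINCT 2)", Seq(Lit(2))),
      DistinctAgg("COUNT(DISTINCT 2, 3)", Seq(Lit(2), Lit(3)))
    )
    distinctAggGroups(aggs).foreach { case (key, members) =>
      println(s"$key -> ${members.map(_.name).mkString(", ")}")
    }
  }
}
```

In this model both aggregates fall into the else branch and receive the same key (the set containing only the literal 2), so they end up in a single distinct group even though their full argument lists differ. That mismatch is a plausible reading of why planning fails afterwards, although the ticket itself does not spell out the failure path.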
      

            People

            • Assignee: liulinhong (Liu, Linhong)
            • Reporter: linhongliu-db (Linhong Liu)