Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-41035

Incorrect results or NPE when a literal is reused across distinct aggregations

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 3.2.2, 3.3.1, 3.4.0
    • 3.2.3, 3.3.2, 3.4.0
    • SQL

    Description

      This query produces incorrect results:

      select a, count(distinct 100) as cnt1, count(distinct b, 100) as cnt2
      from values (1, 2), (4, 5) as data(a, b)
      group by a;
      
      +---+----+----+
      |a  |cnt1|cnt2|
      +---+----+----+
      |1  |1   |0   |
      |4  |1   |0   |
      +---+----+----+
      

      The values for cnt2 should be 1 and 1 (not 0 and 0).

      If you change the literal used in the first aggregate function, the second aggregate function now works correctly:

      select a, count(distinct 101) as cnt1, count(distinct b, 100) as cnt2
      from values (1, 2), (4, 5) as data(a, b)
      group by a;
      
      +---+----+----+
      |a  |cnt1|cnt2|
      +---+----+----+
      |1  |1   |1   |
      |4  |1   |1   |
      +---+----+----+
      

      The same bug causes the following query to get a NullPointerException:

      select a, count(distinct 1), count_min_sketch(distinct b, 0.5d, 0.5d, 1)
      from values (1, 2), (4, 5) as data(a, b)
      group by a;
      

      If you change the literal used in the first aggregation, then the query succeeds:

      select a, count(distinct 2), count_min_sketch(distinct b, 0.5d, 0.5d, 1)
      from values (1, 2), (4, 5) as data(a, b)
      group by a;
      
      +---+-----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      |a  |count(DISTINCT 2)|count_min_sketch(DISTINCT b, 0.5, 0.5, 1)                                                                                                                                            |
      +---+-----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      |1  |1                |[00 00 00 01 00 00 00 00 00 00 00 01 00 00 00 01 00 00 00 04 00 00 00 00 5D 8D 6A B9 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00]|
      |4  |1                |[00 00 00 01 00 00 00 00 00 00 00 01 00 00 00 01 00 00 00 04 00 00 00 00 5D 8D 6A B9 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00]|
      +---+-----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      

      Attachments

        Activity

          People

            bersprockets Bruce Robbins
            bersprockets Bruce Robbins
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: