Uploaded image for project: 'Ignite'
  1. Ignite
  2. IGNITE-11448

SQL: Wrong results of select with aggregates in subquery

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • sql

    Description

      Subqueries with aggregates may return wrong results due to incorrect splitting.

      Let's consider a table person:

      SELECT id, firstName FROM person:
      [1, firstName1], 
      [2, firstName2], 
      [3, firstName3], 
      [4, firstName4], 
      [5, firstName5], 
      [6, firstName6], 
      [7, firstName7], 
      [8, firstName8], 
      [9, firstName9], 
      [10, firstName10]
      

      The result of query SELECT COUNT(*) FROM person is 10, which is correct.
      The result of query SELECT * FROM person WHERE id = 10 is [10, firstName10], which is also correct.
      But the result of the query SELECT * FROM person WHERE id = (SELECT COUNT(*) FROM person) is [1, firstName1] which is completely wrong.

      The root cause of this behavior is the incorrect query splitting. The latest query is split into these parts:
      Map:

      SELECT
      __Z0.ID __C0_0,
      __Z0.FIRSTNAME __C0_1
      FROM PUBLIC.PERSON __Z0
      WHERE __Z0.ID = (SELECT
      COUNT(*)
      FROM PUBLIC.PERSON __Z1)
      

      Reduce:

      SELECT
      __C0_0 ID,
      __C0_1 FIRSTNAME
      FROM PUBLIC.__T0
      

      As we can see, aggregate COUNT(*) is calculated locally on each map node instead of calculating a single global aggregate and then using it in predicate.

      Reproducer is attached.

      Attachments

        1. Subquery reproducer
          3 kB
          Roman Kondakov

        Issue Links

          Activity

            People

              Unassigned Unassigned
              rkondakov Roman Kondakov
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated: