Uploaded image for project: 'Phoenix'
  1. Phoenix
  2. PHOENIX-4560

ORDER BY with GROUP BY doesn't work if there is WHERE on pk column

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.14.0
    • Fix Version/s: 4.14.0
    • Labels:
      None

      Description

      It's related to the optimizations for group by that were made in PHOENIX-3451. Test case to reproduce:

      CREATE TABLE IF NOT EXISTS VA_TEST(ID VARCHAR NOT NULL PRIMARY KEY, VAL1 VARCHAR, VAL2 INTEGER);
      
      UPSERT INTO VA_TEST VALUES('ABC','aa123', 11);
      UPSERT INTO VA_TEST VALUES('ABD','ba124', 1);
      UPSERT INTO VA_TEST VALUES('ABE','cf125', 13);
      UPSERT INTO VA_TEST VALUES('ABF','dan126', 4);
      UPSERT INTO VA_TEST VALUES('ABG','elf127', 15);
      UPSERT INTO VA_TEST VALUES('ABH','fan128', 6);
      UPSERT INTO VA_TEST VALUES('AAA','get211', 100);
      UPSERT INTO VA_TEST VALUES('AAB','hat212', 7);
      UPSERT INTO VA_TEST VALUES('AAC','aap12', 2);
      UPSERT INTO VA_TEST VALUES('AAD','ball12', 3);
      UPSERT INTO VA_TEST VALUES('AAE','inn2110', 13);
      UPSERT INTO VA_TEST VALUES('AAF','key2112', 40);
      select distinct ID, VAL1, VAL2 from VA_TEST where "ID" in ('ABC','ABD','ABE','ABF','ABG','ABH','AAA', 'AAB', 'AAC','AAD','AAE','AAF') order by VAL1 ASC;
      

      Execution result :

      +------+----------+-------+
      |  ID  |   VAL1   | VAL2  |
      +------+----------+-------+
      | AAA  | get211   | 100   |
      | AAB  | hat212   | 7     |
      | AAC  | aap12    | 2     |
      | AAD  | ball12   | 3     |
      | AAE  | inn2110  | 13    |
      | AAF  | key2112  | 40    |
      | ABC  | aa123    | 11    |
      | ABD  | ba124    | 1     |
      | ABE  | cf125    | 13    |
      | ABF  | dan126   | 4     |
      | ABG  | elf127   | 15    |
      | ABH  | fan128   | 6     |
      +------+----------+-------+
      

      Explain plan:

      +----------------------------------------------------------------------------------------+-----------------+----------------+----------------+
      |                                          PLAN                                          | EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO_TS   |
      +----------------------------------------------------------------------------------------+-----------------+----------------+----------------+
      | CLIENT 1-CHUNK 12 ROWS 1908 BYTES PARALLEL 1-WAY POINT LOOKUP ON 12 KEYS OVER VA_TEST  | 1908            | 12             | 1516917709099  |
      |     SERVER AGGREGATE INTO DISTINCT ROWS BY ["ID", "VAL1", "VAL2"]                      | 1908            | 12             | 1516917709099  |
      | CLIENT MERGE SORT                                                                      | 1908            | 12             | 1516917709099  |
      +----------------------------------------------------------------------------------------+-----------------+----------------+----------------+
      

      As we can see there is no client side sort, so it keeps the original order we get from server which is wrong.
      That happens because of OrderPreservingTracker.hasEqualityConstraints decides that WHERE clause on ID column is constant and let us perform the optimization and skip the client side sort.
      James Taylor, chenglei any thoughts?

        Attachments

        1. PHOENIX-4560_v3.patch
          10 kB
          James Taylor
        2. PHOENIX-4560_v2.patch
          6 kB
          James Taylor
        3. PHOENIX-4560_v1.patch
          3 kB
          James Taylor

          Activity

            People

            • Assignee:
              jamestaylor James Taylor
              Reporter:
              sergey.soldatov Sergey Soldatov
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: