Uploaded image for project: 'Phoenix'
  1. Phoenix
  2. PHOENIX-6232

Correlated subquery should not push to RegionServer as the probe side of the Hash join

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 4.15.0
    • 5.1.0, 4.16.0
    • None
    • None

    Description

      We were facing an interesting problem when a more complex query (with inner selects in the WHERE clause) succeeds alone, while the same query fails, if it is part of a join. I created a test table / query to reproduce the problem:

      DROP TABLE IF EXISTS test;
      CREATE TABLE test (
            id INTEGER NOT NULL,
            test_id INTEGER,
            lastchanged TIMESTAMP,
            CONSTRAINT my_pk PRIMARY KEY (id));
      
      UPSERT INTO test VALUES(0, 100, '2000-01-01 00:00:00.0');
      UPSERT INTO test VALUES(1, 101, '2000-01-01 00:00:00.0');
      UPSERT INTO test VALUES(2, 100, '2011-11-11 11:11:11.0');
      

      Query 1: Example query, running fine in itself:

      SELECT id, test_id, lastchanged FROM test T
      WHERE lastchanged = ( SELECT max(lastchanged) FROM test WHERE test_id = T.test_id )
      
      Returns:
      +----+---------+-----------------------+
      | ID | TEST_ID |      LASTCHANGED      |
      +----+---------+-----------------------+
      | 1  | 101     | 2000-01-01 01:00:00.0 |
      | 2  | 100     | 2011-11-11 12:11:11.0 |
      +----+---------+-----------------------+
      

      Query 2: Same query fails on the current master branch, when it is part of a larger (implicit) join:

      SELECT AAA.*
      FROM 
      (
        SELECT id, test_id, lastchanged FROM test T
        WHERE lastchanged = ( SELECT max(lastchanged) FROM test WHERE test_id = T.test_id )
      ) as AAA,
      (
        SELECT id FROM test
      ) as BBB
      WHERE AAA.id = BBB.id;
      
      
      java.lang.IllegalArgumentException
      	at org.apache.phoenix.thirdparty.com.google.common.base.Preconditions.checkArgument(Preconditions.java:128)
      	at org.apache.phoenix.compile.TupleProjectionCompiler.createProjectedTable(TupleProjectionCompiler.java:66)
      	at org.apache.phoenix.compile.QueryCompiler.compileSingleFlatQuery(QueryCompiler.java:663)
      	at org.apache.phoenix.compile.QueryCompiler.compileJoinQuery(QueryCompiler.java:404)
      	at org.apache.phoenix.compile.QueryCompiler.compileJoinQuery(QueryCompiler.java:302)
      	at org.apache.phoenix.compile.QueryCompiler.compileSelect(QueryCompiler.java:249)
      	at org.apache.phoenix.compile.QueryCompiler.compile(QueryCompiler.java:176)
      	at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:504)
      	at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:467)
      	at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:309)
      	at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:298)
      	at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
      	at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:297)
      	at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:290)
      	at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1933)
      	at sqlline.Commands.executeSingleQuery(Commands.java:1054)
      	at sqlline.Commands.execute(Commands.java:1003)
      	at sqlline.Commands.sql(Commands.java:967)
      	at sqlline.SqlLine.dispatch(SqlLine.java:734)
      	at sqlline.SqlLine.begin(SqlLine.java:541)
      	at sqlline.SqlLine.start(SqlLine.java:267)
      	at sqlline.SqlLine.main(SqlLine.java:206)
      

      I am not sure what the problem is exactly. My guess is that Phoenix tries to optimize (flatten) an inner-query, which it shouldn't, if we are inside a join (according to the check in the code which throws the exception).

      The best workaround I found was to define an explicit join in the original query (Query 1), basically change the inner select into a join. This modified query return the same as the original one:

      Query 3:

      SELECT T.id, T.test_id, T.lastchanged 
      FROM 
        test T 
        LEFT JOIN (
          SELECT max(lastchanged) AS max_timestamp, 
                 test_id AS max_timestamp_test_id
          FROM test
          GROUP BY test_id
        ) JOIN_TABLE ON JOIN_TABLE.max_timestamp_test_id = T.test_id
      WHERE T.lastchanged = JOIN_TABLE.max_timestamp
      
      Returns:
      +------+-----------+-----------------------+
      | T.ID | T.TEST_ID |     T.LASTCHANGED     |
      +------+-----------+-----------------------+
      | 1    | 101       | 2000-01-01 01:00:00.0 |
      | 2    | 100       | 2011-11-11 12:11:11.0 |
      +------+-----------+-----------------------+
      

      Query 4: And the same modified query (query 3) now works inside a join:

      SELECT AAA.*
      FROM 
      (
        SELECT T.id, T.test_id, T.lastchanged 
        FROM 
          test T 
          LEFT JOIN (
            SELECT max(lastchanged) AS max_timestamp, 
                   test_id AS max_timestamp_test_id
            FROM test
            GROUP BY test_id
          ) JOIN_TABLE ON JOIN_TABLE.max_timestamp_test_id = T.test_id
        WHERE T.lastchanged = JOIN_TABLE.max_timestamp
      ) as AAA,
      (
        SELECT id FROM test
      ) as BBB
      WHERE AAA.id = BBB.id;
      
      Returns:
      +------+-----------+-----------------------+
      | T.ID | T.TEST_ID |     T.LASTCHANGED     |
      +------+-----------+-----------------------+
      | 1    | 101       | 2000-01-01 01:00:00.0 |
      | 2    | 100       | 2011-11-11 12:11:11.0 |
      +------+-----------+-----------------------+
      

      I think Query 4 worked, as it is forcing Phoenix to drop the idea of optimizing it's inner-query (Query 3). Although, I can be wrong about the root cause...

      Anyway, I think the bug should be fixed and Query 2 should run without exception.

      Attachments

        1. PHOENIX-6232_addendum-4.x.patch
          24 kB
          chenglei
        2. PHOENIX-6232_addendum-master.diff
          24 kB
          chenglei
        3. PHOENIX-6232_v1-4.x.patch
          69 kB
          chenglei
        4. PHOENIX-6232_v1-master.patch
          70 kB
          chenglei

        Issue Links

          Activity

            People

              comnetwork chenglei
              symat Mate Szalay-Beko
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: