Uploaded image for project: 'Phoenix'
  1. Phoenix
  2. PHOENIX-2601

Query result is incorrect when both index hint and limit are used

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 4.4.0
    • 4.7.0
    • None
    • None
    • Linux

    Description

      Query result is incorrect when both index hint and limit are used.

      To reproduce the problem,
      1. Create an HBase table with a column family. The column family has 4 columns (a1, a2, a3, a4). Create an index on (a1, a2).

      2. Populate the table with the following data

      a1  a2     a3      a4
      --  -----  -----   -----
      1   Small  Red     USA
      1   Small  Yellow  UK
      1   Small  Green   China
      

      3. Run the following Phoenix queries, and the query results are correct.

      select a1, a2, a3, a4 from table where a1 = '1' and a2 = 'Small' ;
      

      Three rows are returned as expected

      select a1, a2, a3, a4 from table where a1 = '1' and a2 = 'Small' and a3 = "Yellow" limit 1;
      

      One row is returned as expected

      select /*+ INDEX(add_index_hint_here) */ a1, a2, a3, a4 from table where a1 = '1' and a2 = 'Small' and a3 = "Yellow";
      

      One row is returned as expected

      4. However, with the combination of index hint and limit clause, the query result is incorrect.

      select /*+ INDEX(add_index_hint_here) */ a1, a2, a3, a4 from table where a1 = '1' and a2 = 'Small' and a3 = "Yellow" limit 1;
      

      Zero row is returned. The expected result is 1 row (i.e. the second row in the example).

      The explain plan of the problematic query

      +------------------------------------------+
      |                   PLAN                   |
      +------------------------------------------+
      | CLIENT 11-CHUNK PARALLEL 1-WAY ... MY_TABLE |
      |     SERVER FILTER BY A.A3 = 'Yellow' |
      | CLIENT 1 ROW LIMIT                       |
      |     SKIP-SCAN-JOIN TABLE 0               |
      |         CLIENT 1-CHUNK SERIAL 1-WAY RANGE SCAN OVER MY_INDEX [[52,48,48,45,69,79,84,45,48,48,57],23] |
      |             SERVER FILTER BY FIRST KEY ONLY |
      |             SERVER 1 ROW LIMIT           |
      |         CLIENT 1 ROW LIMIT               |
      |     DYNAMIC SERVER FILTER BY ("MY_ROW_KEY") IN (($148.$150, $1 |
      |     JOIN-SCANNER 1 ROW LIMIT             |
      +------------------------------------------+
      

      It looks like the query plan uses the index first. Although all three rows match the index (a1 = '1' and 'a2 = 'Small'), but the intermediate result only has the first row due the "limit 1" clause. In this case, the intermediate result has one row (a3 = 'Red'). Then the query engine tries to apply the (a3 = 'Yellow') to the intermediate result, and get zero row as the final result.

      Attachments

        1. PHOENIX-2601.patch
          4 kB
          Wei Xue

        Activity

          People

            maryannxue Wei Xue
            simonlee Simon Lee
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: