Phoenix / PHOENIX-3156

DistinctPrefixFilter optimization produces incorrect results with some non-pk WHERE conditions

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.8.0
    • Labels:
      None

      Description

      There's a corner case I found where a DISTINCT or GROUP BY query along a prefix of a compound row key might return incorrect results.

      The filter relies on seeing the _0 column absolutely last in a row, and on not seeing any Cells that should be filtered. That breaks in two scenarios:

      1. We have a table with key (key1, key2, key3) and columns (c1 and c2). Now construct a WHERE <a clause that always matches c1>, <a clause that filters by c2> GROUP BY key1, key2. The filter would then mis-skip when it sees the Cell for c1.
      2. We force lower-case column names. In that case those would sort after the _0 column. The DistinctPrefixFilter would see the _0 column first and skip.
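The sort order behind scenario 2 can be checked with a small sketch. It assumes all columns live in one column family, so that Cells within a row are ordered by the raw bytes of their qualifier; the helper name `cell_order` is made up for illustration.

```python
# Within a column family, HBase orders a row's Cells by column qualifier,
# comparing raw bytes. Python compares bytes objects the same way, so
# sorted() models that order here.
EMPTY_KV = b"_0"  # the _0 column named in the description


def cell_order(qualifiers):
    """Return qualifiers in byte-lexicographic order (hypothetical helper)."""
    return sorted(qualifiers)


upper = cell_order([EMPTY_KV, b"C2", b"C1"])
lower = cell_order([EMPTY_KV, b"c2", b"c1"])

print(upper)  # [b'C1', b'C2', b'_0']  -> _0 is seen last
print(lower)  # [b'_0', b'c1', b'c2']  -> _0 is seen first
```

The underscore (0x5F) sits between the upper-case letters (0x41 to 0x5A) and the lower-case letters (0x61 to 0x7A), which is why only lower-case names end up after _0.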

      In both cases we are effectively changing the order in which the filters are applied. The DistinctPrefixFilter is no longer applied last for the row.

      I can fix #1 (by ignoring all Cells other than the _0 one). I do not know how to fix case #2.
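The two cases can be replayed in a toy model. This is only a sketch under strong simplifying assumptions: it is not Phoenix's DistinctPrefixFilter or any HBase API, and every name in it (`distinct_prefixes`, `trigger`, the sample rows) is hypothetical. The WHERE filter is modelled as per-column predicates, and the prefix logic as a trigger that commits to skipping the rest of a key prefix as soon as it fires.

```python
# Toy model only: it mimics the ordering problem described above and is
# not Phoenix's DistinctPrefixFilter or any HBase API.
EMPTY_KV = b"_0"


def distinct_prefixes(rows, clauses, prefix_len, trigger):
    """Return the key prefixes a simplified scan would emit.

    rows       -- list of (key_tuple, {qualifier: value}), sorted by key
    clauses    -- {qualifier: predicate}, standing in for the WHERE filter
    prefix_len -- number of leading key columns in the GROUP BY
    trigger    -- predicate on a qualifier; when true, the prefix logic
                  commits to skipping the rest of the current prefix
    """
    emitted, handled = [], set()
    for key, cells in rows:
        prefix = key[:prefix_len]
        if prefix in handled:
            continue  # models the skip to the next prefix
        row_ok = True
        for qualifier in sorted(cells):  # Cells arrive in byte order
            clause = clauses.get(qualifier)
            if clause is not None and not clause(cells[qualifier]):
                row_ok = False
                break  # WHERE rejects the row; later Cells are not seen
            if trigger(qualifier):
                handled.add(prefix)
        if row_ok:
            emitted.append(prefix)
    return emitted


def make_rows(c1, c2):
    # Two rows share the prefix ("a", "b"); only the second passes c2 > 0.
    return [
        (("a", "b", 1), {c1: 1, c2: 0, EMPTY_KV: b""}),
        (("a", "b", 2), {c1: 1, c2: 5, EMPTY_KV: b""}),
    ]


def make_clauses(c1, c2):
    # c1 always matches; c2 actually filters.
    return {c1: lambda v: True, c2: lambda v: v > 0}


def any_cell(qualifier):
    return True  # prefix logic acts on the first Cell it sees


def only_empty(qualifier):
    return qualifier == EMPTY_KV  # proposed fix for case 1


case1_buggy = distinct_prefixes(
    make_rows(b"C1", b"C2"), make_clauses(b"C1", b"C2"), 2, any_cell)
case1_fixed = distinct_prefixes(
    make_rows(b"C1", b"C2"), make_clauses(b"C1", b"C2"), 2, only_empty)
case2_fixed = distinct_prefixes(
    make_rows(b"c1", b"c2"), make_clauses(b"c1", b"c2"), 2, only_empty)

print(case1_buggy)  # []            prefix lost: skip committed at C1
print(case1_fixed)  # [('a', 'b')]  correct once only _0 triggers the skip
print(case2_fixed)  # []            still wrong: _0 sorts before c1 and c2
```

In this model, ignoring every Cell except _0 repairs case 1 because the c2 clause gets to reject the first row before the skip is committed. With lower-case names the _0 Cell arrives first, so the same change does not help in case 2.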

      I think this is a blocker and we may have to undo the entire DISTINCT and GROUP BY prefix optimization.

      Ankit Singhal, James Taylor, Samarth Jain.

        Attachments

        1. 3156.txt
          3 kB
          Lars Hofhansl
        2. 3156-v2.txt
          5 kB
          Lars Hofhansl

              People

              • Assignee:
                lhofhansl Lars Hofhansl
                Reporter:
                lhofhansl Lars Hofhansl
              • Votes:
                0
                Watchers:
                3
