IMPALA-4916

Missing, redundant or non-evaluable predicates due to buggy equivalence classes.



    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.3.0, Impala 2.5.0, Impala 2.4.0, Impala 2.6.0, Impala 2.7.0, Impala 2.8.0
    • Fix Version/s: Impala 2.9.0
    • Component/s: Frontend
    • Labels:


      Impala's equivalence class computation has a subtle bug which can lead to:
      1. omitting predicates
      2. adding redundant predicates
      3. adding predicates that are non-evaluable at that point in the plan

      In most queries, the bug has no effect on the final plan.
      However, in case (1) incorrect results may be returned, and in case (3) a crash will occur.

      Unfortunately, it is extremely difficult to tell from a query whether it hits this bug. Whether the bug triggers depends on the specific implementation of Java's HashMap, which tends to change slightly across JVM versions, and on the total number of columns (including virtual view columns) in the query.
      For queries hitting this bug, even minor changes that do not affect the end result can be enough to avoid it (e.g., changing a '*' to an explicit list of fewer columns).

      The root cause is a bug in Impala's DisjointSet implementation which is used for computing equivalence classes.
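
For readers unfamiliar with the data structure, the following is a minimal sketch of how a disjoint-set structure can group columns into equivalence classes from equality predicates such as `t1.id = t2.id`. It is not Impala's DisjointSet class and does not reproduce the bug. The class name `EquivalenceClassSketch`, its methods, and the column names are hypothetical and chosen only for illustration. The ticket does not describe the defect itself, so the sketch only shows the invariant such a structure must maintain: after a merge, every member must map to the same merged set.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration only; not Impala's DisjointSet implementation. */
class EquivalenceClassSketch {
    // Maps each item to the set (equivalence class) that currently contains it.
    private final Map<String, Set<String>> itemSets = new HashMap<>();

    /** Registers an item in its own singleton class if it is not known yet. */
    void makeSet(String item) {
        itemSets.computeIfAbsent(item, k -> {
            Set<String> s = new HashSet<>();
            s.add(k);
            return s;
        });
    }

    /** Merges the classes of a and b. Returns true if they were distinct. */
    boolean union(String a, String b) {
        makeSet(a);
        makeSet(b);
        Set<String> setA = itemSets.get(a);
        Set<String> setB = itemSets.get(b);
        if (setA == setB) return false; // already in the same class

        // Merge the smaller set into the larger one.
        Set<String> small = setA.size() < setB.size() ? setA : setB;
        Set<String> large = (small == setA) ? setB : setA;
        large.addAll(small);

        // Invariant: every member of the absorbed set must now point at the
        // merged set. Skipping this step would leave stale, inconsistent classes.
        for (String member : small) {
            itemSets.put(member, large);
        }
        return true;
    }

    /** Returns true if both items are known and belong to the same class. */
    boolean sameClass(String a, String b) {
        Set<String> setA = itemSets.get(a);
        return setA != null && setA.contains(b);
    }

    public static void main(String[] args) {
        EquivalenceClassSketch ds = new EquivalenceClassSketch();
        ds.union("t1.id", "t2.id"); // from predicate t1.id = t2.id
        ds.union("t3.id", "t4.id"); // from predicate t3.id = t4.id
        ds.union("t2.id", "t3.id"); // links the two classes
        System.out.println(ds.sameClass("t1.id", "t4.id")); // true
        System.out.println(ds.sameClass("t1.id", "t5.id")); // false
    }
}
```

In this sketch, a planner-like caller could use `sameClass` to decide whether a predicate on one column may be transferred to another. If the classes were inconsistent, such a caller could drop a needed predicate, add a redundant one, or place one where it cannot be evaluated, which matches the three symptoms listed above.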

      Even minor query modifications that do not affect the query result may be enough to work around the problem. For example, changing a '*' to an explicit list of (fewer) columns may suffice. Likewise, adding column references in places where they are not needed, e.g., in an EXISTS or NOT EXISTS subquery, may avoid the bug.




            • Assignee:
              alex.behm Alexander Behm
              alex.behm Alexander Behm
            • Votes:
              0
            • Watchers:
              2


              • Created: