• Type: Sub-task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: Impala 2.5.0, Impala 2.6.0, Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, Impala 2.10.0
    • Fix Version/s: Impala 2.11.0
    • Component/s: Frontend
    • Labels:
    • Epic Color:


      The simplest solution to speed up the equivalence-class computation is to remove the equivalence class altogether.

      A few points for justification:

      • The purpose of the equivalence classes is to provide a condensed version of the value-transfer graph to speed up certain equivalence checks/transformations. All necessary information is already captured in the value-transfer graph, so the high cost of computing the equivalence classes is not justified.
      • Incomplete/wrong definition of equivalence classes. In today's code, the equivalence class of a slot contains a list of slots where each pair of slots has a value transfer. This definition does not make sense, in particular, for unions which produce value transfers of the form s1 -> s2, s1 -> s3 where according to our definition s1 can only be in the same class as s2 or s3, but not both (because s2 and s3 do not have a value transfer). We could redefine the equivalence class of only containing slot pairs with mutual value transfers, but that information is already efficiently accessible from the value-transfer graph.

      The challenge in removing the equivalence classes is to rework all the existing code that relies on them to use the value-transfer graph directly.

      One of the tricky functions that needs to change is Analyzer.equivExprs(). An alternative implementation might look like this:

      // Returns true if 'other' is equivalent to this. Equivalence means that the two expr trees are equal except for SlotRefs which match if there
      // is another SlotRef in the other expr tree at the same position, and the underlying slots have a value transfer. If 'mutualOnly' is true,
      // SlotRefs are only considered equivalent if they there is a mutual value transfer.
      boolean Expr.isEquiv(Expr other, boolean mutualOnly);

      The existing equals() function can probably be refactored into a new function that can be called from both equals() and isEquiv() or something along those lines.


        1. queryB.sql
          155 kB
          Tianyi Wang
        2. queryA.sql
          155 kB
          Tianyi Wang

          Issue Links



              • Assignee:
                tianyiwang Tianyi Wang
                alex.behm Alexander Behm
              • Votes:
                0 Vote for this issue
                2 Start watching this issue


                • Created: