Details
-
Sub-task
-
Status: Resolved
-
Critical
-
Resolution: Fixed
-
Impala 2.5.0, Impala 2.6.0, Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, Impala 2.10.0
-
None
-
ghx-label-5
Description
The simplest solution to speed up the equivalence-class computation is to remove the equivalence class altogether.
A few points for justification:
- The purpose of the equivalence classes is to provide a condensed version of the value-transfer graph to speed up certain equivalence checks/transformations. All necessary information is already captured in the value-transfer graph, so the high cost of computing the equivalence classes is not justified.
- Incomplete/wrong definition of equivalence classes. In today's code, the equivalence class of a slot contains a list of slots where each pair of slots has a value transfer. This definition does not make sense, in particular, for unions which produce value transfers of the form s1 -> s2, s1 -> s3 where according to our definition s1 can only be in the same class as s2 or s3, but not both (because s2 and s3 do not have a value transfer). We could redefine the equivalence class of only containing slot pairs with mutual value transfers, but that information is already efficiently accessible from the value-transfer graph.
The challenge in removing the equivalence classes is to rework all the existing code that relies on them to use the value-transfer graph directly.
One of the tricky functions that needs to change is Analyzer.equivExprs(). An alternative implementation might look like this:
// Returns true if 'other' is equivalent to this. Equivalence means that the two expr trees are equal except for SlotRefs which match if there // is another SlotRef in the other expr tree at the same position, and the underlying slots have a value transfer. If 'mutualOnly' is true, // SlotRefs are only considered equivalent if they there is a mutual value transfer. boolean Expr.isEquiv(Expr other, boolean mutualOnly);
The existing equals() function can probably be refactored into a new function that can be called from both equals() and isEquiv() or something along those lines.
Attachments
Attachments
Issue Links
- incorporates
-
IMPALA-6213 The partitioning compatibility check is wrong in consecutive outer join cases
- Resolved