Details
- Improvement
- Status: Open
- Minor
- Resolution: Unresolved
- Impala 3.1.0
- None
- None
Description
A number of JIRAs have been filed about issues with, or improvements to, the expression rewrite rules implemented in the analyzer. Work on those tickets revealed that changes to the rewrite rules are quite difficult because of the way the code is currently structured:
- The analyzer makes one pass over the AST to resolve references, compute types and so on.
- The rewrite rules make a pass over the tree to rewrite expressions.
- The prior analysis is discarded, and a second analysis pass is done over the rewritten expressions.
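The three steps above can be sketched as follows. This is a deliberately simplified illustration, not Impala's actual code: `TwoPassExpr`, `TwoPassFlow`, and the string-based rewrite rule are all hypothetical names invented here, and the "analysis" is a placeholder that only sets a type.

```java
import java.util.function.UnaryOperator;

/** Hypothetical stand-in for an expression; not Impala's actual Expr class. */
final class TwoPassExpr {
    final String sql;   // expression text
    String type;        // null until analyze() has run on this object
    TwoPassExpr(String sql) { this.sql = sql; }
}

/** Hypothetical driver showing the analyze, rewrite, re-analyze sequence. */
final class TwoPassFlow {
    int analysisPasses = 0;
    String originalSql;   // saved so error messages can show the user's SQL

    // Placeholder analysis: a real analyzer resolves references and types.
    void analyze(TwoPassExpr e) {
        analysisPasses++;
        e.type = "BOOLEAN";
    }

    TwoPassExpr run(TwoPassExpr original, UnaryOperator<String> rule) {
        analyze(original);                 // step 1: first analysis pass
        originalSql = original.sql;        // keep the pre-rewrite SQL
        // Step 2: the rewrite yields a new, unanalyzed expression.
        TwoPassExpr rewritten = new TwoPassExpr(rule.apply(original.sql));
        // Step 3: the first analysis does not carry over, so analyze again.
        analyze(rewritten);
        return rewritten;
    }
}
```

In this sketch every rewritten expression costs a second call to `analyze`, and anything that copied the original expression after step 1 now holds a stale copy.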
Some work has already been done to accommodate this two-step process: special code in some clauses tucks away the original (unrewritten) SQL for use in error messages after the rewrite, though the implementation is inconsistent across clauses.
The analyzer often makes copies of expressions for various purposes. It is quite hard to keep these copies in sync during rewrites, because new copies must be made as part of the second analysis pass.
The goal of this ticket is to migrate rewrites into the expression analysis step. For each node:
- Analyze and rewrite children.
- Resolve references.
- Rewrite the node itself.
- Compute costs and selectivity.
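A minimal sketch of the per-node sequence above, assuming a much simpler expression model than Impala's. The classes `SketchExpr`, `IntLiteral`, `ColumnRef`, and `Add`, the constant-folding rule, and the unit cost model are all hypothetical and exist only to show the ordering of the four steps; selectivity is omitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified expression node; not Impala's actual Expr class. */
abstract class SketchExpr {
    final List<SketchExpr> children = new ArrayList<>();
    String type;   // filled in by resolve()
    int cost;      // filled in by computeCost()

    /** Single bottom-up pass; returns this node or its replacement. */
    final SketchExpr analyze(Map<String, String> columnTypes) {
        // 1. Analyze and rewrite children, keeping any replacement node.
        for (int i = 0; i < children.size(); i++) {
            children.set(i, children.get(i).analyze(columnTypes));
        }
        // 2. Resolve references and compute this node's type.
        resolve(columnTypes);
        // 3. Rewrite the node itself; a replacement node is resolved in place.
        SketchExpr result = rewrite();
        if (result != this) result.resolve(columnTypes);
        // 4. Compute cost on whichever node survives the rewrite.
        result.computeCost();
        return result;
    }

    abstract void resolve(Map<String, String> columnTypes);

    SketchExpr rewrite() { return this; }   // default: no rewrite

    void computeCost() {
        cost = 1;   // assumed unit cost per node, plus the children's costs
        for (SketchExpr c : children) cost += c.cost;
    }
}

final class IntLiteral extends SketchExpr {
    final long value;
    IntLiteral(long value) { this.value = value; }
    @Override void resolve(Map<String, String> columnTypes) { type = "BIGINT"; }
}

final class ColumnRef extends SketchExpr {
    final String name;
    ColumnRef(String name) { this.name = name; }
    @Override void resolve(Map<String, String> columnTypes) {
        type = columnTypes.get(name);
        if (type == null) throw new IllegalArgumentException("Unknown column: " + name);
    }
}

final class Add extends SketchExpr {
    Add(SketchExpr left, SketchExpr right) { children.add(left); children.add(right); }
    @Override void resolve(Map<String, String> columnTypes) { type = "BIGINT"; }
    // Illustrative rule: fold the sum of two literals into one literal.
    @Override SketchExpr rewrite() {
        SketchExpr l = children.get(0), r = children.get(1);
        if (l instanceof IntLiteral && r instanceof IntLiteral) {
            return new IntLiteral(((IntLiteral) l).value + ((IntLiteral) r).value);
        }
        return this;
    }
}
```

Under these assumptions, analyzing `new Add(new ColumnRef("c"), new Add(new IntLiteral(1), new IntLiteral(2)))` folds the inner sum while its parent is still being analyzed, so the parent computes its cost over the already-rewritten child and no second pass is needed.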
Once completed, rewrites will be just another step in expression analysis. The analyzer will make just one analysis pass. Copies of expressions will be made after analysis/rewrite so that they stay in sync.
The work will be done as a series of small patches. Those that are just refactoring will use this JIRA ticket with a (Part i) designation. Those that make functional changes will have their own JIRA tickets.
Issue Links
- split from IMPALA-6590 Disable expr rewrites and codegen for VALUES() statements (Resolved)