Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
Impala 4.4.0
-
None
-
Description
Since the tuple and slot information is kept separately in the descriptor table, it does not get incorporated into the PlanNode thrift used for the tuple cache key. This means that the tuple cache can't distinguish between these two queries:
select int_col1 from table;
select int_col2 from table;
To solve this, the tuple/slot information needs to be incorporated into the cache key. PlanNode::initThrift() walks through each tuple, so this is a good place to serialize the TupleDescriptor and its SlotDescriptors and incorporate them into the hash.
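The collision and the proposed fix can be illustrated with a minimal Python sketch. It is not Impala's implementation: the cache_key function, the string form of the plan node, and the (table, slots) descriptor layout are all hypothetical stand-ins for the PlanNode thrift and the descriptor table.

```python
import hashlib

def cache_key(plan_node_repr, tuple_descs=None):
    """Hash a serialized plan node, optionally folding in tuple/slot descriptors.

    plan_node_repr and tuple_descs are hypothetical stand-ins for the PlanNode
    thrift and the TupleDescriptor/SlotDescriptor contents.
    """
    h = hashlib.sha256()
    h.update(plan_node_repr.encode("utf-8"))
    for table, slots in tuple_descs or []:
        # Length-prefix each field so adjacent fields cannot run together.
        for field in [table] + [f"{col}:{typ}" for col, typ in slots]:
            data = field.encode("utf-8")
            h.update(len(data).to_bytes(4, "big"))
            h.update(data)
    return h.hexdigest()

# Both queries scan the same table, so the plan node alone looks identical.
SCAN = "HdfsScanNode(table=table, tuple_id=0)"
Q1_DESCS = [("table", [("int_col1", "INT")])]
Q2_DESCS = [("table", [("int_col2", "INT")])]

key_without_1 = cache_key(SCAN)
key_without_2 = cache_key(SCAN)
key_with_1 = cache_key(SCAN, Q1_DESCS)
key_with_2 = cache_key(SCAN, Q2_DESCS)
```

Without the descriptors the two keys are equal, which is the bug described above. With the descriptors folded in, the keys differ because the slot's column name differs.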
The tuple ids and slot ids are global ids, so their values are influenced by the entirety of the query. This is a problem for matching cache results across different queries: the same subtree can receive different ids depending on what else the query contains. As part of incorporating the tuple/slot information, we should also add the ability to translate tuple/slot ids into ids local to a subtree.
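One way to picture the id translation is a sketch like the following, which assumes local ids are assigned in order of first encounter during a deterministic walk of the subtree. The LocalIdTranslator class and the localize helper are hypothetical names for illustration, and the actual fix may assign local ids differently.

```python
class LocalIdTranslator:
    """Maps global ids to dense local ids in order of first encounter (hypothetical)."""

    def __init__(self):
        self._local = {}

    def translate(self, global_id):
        # Assign the next local id the first time a global id is seen.
        if global_id not in self._local:
            self._local[global_id] = len(self._local)
        return self._local[global_id]


def localize(subtree):
    """Rewrite (tuple_id, [slot_ids]) pairs, given in walk order, using local ids."""
    tuple_ids = LocalIdTranslator()
    slot_ids = LocalIdTranslator()
    return [
        (tuple_ids.translate(tid), [slot_ids.translate(sid) for sid in sids])
        for tid, sids in subtree
    ]


# The same scan subtree embedded in two different queries: the global ids
# differ because other parts of each query consumed ids first.
subtree_in_query_a = [(0, [0, 1])]
subtree_in_query_b = [(5, [12, 13])]

local_a = localize(subtree_in_query_a)
local_b = localize(subtree_in_query_b)
```

Both subtrees translate to the same local form, so a cache key built from the local ids would match across the two queries even though their global ids differ.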