Details
Type: Sub-task
Status: Open
Priority: Major
Resolution: Unresolved
Description
Besides the security issue, there are a few more problems with using Hive tables to materialize common subexpressions:
- Query caching has to be disabled, because we drop the table when a query ends, and a cached query would still reference the dropped table.
- The executor drops the table when it frees the resources of a prepared statement, but the client may then execute the same statement again.
- With query caching, two queries with identical plans that run at the same time would conflict on the same table.
Dave Birdsall suggested a solution in a comment on PR772: create a separate Hive table at runtime for each statement and bind its name at runtime as well. This approach would fix all of the issues listed here.
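
To make the suggestion concrete, here is a minimal sketch of the idea in Python. It is not the actual compiler or executor code; the names `CachedPlan`, `StatementExecution`, `new_temp_table_name`, and the `cse_temp` prefix are all hypothetical, and the SQL text is only illustrative. The point is that the cached plan holds a placeholder instead of a fixed table name, and each execution binds its own freshly generated name.

```python
import uuid


def new_temp_table_name(prefix="cse_temp"):
    """Return a table name that is unique per call (hypothetical naming scheme)."""
    return f"{prefix}_{uuid.uuid4().hex}"


class CachedPlan:
    """Hypothetical cached plan: the temp table appears only as a {table} placeholder."""

    def __init__(self, create_template, query_template):
        self.create_template = create_template
        self.query_template = query_template


class StatementExecution:
    """One execution of a plan; owns exactly one temp table name while open."""

    def __init__(self, plan):
        self.plan = plan
        self.table_name = None

    def open(self):
        # Bind a fresh name at runtime, so re-executing a prepared statement
        # or running the same cached plan concurrently never reuses a table.
        self.table_name = new_temp_table_name()
        return [
            self.plan.create_template.format(table=self.table_name),
            self.plan.query_template.format(table=self.table_name),
        ]

    def close(self):
        # Drop only the table created by this execution.
        if self.table_name is None:
            return None
        stmt = f"DROP TABLE IF EXISTS {self.table_name}"
        self.table_name = None
        return stmt


plan = CachedPlan(
    "CREATE TABLE {table} AS SELECT a, b FROM t1",
    "SELECT * FROM {table} x JOIN {table} y ON x.a = y.a",
)

first = StatementExecution(plan)
second = StatementExecution(plan)
first_sql = first.open()
second_sql = second.open()   # same cached plan, different table
first_drop = first.close()
rerun_sql = first.open()     # re-execution after close gets a new table
```

Because the plan itself never contains a concrete table name, it can stay in the query cache, and dropping the table at the end of one execution does not affect a later or concurrent execution of the same plan.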