Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
During the compilation and execution of a Hive command there are usually calls to the HiveMetastore (HMS). Most of these calls need to connect to the underlying database backend to return the requested information, so they trigger the generation and execution of SQL queries.
Hive has a lot of code that affects the generation and execution of these SQL queries; two notable examples are the MetaStoreDirectSql and CachedStore classes.
MetaStoreDirectSql is responsible for explicitly building SQL queries for performance reasons.
CachedStore is responsible for caching certain requests to avoid going to the database on every call.
Verifying that the generated SQL is the expected one, and that certain queries do (or do not) hit the DB, is valuable for catching regressions and for evaluating the effectiveness of caches.
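To illustrate why the list of executed SQL is a useful signal for cache effectiveness, here is a minimal, self-contained sketch. It does not use any Hive classes: `RecordingBackend` and `CachingStore` are hypothetical stand-ins for the database layer and for a cache in front of it, and the SQL text is only illustrative. The point is that a repeated request served from the cache leaves no new entry in the recorded SQL.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CacheHitSketch {

    /** Hypothetical backend: remembers every SQL string it is asked to run. */
    static final class RecordingBackend {
        final List<String> executedSql = new ArrayList<>();

        String fetchTable(String name) {
            // Illustrative SQL only; a real backend would run it against the DB.
            executedSql.add("SELECT * FROM TBLS WHERE TBL_NAME = '" + name + "'");
            return "table:" + name;
        }
    }

    /** Hypothetical cache in front of the backend (not Hive's CachedStore). */
    static final class CachingStore {
        private final RecordingBackend backend;
        private final Map<String, String> cache = new HashMap<>();

        CachingStore(RecordingBackend backend) {
            this.backend = backend;
        }

        String getTable(String name) {
            // Only a cache miss reaches the backend and produces SQL.
            return cache.computeIfAbsent(name, backend::fetchTable);
        }
    }

    public static void main(String[] args) {
        RecordingBackend backend = new RecordingBackend();
        CachingStore store = new CachingStore(backend);

        store.getTable("t1"); // miss: one SQL statement is recorded
        store.getTable("t1"); // hit: nothing new is recorded

        System.out.println("SQL statements executed: " + backend.executedSql.size());
    }
}
```

If a change accidentally bypassed the cache, the recorded list would grow to two entries, which is exactly the kind of regression a diff of the recorded SQL would expose.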
The idea is that, for each Hive command/query in a qtest, there is an option to include in the output (.q.out) the list of SQL queries that were generated by the HMS calls.
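One way to picture the proposal is the sketch below. It is not based on the actual qtest framework: `SqlRecorder`, `render`, the `includeSql` flag, and the `HMS SQL:` line prefix are all assumptions made for illustration, and the real .q.out layout may differ. The sketch records the SQL issued while one command runs and, when the option is on, writes those statements into that command's section of the output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class QOutSqlSketch {

    /** Hypothetical collector for SQL issued while a single command runs. */
    static final class SqlRecorder {
        private final List<String> statements = new ArrayList<>();

        void record(String sql) {
            statements.add(sql);
        }

        /** Returns the statements recorded so far and resets the recorder. */
        List<String> drain() {
            List<String> copy = new ArrayList<>(statements);
            statements.clear();
            return copy;
        }
    }

    /** Builds the output section for one command; the layout is illustrative. */
    static String render(String command, List<String> results,
                         List<String> sql, boolean includeSql) {
        StringBuilder sb = new StringBuilder();
        sb.append(command).append('\n');
        if (includeSql) {
            for (String statement : sql) {
                sb.append("HMS SQL: ").append(statement).append('\n');
            }
        }
        for (String line : results) {
            sb.append(line).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        SqlRecorder recorder = new SqlRecorder();
        // In a real setup the metastore layer would call record(); here it is simulated.
        recorder.record("SELECT * FROM TBLS WHERE TBL_NAME = 't1'");

        String section = render("DESCRIBE t1", Arrays.asList("id int"),
                                recorder.drain(), true);
        System.out.print(section);
    }
}
```

Draining the recorder after each command keeps the SQL attributed to the command that caused it, so a change in generated SQL shows up as a diff in that command's section only.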