Description
In SPARK-10337, we added the first step of supporting views natively, which basically wraps the original view definition SQL text in an extra SELECT and then stores the wrapped SQL text in the metastore. This approach suffers from at least two issues:
- Switching the current database may break view queries
- HiveQL doesn't allow a CTE as a subquery, so CTEs can't be used in view definitions
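The first issue can be illustrated with a minimal sketch. It does not use Spark or Hive; the catalog dictionary, the database names `db1` and `db2`, the table name `t`, and the helper `resolve_table` are all hypothetical and only model how an unqualified table name is resolved against whatever database is current at query time.

```python
# Toy metastore: database name -> table name -> rows.
# All names here are hypothetical and exist only for illustration.
CATALOG = {
    "db1": {"t": [("a1", "b1")]},
    "db2": {"t": [("a2", "b2")]},
}


def resolve_table(name, current_db):
    """Resolve a possibly unqualified table name against the current database."""
    if "." in name:
        db, table = name.split(".", 1)
    else:
        db, table = current_db, name
    return CATALOG[db][table]


# Table reference stored in a view definition while db1 was current.
stored_unqualified = "t"
# The same reference with the database spelled out.
stored_qualified = "db1.t"

rows_in_db1 = resolve_table(stored_unqualified, current_db="db1")
rows_in_db2 = resolve_table(stored_unqualified, current_db="db2")
qualified_in_db2 = resolve_table(stored_qualified, current_db="db2")

# The unqualified reference silently points at a different table after the
# switch, while the qualified one still reaches db1.t.
print(rows_in_db1 == rows_in_db2)        # False
print(qualified_in_db2 == rows_in_db1)   # True
```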
To fix these issues, we need to canonicalize the view definition. For example, for a SQL string
SELECT a, b FROM table
we will save this text to Hive metastore as
SELECT `table`.`a`, `table`.`b` FROM `currentDB`.`table`
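As a rough sketch of what canonicalization means for this one example, the snippet below builds the fully qualified text from its parts. It is not how Spark generates SQL (that works from resolved logical plans); the helpers `quote` and `canonicalize_select` are hypothetical, and escaping an embedded backtick by doubling it is an assumption of this sketch.

```python
def quote(identifier):
    """Backtick-quote an identifier (embedded backticks are doubled; an assumption)."""
    return "`" + identifier.replace("`", "``") + "`"


def canonicalize_select(columns, table, current_db):
    """Build a fully qualified SELECT for a single-table query (toy case only)."""
    select_list = ", ".join(f"{quote(table)}.{quote(c)}" for c in columns)
    return f"SELECT {select_list} FROM {quote(current_db)}.{quote(table)}"


canonical = canonicalize_select(["a", "b"], "table", "currentDB")
print(canonical)
```

Because every column and the table itself carry their qualifiers, the stored text no longer depends on which database is current when the view is queried.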
The core infrastructure of this work is SQL query string generation (SPARK-12593), namely converting resolved logical query plans back to canonicalized SQL query strings. PR #10541 set up the basic infrastructure for SQL generation, but more language structures need to be supported.
PR #10541 added round-trip testing infrastructure for SQL generation. All queries tested by test suites extending HiveComparisonTest are executed in the following order:
- Parsing query string to logical plan
- Converting resolved logical plan back to canonicalized SQL query string
- Executing generated SQL query string
- Comparing query results with golden answers
Note that not all resolved logical query plans can be converted back to SQL query strings, either because a plan contains a language structure that is not supported yet, or because it inherently has no SQL representation (e.g. query plans built on top of local Scala collections).
If a logical plan is inconvertible, HiveComparisonTest falls back to its original behavior, namely executing the original SQL query string and comparing the results with golden answers.
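The round-trip order and the fallback can be sketched as follows. This is not the HiveComparisonTest code: `Inconvertible`, `round_trip_check`, and the stub `parse`, `to_sql`, and `execute` functions are hypothetical stand-ins, and upper-casing the query text merely stands in for real SQL generation.

```python
class Inconvertible(Exception):
    """Raised by the hypothetical generator when a plan has no SQL form."""


def round_trip_check(query, parse, to_sql, execute, golden):
    """Run the generated SQL if the plan converts, else the original query."""
    plan = parse(query)
    try:
        sql_to_run = to_sql(plan)      # plan -> canonicalized SQL string
    except Inconvertible:
        sql_to_run = query             # fall back to the original text
    return execute(sql_to_run) == golden, sql_to_run


# Stub components, for illustration only.
def parse(query):
    return {"text": query, "convertible": "local_collection" not in query}


def to_sql(plan):
    if not plan["convertible"]:
        raise Inconvertible(plan["text"])
    return plan["text"].upper()        # stand-in for real SQL generation


def execute(sql):
    return [(1,)]                      # every stub query returns the same row


ok, ran = round_trip_check("select 1", parse, to_sql, execute, [(1,)])
ok_fb, ran_fb = round_trip_check(
    "select * from local_collection", parse, to_sql, execute, [(1,)]
)
print(ran)      # SELECT 1
print(ran_fb)   # select * from local_collection
```

In both cases the results are compared with the golden answer; only the string that gets executed differs.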
SQL generation details are logged and can be found in sql/hive/target/unit-tests.log (log level should be at least DEBUG).
Issue Links
- is duplicated by
  - SPARK-11148 Unable to create views (Resolved)
- is related to
  - SPARK-25797 Views created via 2.1 cannot be read via 2.2+ (Resolved)
  - SPARK-18209 More robust view canonicalization without full SQL expansion (Resolved)
- relates to
  - SPARK-12593 Convert basic resolved logical plans back to SQL query strings (Resolved)
  - SPARK-11148 Unable to create views (Resolved)
  - SPARK-12726 ParquetConversions doesn't always propagate metastore table identifier to ParquetRelation (Resolved)
  - SPARK-14038 Enable native view by default (Resolved)
  - SPARK-16576 Move plan SQL generation code from SQLBuilder into logical operators (Closed)