[SPARK-15691] Refactor and improve Hive support - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Major
Resolution: Incomplete
Affects Version/s: None
Fix Version/s: None
Component/s: SQL
Labels: bulk-closed
Target Version/s: 3.0.0

Description

Hive support is important to Spark SQL, as many Spark users use it to read from Hive. The current architecture is very difficult to maintain, and this ticket tracks progress towards getting us to a sane state.

A number of things we want to accomplish are:

- Move the Hive-specific catalog logic into HiveExternalCatalog.
  - Remove HiveSessionCatalog. All Hive-related logic should go into HiveExternalCatalog. This would require moving caching either into HiveExternalCatalog or into SessionCatalog.
  - Move the use of table properties to store data source options into HiveExternalCatalog, so that for a CatalogTable returned by HiveExternalCatalog we do not need to distinguish tables stored in Hive formats from data source tables.
  - Potentially more.
- Remove Hive's specific ScriptTransform implementation and make it more general so we can put it in sql/core.
- Implement HiveTableScan (and the write path) as a data source, so we don't need a special planner rule for HiveTableScan.
- Remove HiveSharedState and HiveSessionState.
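
To make the "store data source options in properties" item concrete, here is a minimal, language-agnostic sketch written in Python. It is not Spark's implementation: `TableDef`, `to_metastore`, `from_metastore`, and both property keys are hypothetical names invented for illustration. The idea shown is only that the external catalog could fold the provider and options into string properties on write and unfold them on read, so callers never see the encoding.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical property keys, chosen for this sketch only.
PROVIDER_KEY = "example.datasource.provider"
OPTION_PREFIX = "example.datasource.option."


@dataclass
class TableDef:
    """Simplified stand-in for a catalog table; not Spark's CatalogTable."""
    name: str
    provider: Optional[str] = None
    options: Dict[str, str] = field(default_factory=dict)
    properties: Dict[str, str] = field(default_factory=dict)


def to_metastore(table: TableDef) -> TableDef:
    """Fold provider and options into plain string properties before saving."""
    if table.provider is None:
        # No data source provider set: store the table unchanged.
        return table
    props = dict(table.properties)
    props[PROVIDER_KEY] = table.provider
    for key, value in table.options.items():
        props[OPTION_PREFIX + key] = value
    return TableDef(name=table.name, properties=props)


def from_metastore(raw: TableDef) -> TableDef:
    """Restore provider and options so callers never see the encoding."""
    if PROVIDER_KEY not in raw.properties:
        return raw
    options = {
        key[len(OPTION_PREFIX):]: value
        for key, value in raw.properties.items()
        if key.startswith(OPTION_PREFIX)
    }
    # Keep only the user-visible properties, dropping the encoded entries.
    rest = {
        key: value
        for key, value in raw.properties.items()
        if key != PROVIDER_KEY and not key.startswith(OPTION_PREFIX)
    }
    return TableDef(
        name=raw.name,
        provider=raw.properties[PROVIDER_KEY],
        options=options,
        properties=rest,
    )
```

Under these assumptions, a round trip through `to_metastore` and `from_metastore` returns an equal `TableDef`, which is the property that would let the rest of the system treat both kinds of table uniformly.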

One thing that is still unclear to me is how to work with Hive UDF support. We might still need a special planner rule there.

Issue Links

is related to
- SPARK-15777 Catalog federation (Resolved)

relates to
- SPARK-24814 Relationship between catalog and datasources (Resolved)
- SPARK-17861 Store data source partitions in metastore and push partition pruning into metastore (Resolved)
- SPARK-14825 Merge functionality in Hive module into SQL core module (Resolved)

Sub-Tasks

1.	Implement ScriptTransformation in sql/core	Resolved	Unassigned
2.	Converge the insert path of Hive tables with data source tables	Resolved	Wenchen Fan
3.	Removal of Hive Built-in Hash Functions and TestHiveFunctionRegistry	Resolved	Xiao Li

People

Assignee: Unassigned
Reporter: Reynold Xin
Votes: 2
Watchers: 39

Dates

Created: 01/Jun/16 05:53
Updated: 08/Oct/19 05:42
Resolved: 08/Oct/19 05:42