[SPARK-32709] Write Hive ORC/Parquet bucketed table with hivehash (for Hive 1,2) - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: 3.1.0
Fix Version/s: 3.3.0
Component/s: SQL
Labels: None

Description

The Hive ORC/Parquet write code path is the same as the data source v1 code path (FileFormatWriter). This JIRA adds support for writing Hive ORC/Parquet bucketed tables with hivehash. The change customizes `bucketIdExpression` to use hivehash when the table is a Hive bucketed table and the Hive version is 1.x.y or 2.x.y.

This will allow us to write Hive/Presto-compatible bucketed tables for Hive 1 and 2.
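
To illustrate what a hivehash-based bucket id means in practice, here is a minimal Python sketch of the Hive-style computation as I understand it: hash each bucket column, combine the column hashes with a 31-multiplier in 32-bit arithmetic, clear the sign bit, and take the remainder by the bucket count. The function names are hypothetical and only int and string columns are covered; this is not code from the linked pull requests, and the real logic lives in Spark's Scala expressions.

```python
INT_MAX = 0x7FFFFFFF  # Java Integer.MAX_VALUE


def to_int32(x: int) -> int:
    """Wrap a Python int to a signed 32-bit value, mimicking Java int overflow."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x


def hive_hash_int(value: int) -> int:
    """Assumption: the Hive hash of an int column is the int value itself."""
    return to_int32(value)


def hive_hash_string(value: str) -> int:
    """Assumption: the Hive hash of a string is a 31-based rolling hash
    over its UTF-8 bytes, with each byte treated as signed."""
    h = 0
    for b in value.encode("utf-8"):
        signed = b - 256 if b >= 128 else b
        h = to_int32(h * 31 + signed)
    return h


def hive_hash_row(column_hashes: list[int]) -> int:
    """Combine per-column hashes as h = 31 * h + column_hash, starting at 0."""
    h = 0
    for c in column_hashes:
        h = to_int32(31 * h + c)
    return h


def hive_bucket_id(column_hashes: list[int], num_buckets: int) -> int:
    """Clear the sign bit, then take the remainder by the bucket count."""
    return (hive_hash_row(column_hashes) & INT_MAX) % num_buckets


if __name__ == "__main__":
    # A row with one int bucket column holding 7, written to a 4-bucket table.
    print(hive_bucket_id([hive_hash_int(7)], 4))
    # A row bucketed on (int, string) columns.
    print(hive_bucket_id([hive_hash_int(7), hive_hash_string("abc")], 4))
```

Spark's native bucketing instead derives the bucket id from a Murmur3 hash of the bucket columns, so the same row generally lands in a different bucket file under the two schemes. That difference is why Hive and Presto need hivehash-based bucketing to read such tables as bucketed.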

Attachments

91275701_stage6_metrics.png (395 kB), attached 09/Aug/21 23:49 by Shashank Pedamallu

Issue Links

links to

[Github] Pull Request #30003 (c21)

[Github] Pull Request #33432 (c21)

People

Assignee: Cheng Su

Reporter: Cheng Su

Votes: 0

Watchers: 7

Dates

Created: 27/Aug/20 06:03

Updated: 17/Sep/21 06:29

Resolved: 17/Sep/21 06:29