[IMPALA-10732] Use consistent DDL for specifying Iceberg partitions - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: Impala 4.1.0
Component/s: None
Labels:
- impala-iceberg

Epic Link:
Iceberg support in Impala

Description

Currently we have a DDL syntax for defining Iceberg partitions that differs from SparkSQL:
https://iceberg.apache.org/spark-ddl/#partitioned-by

E.g. Impala is using the following syntax:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
PARTITION BY SPEC (i BUCKET 5, ts MONTH, d YEAR)
STORED AS ICEBERG;

The same in Spark is:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
USING ICEBERG
PARTITIONED BY (bucket(5, i), months(ts), years(d))

Impala's syntax is older but hasn't been released yet, so it can still be changed. Spark's syntax has already been released, so it cannot be changed.

Hive is also working on DDL support for Iceberg partitions, and they are favoring the released SparkSQL syntax. See HIVE-25179.

After discussing it on dev@impala, we decided to use SparkSQL's syntax.
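
The correspondence between the two partition clauses quoted in this description can be sketched in a few lines of Python. This is only an illustration, not code from Impala: the helper name `to_spark_partitioning` and the `SPARK_NAMES` table are hypothetical, and only the three transforms that appear in the examples (BUCKET, MONTH, YEAR) are covered.

```python
# Hypothetical mapping from the older Impala transform keywords to the
# SparkSQL transform names, taken only from the two examples in this issue.
SPARK_NAMES = {"BUCKET": "bucket", "MONTH": "months", "YEAR": "years"}


def to_spark_partitioning(spec: str) -> str:
    """Rewrite 'i BUCKET 5, ts MONTH' as 'bucket(5, i), months(ts)'."""
    parts = []
    for item in spec.split(","):
        tokens = item.split()
        if len(tokens) not in (2, 3):
            raise ValueError(f"cannot parse partition field: {item!r}")
        column, transform = tokens[0], tokens[1].upper()
        if transform not in SPARK_NAMES:
            raise ValueError(f"transform not covered by this sketch: {transform}")
        name = SPARK_NAMES[transform]
        if len(tokens) == 3:
            # Parameterized transform: the parameter precedes the column.
            parts.append(f"{name}({tokens[2]}, {column})")
        else:
            parts.append(f"{name}({column})")
    return ", ".join(parts)


print(to_spark_partitioning("i BUCKET 5, ts MONTH, d YEAR"))
```

Applied to the older clause shown above, the sketch yields the argument list of the Spark PARTITIONED BY clause.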

People

Assignee: Zoltán Borók-Nagy

Reporter: Zoltán Borók-Nagy

Votes: 0

Watchers: 3

Dates

Created: 07/Jun/21 15:25

Updated: 05/Aug/21 16:11

Resolved: 05/Aug/21 16:11