[SPARK-36227] Remove TimestampNTZ type support in Spark 3.2 - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 3.3.0
Fix Version/s: 3.3.0
Component/s: SQL
Labels: None

Description

As of now, there are some blockers for delivering the TimestampNTZ project in Spark 3.2:

In the Hive Thrift server, both TimestampType and TimestampNTZType are mapped to the same timestamp type, which can confuse users.
For the Parquet data source, newly written TimestampNTZType columns will be read as TimestampType by older Spark releases. We also need to decide how schema merging should behave for files that mix TimestampType and TimestampNTZType columns.
The type coercion rules for TimestampNTZType are incomplete. For example, what should the data type of the IN clause "IN(Timestamp'2020-01-01 00:00:00', TimestampNtz'2020-01-01 00:00:00')" be?
It is tricky to support TimestampNTZType in the JSON/CSV data readers. We need to avoid regressions as far as possible.
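
To make the Parquet concern concrete, here is a minimal sketch in plain Python (not Spark code) of what can go wrong when a zone-less wall-clock value is later read as an instant. The helper names, the microseconds-since-epoch encoding, and the fixed -8 hour session offset are assumptions made for this illustration only; they are not taken from this issue or from Spark's implementation.

```python
from datetime import datetime, timedelta, timezone

# Assumption for this sketch: timestamps are stored as integer microseconds
# counted from 1970-01-01 00:00:00.
EPOCH_NTZ = datetime(1970, 1, 1)
EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)


def encode_ntz(wall_clock):
    """Hypothetical writer: store a zone-less wall-clock value as microseconds."""
    return (wall_clock - EPOCH_NTZ) // timedelta(microseconds=1)


def decode_as_ntz(micros):
    """Hypothetical NTZ-aware reader: return the same wall-clock value."""
    return EPOCH_NTZ + timedelta(microseconds=micros)


def decode_as_instant(micros, session_offset_hours):
    """Hypothetical older reader: treat the number as a UTC instant and
    render it in the session time zone (modeled as a fixed UTC offset)."""
    session_tz = timezone(timedelta(hours=session_offset_hours))
    instant = EPOCH_UTC + timedelta(microseconds=micros)
    return instant.astimezone(session_tz).replace(tzinfo=None)


written = datetime(2020, 1, 1, 0, 0, 0)
micros = encode_ntz(written)

ntz_view = decode_as_ntz(micros)              # same wall-clock value as written
instant_view = decode_as_instant(micros, -8)  # shifted by the session offset

print(ntz_view)
print(instant_view)
```

Under these assumptions, the NTZ-aware reader returns the value that was written, while the instant-based reader shows a value shifted by the session offset. This is the kind of silent discrepancy the Parquet item is worried about.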

There are 10 days left before the expected 3.2 RC date, so I propose releasing the TimestampNTZ type in Spark 3.3 instead of Spark 3.2. That gives us enough time to design careful solutions for these issues.

Issue Links

links to

[Github] Pull Request #33444 (gengliangwang)

[Github] Pull Request #33837 (gengliangwang)

[Github] Pull Request #33851 (gengliangwang)

People

Assignee: Gengliang Wang

Reporter: Gengliang Wang

Votes: 0

Watchers: 3

Dates

Created: 20/Jul/21 16:20

Updated: 26/Aug/21 13:43

Resolved: 21/Jul/21 16:55