[HIVE-22062] WriteId is not updated for a partitioned ACID table when schema changes - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: None
Labels:
- ACID

Description

Changing the schema (e.g. adding a new column) of a non-partitioned ACID table results in the table-level writeId being incremented. This is as expected.

However, if you do the same on a partitioned ACID table, neither the table-level nor the partition-level writeId is updated. In this case I would expect the table-level writeId to be incremented to reflect that the table has changed.
Note that get_valid_write_ids() shows the high watermark being incremented even though the writeId is not.

Update: I'd like to extend the scope of this Jira a bit further. There are a number of use cases in Hive that don't result in a writeId change on ACID tables, and as a result other systems (like Impala) have no way to judge whether a refresh should be run on a table. The only option is to reload all the data for the table every time, which is expensive. For example, in addition to the use case above, compaction is not noticeable from outside Hive.
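To illustrate why a missing writeId bump matters to an external reader, here is a minimal toy model in Python. It does not use any Hive or Impala API: `ToyTable`, `ToyCache`, and the `bump_write_id` flag are hypothetical names invented for this sketch, and the cache policy (refresh only when the table-level writeId moves) is an assumption about how such a consumer might behave.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyTable:
    """Hypothetical stand-in for an ACID table's externally visible metadata."""
    columns: List[str]
    write_id: int = 0  # table-level writeId as an external reader would see it

    def add_column(self, name: str, bump_write_id: bool) -> None:
        # bump_write_id=True models the non-partitioned case described above;
        # bump_write_id=False models the partitioned case reported in this issue.
        self.columns.append(name)
        if bump_write_id:
            self.write_id += 1


@dataclass
class ToyCache:
    """Hypothetical external cache that refreshes only when the writeId changes."""
    seen_write_id: int = -1
    columns: List[str] = field(default_factory=list)

    def refresh_if_needed(self, table: ToyTable) -> bool:
        # If the writeId has not moved, the cache assumes nothing changed.
        if table.write_id == self.seen_write_id:
            return False
        self.seen_write_id = table.write_id
        self.columns = list(table.columns)
        return True


if __name__ == "__main__":
    table = ToyTable(columns=["id"])
    cache = ToyCache()
    cache.refresh_if_needed(table)                   # initial load

    table.add_column("a", bump_write_id=True)        # schema change with a writeId bump
    print(cache.refresh_if_needed(table), cache.columns)

    table.add_column("b", bump_write_id=False)       # schema change without a bump
    print(cache.refresh_if_needed(table), cache.columns)
```

In this sketch the second schema change goes unnoticed: the cache keeps the old column list because the writeId it compares against never moved, which is the situation the description attributes to partitioned tables.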

Issue Links

Blocked

IMPALA-8809 Refresh a subset of partitions for ACID tables

Open

relates to

HIVE-22565 Make calling alter_table unnecessary during inserts into ACID tables

Open

People

Assignee: Laszlo Kovari

Reporter: Gabor Kaszab

Votes: 0

Watchers: 4

Dates

Created: 30/Jul/19 14:22

Updated: 03/Aug/20 18:06