[IMPALA-9484] Milestone 1: properly scan files that have full ACID schema - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: Impala 4.0.0
Component/s: None
Labels:
- impala-acid

Description

Full ACID row format looks like this:

{ "operation": 0, "originalTransaction": 1, "bucket": 536870912, "rowId": 0, "currentTransaction": 1, "row": {"i": 1} }

User columns are nested under "row". The frontend should create proper tuples and slot descriptors for the scan nodes to read the files correctly.
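As an illustration of the layout described above, the following minimal Python sketch separates the ACID metadata fields from the user columns nested under "row". It only models the record shape shown in this issue; the helper name `split_acid_row` and the `ACID_FIELDS` tuple are hypothetical and are not part of Impala's frontend code.

```python
import json

# Top-level ACID metadata fields, taken from the example record in this issue.
ACID_FIELDS = ("operation", "originalTransaction", "bucket",
               "rowId", "currentTransaction")


def split_acid_row(record):
    """Return (acid_metadata, user_columns) for one full-ACID record.

    Hypothetical helper for illustration only: user columns are expected
    to be nested under the "row" key.
    """
    meta = {name: record[name] for name in ACID_FIELDS}
    user = dict(record["row"])
    return meta, user


sample = json.loads(
    '{"operation": 0, "originalTransaction": 1, "bucket": 536870912, '
    '"rowId": 0, "currentTransaction": 1, "row": {"i": 1}}'
)
acid_meta, user_columns = split_acid_row(sample)
```

A scanner that exposes only `user_columns` by default, while keeping `acid_meta` available on request, would match the debugging and testing use case mentioned in this description.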

We should be able to query the ACID columns, at least for debugging/testing. Hive uses the special “row__id” identifier for that.

Impala should raise an error if there are delete deltas. Directory filtering should exclude minor-compacted directories, since the records in those need validation.
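The directory handling described above could be sketched roughly as follows. This is not Impala's implementation: it assumes Hive-style directory names (`base_N`, `delta_X_Y`, `delete_delta_X_Y`) and assumes that a delta directory whose minimum and maximum write IDs differ is the product of a minor compaction. The function name `filter_acid_dirs` is hypothetical.

```python
import re

# Assumed naming: delta_<minWriteId>_<maxWriteId>, optionally followed by a suffix.
DELTA_RE = re.compile(r"^delta_(\d+)_(\d+)(?:_.*)?$")


def filter_acid_dirs(dir_names):
    """Return the directories that are safe to scan without row validation.

    Raises ValueError if a delete delta is present, and skips delta
    directories that span more than one write ID (assumed here to be
    minor-compacted).
    """
    kept = []
    for name in dir_names:
        if name.startswith("delete_delta_"):
            raise ValueError(f"delete deltas are not supported: {name}")
        match = DELTA_RE.match(name)
        if match and int(match.group(1)) != int(match.group(2)):
            # Spans several write IDs: its records would need validation.
            continue
        kept.append(name)
    return kept
```

For example, given `base_0000005`, `delta_0000006_0000006`, and `delta_0000007_0000009`, the sketch keeps the first two and drops the last one.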

Non-goals in this sub-task:

- row validation against validWriteIdList
- reading "original files" (files in non-ACID format)
- reading delete deltas

People

Assignee: Zoltán Borók-Nagy

Reporter: Zoltán Borók-Nagy

Votes: 0

Watchers: 3

Dates

Created: 10/Mar/20 18:46

Updated: 05/May/20 07:31

Resolved: 02/Apr/20 13:42