[HUDI-6788] Integrate FileGroupReader with MergeOnReadInputFormat for Flink - ASF JIRA

Details

Type: New Feature
Status: In Progress
Priority: Blocker
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: 1.1.0
Component/s: None
Labels:
None

Story Points:
5
Epic Link:
1.X Api & Abstractions

Description

The existing MergeOnReadInputFormat implements a separate iterator for each read mode: incremental read, read optimized view, snapshot view, etc. For better performance and easier code evolution, we can integrate the new FileGroupReader. The main difference is that the FileGroupReader encapsulates the logic for merging a file slice's log files with its parquet base file, so each iterator no longer has to repeat the work of querying the file system view and composing the file slices.
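To illustrate the encapsulation described above, here is a minimal sketch in plain Java. None of these names are Hudi's actual API: `FileSliceSketch`, `FileGroupReaderSketch`, and the string-keyed records are hypothetical stand-ins. The merge rule used here (a later log record replaces the base record with the same key, and a null value means delete) is a simplifying assumption, not the real merging semantics.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a file slice: one base file plus its ordered log records.
final class FileSliceSketch {
    final Map<String, String> baseRecords;            // key -> value read from the base (parquet) file
    final List<Map.Entry<String, String>> logRecords; // ordered updates; a null value marks a delete

    FileSliceSketch(Map<String, String> baseRecords, List<Map.Entry<String, String>> logRecords) {
        this.baseRecords = baseRecords;
        this.logRecords = logRecords;
    }
}

// Hypothetical reader that owns the base + log merging, so callers only
// consume an iterator of already-merged records.
final class FileGroupReaderSketch {
    static Iterator<Map.Entry<String, String>> read(FileSliceSketch slice) {
        // Start from the base file records, keeping their order.
        Map<String, String> merged = new LinkedHashMap<>(slice.baseRecords);
        // Apply log records in order (assumed rule: last write wins, null deletes).
        for (Map.Entry<String, String> update : slice.logRecords) {
            if (update.getValue() == null) {
                merged.remove(update.getKey());
            } else {
                merged.put(update.getKey(), update.getValue());
            }
        }
        return merged.entrySet().iterator();
    }
}
```

With this shape, an iterator in the input format would only hand a file slice to the reader and forward the results, rather than carrying its own merging code.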

We can integrate step by step, one read view at a time: 1. snapshot queries; 2. read optimized queries; 3. skip merge queries.

For usability and smooth evolution, we should add a flag to enable the new reader; the old code path should be kept for 1 or 2 releases.
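A flag-guarded rollout of this kind could look roughly like the sketch below. The option key `hypothetical.read.use-file-group-reader` and the default of `false` are assumptions made for illustration; the issue does not name the actual flag or its default, and a real Flink integration would more likely declare it as a Flink config option than read it from a plain map.

```java
import java.util.Map;

final class ReaderSelectionSketch {
    // Hypothetical option key; the issue does not specify the real name.
    static final String USE_FILE_GROUP_READER_KEY = "hypothetical.read.use-file-group-reader";

    enum ReaderPath { LEGACY_ITERATORS, FILE_GROUP_READER }

    // Picks the code path from the configuration. The legacy path is the
    // assumed default, so existing jobs keep their behavior unless they opt in.
    static ReaderPath choose(Map<String, String> conf) {
        boolean enabled = Boolean.parseBoolean(
            conf.getOrDefault(USE_FILE_GROUP_READER_KEY, "false"));
        return enabled ? ReaderPath.FILE_GROUP_READER : ReaderPath.LEGACY_ITERATORS;
    }
}
```

Defaulting to the legacy path keeps the old behavior in place during the 1 or 2 releases in which both paths coexist.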

The major action items (AIs) include:

1. implement the HoodieFlinkRecord, akin to the HoodieSparkRecord;
2. implement the Flink specific FileGroupReader with the HoodieFlinkRecord;
3. Flink implements the snapshot queries using the file group reader;
4. Flink implements the read optimized queries using the file group reader;
5. Flink implements the skip merge queries using the file group reader.
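As a rough illustration of how the three query types in items 3 to 5 differ, the sketch below reads the same base and log records under each view. It assumes the usual meaning of these views (snapshot merges log records into base records by key, read optimized reads base files only, skip merge emits base and log records without merging). `QueryViewSketch` and its record layout (`{key, value}` string pairs) are hypothetical and not part of Hudi.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class QueryViewSketch {
    enum View { SNAPSHOT, READ_OPTIMIZED, SKIP_MERGE }

    // Each record is a {key, value} pair; output entries are rendered as "key=value".
    static List<String> read(View view, List<String[]> base, List<String[]> log) {
        List<String> out = new ArrayList<>();
        switch (view) {
            case READ_OPTIMIZED: {
                // Base file only; log records are ignored.
                for (String[] r : base) out.add(r[0] + "=" + r[1]);
                break;
            }
            case SKIP_MERGE: {
                // Base records followed by log records, with no de-duplication by key.
                for (String[] r : base) out.add(r[0] + "=" + r[1]);
                for (String[] r : log) out.add(r[0] + "=" + r[1]);
                break;
            }
            case SNAPSHOT: {
                // Log records override base records that share the same key.
                Map<String, String> merged = new LinkedHashMap<>();
                for (String[] r : base) merged.put(r[0], r[1]);
                for (String[] r : log) merged.put(r[0], r[1]);
                merged.forEach((k, v) -> out.add(k + "=" + v));
                break;
            }
        }
        return out;
    }
}
```

Only the snapshot branch needs merging logic, which is one reason to integrate it first and reuse the same reader for the other two views.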

Sub-Tasks

1.	Implement the HoodieFlinkRecord akin to the HoodieSparkRecord	Open	Zhenqiu Huang
2.	Implement the Flink specific FileGroupReader with the HoodieFlinkRecord	Open	Zhenqiu Huang
3.	Flink implements the snapshot queries using the file group reader	Open	Zhenqiu Huang
4.	Flink implements the read optimized queries using the file group reader	Open	Zhenqiu Huang
5.	Flink implements the skip merge queries using the file group reader	Open	Zhenqiu Huang

People

Assignee: Zhenqiu Huang

Reporter: Ethan Guo (this is the old account; please use "yihua")

Votes: 0

Watchers: 1

Dates

Created: 29/Aug/23 23:14

Updated: 06/Sep/24 17:40