[FLINK-15206] support dynamic catalog table for truly unified SQL job - ASF JIRA

Details

Type: New Feature
Status: Open
Priority: Not a Priority
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: Table SQL / API
Labels:

Description

Currently, if users have both an online and an offline job with the same business logic in Flink SQL, their codebase is still not unified. They have to keep two SQL statements whose only difference is the source and/or sink table (with different parameters). E.g.

// online job
insert into x select * from kafka_table (starting time) ...;

// offline backfill job
insert into x select * from hive_table  (starting and ending time) ...;

We can introduce a "dynamic catalog table". A dynamic catalog table acts like a view: it is an abstract table backed by multiple actual tables, and configuration flags select which one is used. When a job is executed, the dynamic catalog table points to an actual source table depending on the configuration.

One use case is the example given above: when executed in streaming mode, my_source_dynamic_table should point to a Kafka catalog table with a new starting position, and in batch mode it should point to a Hive catalog table with starting and ending positions.
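
To make the switching behavior concrete, here is a minimal sketch in plain Python. It does not use Flink's catalog API; `DynamicCatalogTable`, `resolve`, `plan_insert`, and the mode names "streaming" and "batch" are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DynamicCatalogTable:
    """Hypothetical abstract table that maps an execution mode to an actual table."""

    name: str
    # mode -> name of the actual catalog table (assumed layout for this sketch)
    targets: Dict[str, str] = field(default_factory=dict)

    def resolve(self, mode: str) -> str:
        """Return the actual table for the given mode, or raise if none is configured."""
        try:
            return self.targets[mode]
        except KeyError:
            raise ValueError(
                f"{self.name}: no actual table configured for mode {mode!r}"
            ) from None


def plan_insert(sink: str, source: DynamicCatalogTable, mode: str) -> str:
    """Render the single shared statement against whichever table the mode selects."""
    return f"insert into {sink} select * from {source.resolve(mode)}"


my_source = DynamicCatalogTable(
    name="my_source_dynamic_table",
    targets={"streaming": "kafka_table", "batch": "hive_table"},
)

print(plan_insert("x", my_source, "streaming"))
print(plan_insert("x", my_source, "batch"))
```

The point of the sketch is that the statement text is written once against my_source_dynamic_table, and only the resolution step differs between the online and the backfill job.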

One thing to note is that the starting position of kafka_table and the starting/ending positions of hive_table are different on every run. How to accommodate that needs more thought.
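
One possible direction, shown purely as an illustration and not as a proposed design, is to pass the positions as per-run parameters and validate them against the execution mode. The function `position_options` and the option keys "start" and "end" are hypothetical placeholders, not real connector options.

```python
from datetime import datetime, timezone
from typing import Dict, Optional


def position_options(
    mode: str, start: datetime, end: Optional[datetime] = None
) -> Dict[str, str]:
    """Build per-run position options for the actual table behind a dynamic table.

    The keys "start" and "end" are hypothetical, not real connector options.
    """
    if mode == "streaming":
        # An online job reads from a starting position and never ends.
        if end is not None:
            raise ValueError("streaming runs take only a starting position")
        return {"start": start.isoformat()}
    if mode == "batch":
        # A backfill job reads a bounded range.
        if end is None:
            raise ValueError("batch runs need both a starting and an ending position")
        if end <= start:
            raise ValueError("ending position must be after the starting position")
        return {"start": start.isoformat(), "end": end.isoformat()}
    raise ValueError(f"unknown mode: {mode!r}")


backfill = position_options(
    "batch",
    start=datetime(2019, 12, 1, tzinfo=timezone.utc),
    end=datetime(2019, 12, 2, tzinfo=timezone.utc),
)
print(backfill)
```

Keeping the positions outside the SQL text means the statement itself stays identical across runs; only the parameters supplied at submission time change.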

People

Assignee: Unassigned
Reporter: Bowen Li
Votes: 0
Watchers: 4

Dates

Created: 12/Dec/19 00:47
Updated: 19/Dec/21 22:38