Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Done
- None
- None
Description
Currently the Table API uses a static object called TranslationContext, which holds the Calcite table catalog and a Calcite planner instance. Whenever a DataSet or DataStream is converted into a Table or registered as a Table on the TableEnvironment, a new entry is added to the catalog. The first time a Table is added, a planner instance is created. The planner is used to optimize the query (defined by one or more Table API operations and/or one or more SQL queries) when a Table is converted into a DataSet or DataStream. Since a planner may only be used to optimize a single program, sharing one static planner across all tables and programs is problematic.
I propose to refactor the TableEnvironment to take over the responsibility of holding the catalog and the planner instance.
- A TableEnvironment holds a catalog of registered tables and a single planner instance.
- A TableEnvironment will only allow a single Table (possibly composed of several Table API operations and SQL queries) to be translated into a DataSet or DataStream.
- A TableEnvironment is bound to an ExecutionEnvironment or a StreamExecutionEnvironment. This is necessary to create data sources or source functions that read external tables or streams.
- DataSet and DataStream need a reference to a TableEnvironment to be converted into a Table. This will prohibit implicit casts as currently supported for the DataSet Scala API.
- A Table needs a reference to the TableEnvironment it is bound to. Only tables from the same TableEnvironment can be processed together.
- The TranslationContext will be completely removed.
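The ownership rules listed above can be illustrated with a small, self-contained sketch. It is not the actual Flink or Calcite code: `SketchPlanner`, `SketchTable`, and `SketchTableEnvironment` are hypothetical stand-ins, query plans are modeled as plain strings, and the binding to an ExecutionEnvironment is left out. It only shows where the catalog and the planner would live and which checks follow from that.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a planner; not the Calcite planner API.
final class SketchPlanner {
    private boolean used = false;

    // Models the constraint that a planner may only optimize a single program.
    String optimize(String plan) {
        if (used) {
            throw new IllegalStateException("planner was already used for a translation");
        }
        used = true;
        return "optimized(" + plan + ")";
    }
}

// Hypothetical stand-in for a Table: a plan plus the environment it is bound to.
final class SketchTable {
    final SketchTableEnvironment env;
    final String plan;

    SketchTable(SketchTableEnvironment env, String plan) {
        this.env = env;
        this.plan = plan;
    }

    // Only tables from the same environment can be processed together.
    SketchTable join(SketchTable other) {
        if (other.env != this.env) {
            throw new IllegalArgumentException("tables belong to different environments");
        }
        return new SketchTable(env, "join(" + plan + ", " + other.plan + ")");
    }
}

// Hypothetical stand-in for the refactored TableEnvironment: it owns the
// catalog and the single planner instance instead of a static object.
final class SketchTableEnvironment {
    private final Map<String, SketchTable> catalog = new HashMap<>();
    private final SketchPlanner planner = new SketchPlanner();

    // Registering a table adds an entry to this environment's own catalog.
    SketchTable registerTable(String name) {
        if (catalog.containsKey(name)) {
            throw new IllegalArgumentException("table already registered: " + name);
        }
        SketchTable table = new SketchTable(this, "scan(" + name + ")");
        catalog.put(name, table);
        return table;
    }

    // Translates a single table; a second call fails inside the planner.
    String translate(SketchTable table) {
        if (table.env != this) {
            throw new IllegalArgumentException("table is bound to another environment");
        }
        return planner.optimize(table.plan);
    }
}
```

Because every table carries its environment, mixing tables from two environments fails at the point of the join rather than during optimization, and two environments never share planner state.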
Attachments
Issue Links
- is related to FLINK-3754 Add a validation phase before construct RelNode using TableAPI (Closed)