Flink / FLINK-16987

FLIP-95: Add new table source and sink interfaces

Details

    Description

      Proper support for handling changelogs, more efficient processing of data through the new Blink planner, and unified interfaces that are DataStream API agnostic make it necessary to rework the table source and sink interfaces.

      The goals of this FLIP are:

      • Simplify the current interface architecture:
        • Merge upsert, retract, and append sinks.
        • Unify batch and streaming sources.
        • Unify batch and streaming sinks.
      • Allow sources to produce a changelog:
        • Users have frequently requested UpsertTableSources. Now is the time to expose the internal planner capabilities through the new interfaces.
        • According to FLIP-105, we would like to support changelogs for processing formats such as Debezium.
      • Don't rely on the DataStream API for sources and sinks:
        • According to FLIP-32, the Table API and SQL should be independent of the DataStream API which is why the `table-common` module has no dependencies on `flink-streaming-java`.
        • Source and sink implementations should only depend on the `table-common` module after FLIP-27.
        • Until FLIP-27 is ready, we still put most of the interfaces in `table-common` and strictly separate interfaces that communicate with a planner and actual runtime reader/writers.
      • Implement efficient sources and sinks without planner dependencies:
        • Make Blink's internal data structures available to connectors.
        • Introduce stable interfaces for data structures that can be marked as `@PublicEvolving`.
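
The goal of merging upsert, retract, and append sinks can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: the `RowKind` enum below borrows its name from the sub-task titles, but `Change`, `APPEND`, `RETRACT`, `UPSERT`, and `materialize` are hypothetical names invented for this illustration and are not Flink's actual interfaces. The idea shown is that a single sink can declare which kinds of changes it accepts, instead of there being one sink interface per mode.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative model only; none of these types are Flink classes.
class ChangelogSketch {

    // Change flags attached to every row of a changelog.
    enum RowKind { INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE }

    // One changelog entry: a change flag plus a keyed payload.
    static final class Change {
        final RowKind kind;
        final String key;
        final long value;

        Change(RowKind kind, String key, long value) {
            this.kind = kind;
            this.key = key;
            this.value = value;
        }
    }

    // Hypothetical "modes": each is just the set of change flags a sink accepts.
    static final Set<RowKind> APPEND = EnumSet.of(RowKind.INSERT);
    static final Set<RowKind> RETRACT = EnumSet.allOf(RowKind.class);
    static final Set<RowKind> UPSERT =
            EnumSet.of(RowKind.INSERT, RowKind.UPDATE_AFTER, RowKind.DELETE);

    // A single "sink" that materializes any changelog into a keyed table,
    // rejecting change flags that are outside the declared mode.
    static Map<String, Long> materialize(List<Change> changelog, Set<RowKind> accepted) {
        Map<String, Long> table = new LinkedHashMap<>();
        for (Change c : changelog) {
            if (!accepted.contains(c.kind)) {
                throw new IllegalArgumentException("Unsupported change: " + c.kind);
            }
            switch (c.kind) {
                case INSERT:
                case UPDATE_AFTER:
                    table.put(c.key, c.value); // new or updated row
                    break;
                case UPDATE_BEFORE:
                case DELETE:
                    table.remove(c.key); // retract the previous row
                    break;
            }
        }
        return table;
    }

    public static void main(String[] args) {
        // The same logical history, once encoded with retractions and once as upserts.
        List<Change> retract = Arrays.asList(
                new Change(RowKind.INSERT, "a", 1),
                new Change(RowKind.UPDATE_BEFORE, "a", 1),
                new Change(RowKind.UPDATE_AFTER, "a", 2),
                new Change(RowKind.INSERT, "b", 5),
                new Change(RowKind.DELETE, "b", 5));
        List<Change> upsert = Arrays.asList(
                new Change(RowKind.INSERT, "a", 1),
                new Change(RowKind.UPDATE_AFTER, "a", 2),
                new Change(RowKind.INSERT, "b", 5),
                new Change(RowKind.DELETE, "b", 5));

        System.out.println(materialize(retract, RETRACT));
        System.out.println(materialize(upsert, UPSERT));
    }
}
```

In this model both encodings materialize to the same table, and an append-only mode would reject the update and delete entries. Whether the real interfaces express modes this way is left to the FLIP itself.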

        Issue Links

          #. Summary | Type | Status | Assignee
          1. Add new data structure interfaces in table-common | Sub-task | Closed | Jark Wu
          2. Add core table source/sink interfaces | Sub-task | Closed | Timo Walther
          3. Add ability interfaces for table source/sink | Sub-task | Closed | Timo Walther
          4. Add new factory interfaces and utilities | Sub-task | Closed | Timo Walther
          5. Refactor planner and connectors to use new data structures | Sub-task | Closed | Jark Wu
          6. Support ScanTableSource in planner | Sub-task | Closed | Jark Wu
          7. Support LookupTableSource in planner | Sub-task | Closed | Jark Wu
          8. Support DynamicTableSink in planner | Sub-task | Closed | Jark Wu
          9. Fix shortcomings in new data structures | Sub-task | Closed | Timo Walther
          10. Add a changeflag to Row type | Sub-task | Closed | Timo Walther
          11. Data structure should cover all conversions declared in logical types | Sub-task | Closed | Timo Walther
          12. Use new data structure converters when legacy types are not present | Sub-task | Closed | Timo Walther
          13. Type information in sources should cover all data structures | Sub-task | Closed | Timo Walther
          14. Refactor retraction rules to support inferring ChangelogMode | Sub-task | Closed | Jark Wu
          15. Refactor BaseRow to use RowKind instead of byte header | Sub-task | Closed | Jark Wu
          16. Support SupportsProjectionPushDown in planner | Sub-task | Closed | godfrey he
          17. Support SupportsFilterPushDown in planner | Sub-task | Closed | Jacky Lau
          18. Support SupportsLimitPushDown in planner | Sub-task | Closed | Jacky Lau
          19. Support SupportsPartitionPushDown in planner | Sub-task | Closed | Shengkai Fang
          20. Support SupportsOverwrite in planner | Sub-task | Closed | Jark Wu
          21. Support SupportsPartitioning in planner | Sub-task | Closed | Jark Wu
          22. Support SupportsComputedColumnPushDown in planner | Sub-task | Closed | Unassigned
          23. Support SupportsWatermarkPushDown in planner | Sub-task | Closed | Unassigned
          24. SinkFormat#createSinkFormat should use DynamicTableSink.Context as the first parameter | Sub-task | Closed | Danny Chen
          25. Improve new connector options exception in old planner | Sub-task | Closed | Unassigned
          26. Add createTypeInformation to DynamicTableSink#Context | Sub-task | Closed | Jark Wu
          27. Remove deprecated RowData.get and similar methods | Sub-task | Closed | Unassigned
          28. Improve interface of ScanFormatFactory and SinkFormatFactory | Sub-task | Closed | Jark Wu
          29. Add a table source example | Sub-task | Closed | Timo Walther
          30. Add documentation for how to develop a new table source/sink | Sub-task | Closed | Timo Walther
          31. Update the Row.toString method | Sub-task | Closed | Timo Walther
          32. Support to integrate FLIP-27 source interface with Table API | Sub-task | Closed | Unassigned
          33. Support to bridge Transformation (DataStream) with FLIP-95 interface? | Sub-task | Closed | Unassigned
          34. Support the SupportsProjectionPushDown interface for LookupTableSource | Sub-task | Closed | Unassigned
          35. Support the SupportsFilterPushDown interface for LookupTableSource | Sub-task | Open | Yuan Zhu
          36. Deprecate old source and sink interfaces | Sub-task | Closed | Timo Walther
          37. Support nested push down for SupportsProjectionPushDown interface in planner | Sub-task | Closed | Shengkai Fang
          38. Support the filter push down for the Jdbc connector | Sub-task | Closed | zhuyufeng
          39. Support the limit push down for the Jdbc connector | Sub-task | Closed | Shengkai Fang
          40. Support the limit push down for the hbase connector | Sub-task | Open | Unassigned
          41. Support the nested projection push down for the hbase connector | Sub-task | Open | Unassigned
          42. Refactor and merge SupportsComputedColumnPushDown and SupportsWatermarkPushDown interfaces | Sub-task | Closed | Jark Wu
          43. Support watermark push down with WatermarkStrategy | Sub-task | Closed | Shengkai Fang
          44. Supports nested projection pushdown for filesystem connector of columnar format | Sub-task | Open | Unassigned
          45. Support Watermark push down for kafka connector | Sub-task | Closed | Shengkai Fang
          46. Remove deprecated CatalogTable.getProperties | Sub-task | Closed | Timo Walther
          47. predicates can not be pushed into TableScan when there is a WatermarkAssigner between Filter and Scan | Sub-task | Closed | Yuval Itzchakov
          48. Add a SupportsSourceWatermark ability interface | Sub-task | Closed | Timo Walther
          49. Allow prefix syntax for ConfigOption.mapType | Sub-task | Closed | Timo Walther
          50. Consider ConfigOption fallback keys in FactoryUtil | Sub-task | Closed | Timo Walther
          51. Validate partition columns for ResolvedCatalogTable | Sub-task | Closed | Timo Walther
          52. Non-equality predicates on partition columns lead to incorrect plans | Sub-task | Closed | Unassigned
          53. Push down partitions before filters | Sub-task | Closed | xuyang
          54. Clarify semantics of filter, projection, partition, and metadata pushdown | Sub-task | Open | Timo Walther
          55. Add producedDataType to SupportsProjectionPushDown.applyProjection | Sub-task | Closed | Francesco Guardiani
          56. DynamicTableFactory.Context.getCatalogTable().getPartitionKeys() should return indexes | Sub-task | Open | Unassigned
          57. Metadata keys should not conflict with physical columns | Sub-task | Closed | Timo Walther

            People

              Assignee: Unassigned
              Reporter: Timo Walther (twalthr)
              Votes: 1
              Watchers: 29

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 20m