Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
Many connectors and formats require supporting external data types. Postgres users request UUID support, Avro users require enum support, etc.
FLINK-19869 added support for Postgres UUIDs, but the implementation is unsatisfactory and even degrades performance for regular strings.
The long-term solution should be user-defined types in Flink. However, this is a larger effort that requires a FLIP and considerably more resources.
As a mid-term solution, we should offer a consistent approach based on DDL options that allows users to define a mapping from the Flink type system to the external type system. I suggest the following:
CREATE TABLE MyTable (
  ...
) WITH (
  'mapping.data-types' = '<Flink field name>: <External field data type>'
)
For each listed field, the option maps the field's Flink data type to an external data type. The external data type must be parsable from a string; this works for most connectors and formats (e.g. an Avro schema string).
Examples:
CREATE TABLE MyTable (
  regular_col STRING,
  uuid_col STRING,
  point_col ARRAY<DOUBLE>,
  box_col ARRAY<ARRAY<DOUBLE>>
) WITH (
  'mapping.data-types' = 'uuid_col: uuid, point_col: point, box_col: box'
)
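To make the proposed option format concrete, here is a minimal sketch of how a connector might parse the value of 'mapping.data-types' from the example above. It is not part of Flink; the function name parse_mapping_option is hypothetical, and the sketch assumes that entries are separated by commas and that the field name is separated from the type by the first colon. External types that themselves contain commas (for example a full Avro schema string) would need a more careful grammar than this.

```python
from typing import Dict


def parse_mapping_option(value: str) -> Dict[str, str]:
    """Parse '<field>: <type>, <field>: <type>' into {field: type}.

    Hypothetical helper; assumes ',' separates entries and the first ':'
    separates the Flink field name from the external data type.
    """
    mapping: Dict[str, str] = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty input and trailing commas
        field, sep, external_type = entry.partition(":")
        field, external_type = field.strip(), external_type.strip()
        if not sep or not field or not external_type:
            raise ValueError(f"Malformed mapping entry: {entry!r}")
        if field in mapping:
            raise ValueError(f"Duplicate mapping for field: {field!r}")
        mapping[field] = external_type
    return mapping


if __name__ == "__main__":
    print(parse_mapping_option("uuid_col: uuid, point_col: point, box_col: box"))
```

Fields that are not listed, such as regular_col, simply do not appear in the result, so a connector could keep its default handling for them.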
We provide a table of supported mapping data types. E.g. the point type is always mapped to ARRAY<DOUBLE>. In general, we choose the Flink data type that comes closest to the required functionality.
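A table of supported mapping types would also allow validating a DDL statement up front. The following sketch shows one way this could look. The table contents are assumptions taken from the example in this issue (uuid as STRING, point as ARRAY<DOUBLE>, box as ARRAY<ARRAY<DOUBLE>>), and the names REQUIRED_FLINK_TYPES and check_mapping are hypothetical, not an existing Flink API.

```python
from typing import Dict, List

# Assumed table: external type -> Flink type the column must be declared with.
# Entries are taken from the example in this issue, not from a released spec.
REQUIRED_FLINK_TYPES: Dict[str, str] = {
    "uuid": "STRING",
    "point": "ARRAY<DOUBLE>",
    "box": "ARRAY<ARRAY<DOUBLE>>",
}


def _normalize(flink_type: str) -> str:
    """Drop whitespace and upper-case so 'array< double >' equals 'ARRAY<DOUBLE>'."""
    return "".join(flink_type.split()).upper()


def check_mapping(columns: Dict[str, str], mapping: Dict[str, str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid.

    columns: Flink column name -> declared Flink type (as written in the DDL)
    mapping: Flink column name -> external type (from 'mapping.data-types')
    """
    errors: List[str] = []
    for field, external_type in mapping.items():
        declared = columns.get(field)
        required = REQUIRED_FLINK_TYPES.get(external_type)
        if declared is None:
            errors.append(f"unknown column: {field}")
        elif required is None:
            errors.append(f"unsupported external type: {external_type}")
        elif _normalize(declared) != required:
            errors.append(
                f"column {field} must be declared as {required} "
                f"to map to {external_type}, found {declared}"
            )
    return errors
```

Returning all problems at once, rather than failing on the first, is a design choice made here so that a user can fix the whole DDL in one pass.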
Future work:
In theory, we can also offer a mapping of field names, since a Flink column name might need to differ from the name used in the external system.
CREATE TABLE MyTable (
  ...
) WITH (
  'mapping.names' = '<Flink field name>: <External field name>'
)
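As a rough illustration of the name mapping idea, a sink could rename fields just before writing a record. This is only a sketch under the assumption that the option has already been parsed into a dictionary; to_external_row is a hypothetical helper, and records are modeled as plain dictionaries rather than Flink rows.

```python
from typing import Any, Dict


def to_external_row(row: Dict[str, Any], name_mapping: Dict[str, str]) -> Dict[str, Any]:
    """Rename Flink field names to external field names.

    Fields without an entry in name_mapping keep their Flink name.
    """
    return {name_mapping.get(name, name): value for name, value in row.items()}
```

For example, with a mapping of uuid_col to id, the record {"uuid_col": "a", "regular_col": "b"} would be written as {"id": "a", "regular_col": "b"}.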
Issue Links
- is related to FLINK-30092 Improve Table API experience for Flink DOUBLE type (Open)