FLINK-24776

Clarify semantics of DecodingFormat and its data type

Details

      The DecodingFormat interface was used for both projectable and non-projectable formats, which led to inconsistent implementations. The FileSystemTableSource has been updated to distinguish between DecodingFormat and ProjectableDecodingFormat. Users who implement custom formats for FileSystemTableSource might need to verify their implementation and make sure to implement ProjectableDecodingFormat if necessary.

    Description

      Today the org.apache.flink.table.connector.format.DecodingFormat interface has no clear requirements, which is confusing for implementers. In particular, it's unclear whether a format needs to support projection push-down, and whether the DataType provided to createRuntimeDecoder is already projected and whether it includes partition keys. An example of such a misunderstanding is shown here: https://github.com/apache/flink/blob/991dd0466ff28995a22ded0727ef2a1706d9bddc/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/filesystem/FileSystemTableSource.java#L107

      The PR https://github.com/apache/flink/pull/17544 partially addresses the issue, because it removes the need for BulkFormat implementations to handle partition keys. Nevertheless, it's still unclear whether formats support projections and, if so, whether they support nested projections.
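
To make the distinction between top-level and nested projections concrete, here is a minimal, self-contained sketch. It assumes a projection is expressed as an array of index paths (one int[] per output field) and models rows as plain Object[] arrays. The class NestedProjectionSketch and its methods are hypothetical illustrations, not Flink classes.

```java
import java.util.Arrays;

public class NestedProjectionSketch {

    /** Follows one index path into a (possibly nested) row. */
    static Object extract(Object[] row, int[] path) {
        Object current = row;
        for (int index : path) {
            // Every step but the last is assumed to land on a nested row.
            current = ((Object[]) current)[index];
        }
        return current;
    }

    /** Builds the projected row, one output field per index path. */
    static Object[] project(Object[] row, int[][] paths) {
        Object[] out = new Object[paths.length];
        for (int i = 0; i < paths.length; i++) {
            out[i] = extract(row, paths[i]);
        }
        return out;
    }

    /** A projection is nested as soon as one path goes deeper than the top level. */
    static boolean isNested(int[][] paths) {
        return Arrays.stream(paths).anyMatch(path -> path.length > 1);
    }

    public static void main(String[] args) {
        // Hypothetical row layout: (id, (name, age), tag)
        Object[] row = {1L, new Object[] {"Ada", 36}, "x"};
        int[][] paths = {{2}, {1, 0}}; // select tag and the nested name
        System.out.println(Arrays.toString(project(row, paths))); // prints [x, Ada]
        System.out.println(isNested(paths)); // prints true
    }
}
```

A format that only supports top-level projections would have to reject, or not be handed, any projection for which isNested returns true. That is the capability the issue says is currently unclear.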

      We should refactor DecodingFormat as follows:

      • Clarify DecodingFormat and introduce ProjectableDecodingFormat.
      • Introduce ProjectedRowData and Projection to simplify implementations of connectors that need to deal with projections.
      • Apply the changes to most of the formats and connectors we have.
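
The split proposed above can be sketched as follows. This is only an illustration under stated assumptions: Decoder, ProjectableDecoder, ProjectedRow, and read are hypothetical stand-ins, not the Flink interfaces, and only top-level projections are handled. The idea is that a source pushes the projection into formats that declare support for it, and wraps the output of all other formats in an index-remapping view.

```java
public class ProjectionSketch {

    /** Hypothetical non-projectable format: always decodes every physical field. */
    interface Decoder {
        Object[] decode(String line);
    }

    /** Hypothetical projectable format: can produce a decoder for a field subset. */
    interface ProjectableDecoder extends Decoder {
        Decoder withProjection(int[] projectedFields);
    }

    /** Index-remapping view over a full row, in the spirit of the proposed ProjectedRowData. */
    static final class ProjectedRow {
        private final int[] indexMapping;
        private Object[] row;

        ProjectedRow(int[] indexMapping) {
            this.indexMapping = indexMapping.clone();
        }

        ProjectedRow replaceRow(Object[] newRow) {
            this.row = newRow;
            return this;
        }

        Object getField(int pos) {
            return row[indexMapping[pos]];
        }

        Object[] toArray() {
            Object[] out = new Object[indexMapping.length];
            for (int i = 0; i < out.length; i++) {
                out[i] = getField(i);
            }
            return out;
        }
    }

    /** Decodes all comma-separated fields and knows nothing about projections. */
    static final class FullCsvDecoder implements Decoder {
        @Override
        public Object[] decode(String line) {
            return line.split(",", -1);
        }
    }

    /** Same input format, but able to emit only the requested fields. */
    static final class ProjectingCsvDecoder implements ProjectableDecoder {
        @Override
        public Object[] decode(String line) {
            return line.split(",", -1);
        }

        @Override
        public Decoder withProjection(int[] projectedFields) {
            return line -> {
                String[] all = line.split(",", -1);
                Object[] out = new Object[projectedFields.length];
                for (int i = 0; i < out.length; i++) {
                    out[i] = all[projectedFields[i]];
                }
                return out;
            };
        }
    }

    /** What a source could do: push down when supported, otherwise project afterwards. */
    static Object[] read(Decoder format, String line, int[] projection) {
        if (format instanceof ProjectableDecoder) {
            return ((ProjectableDecoder) format).withProjection(projection).decode(line);
        }
        return new ProjectedRow(projection).replaceRow(format.decode(line)).toArray();
    }
}
```

With this shape, read(new FullCsvDecoder(), "a,b,c", new int[] {2, 0}) and read(new ProjectingCsvDecoder(), "a,b,c", new int[] {2, 0}) yield the same projected row. The format's declared interface, rather than an implicit convention, decides where the projection happens.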

            People

              slinkydeveloper Francesco Guardiani