Flink / FLINK-21389

ParquetInputFormat should not need parquet schema as user input

    Details

      Description

      ParquetInputFormat takes the Parquet schema as user input, but after the split it reads the Parquet schema again from the file here: https://github.com/apache/flink/blob/52dcf439bb0b8d613fff1efecf015052d5b3a10b/flink-formats/flink-parquet/src/main/java/org/apache/flink/formats/parquet/ParquetInputFormat.java#L170
      At that point it should use the schema the user provided.
      A better approach would be to read the schema automatically and not require the user to provide one, as Spark does (https://spark.apache.org/docs/latest/sql-data-sources-parquet.html).
      We could therefore add a ParquetInputFormat constructor that takes no schema, and allow ParquetTableSource to be created without a schema parameter.
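
The proposal above can be sketched as follows. This is only a minimal illustration of the schema-resolution decision, not Flink's actual ParquetInputFormat: the class name `SchemaResolvingFormat`, the method `resolveSchema`, the use of a plain `String` for the schema, and the `fileSchemaReader` function (a stand-in for reading the schema from the Parquet file) are all hypothetical names introduced for this example.

```java
import java.util.Objects;
import java.util.function.Function;

/**
 * Hypothetical sketch, NOT Flink's ParquetInputFormat. It models only the
 * decision discussed in this issue: use the user schema when one was given,
 * otherwise read the schema from the file.
 */
class SchemaResolvingFormat {

    // Schema supplied by the user; null when the no-schema constructor was used.
    private final String userSchema;

    // Assumed stand-in for "read the schema from the Parquet file at this path".
    private final Function<String, String> fileSchemaReader;

    /** Existing style: the caller provides the schema explicitly. */
    SchemaResolvingFormat(String userSchema, Function<String, String> fileSchemaReader) {
        this.userSchema = Objects.requireNonNull(userSchema, "userSchema");
        this.fileSchemaReader = Objects.requireNonNull(fileSchemaReader, "fileSchemaReader");
    }

    /** Proposed style: no schema parameter, the schema is inferred from each file. */
    SchemaResolvingFormat(Function<String, String> fileSchemaReader) {
        this.userSchema = null;
        this.fileSchemaReader = Objects.requireNonNull(fileSchemaReader, "fileSchemaReader");
    }

    /** Called when a split is opened: the user schema wins, the file is the fallback. */
    String resolveSchema(String filePath) {
        return userSchema != null ? userSchema : fileSchemaReader.apply(filePath);
    }
}
```

With this shape, existing callers that pass a schema keep their behavior and the file is never consulted for them, while new callers can omit the schema and have it resolved per file when the split is opened.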


              People

              • Assignee:
                Etienne Chauchot (echauchot)
                Reporter:
                Etienne Chauchot (echauchot)
              • Votes:
                1
                Watchers:
                5
