Flink / FLINK-33759

Flink Parquet writer should support writing nested array or map types


Details

    Description

When we use the flink-parquet format to write a `Map<String, String>[]` type (which is later read by a Spark job), we encounter this exception:

      Caused by: org.apache.parquet.io.ParquetEncodingException: empty fields are illegal, the field should be ommited completely instead
          at org.apache.parquet.io.MessageColumnIO$MessageColumnIORecordConsumer.endField(MessageColumnIO.java:329)
          at org.apache.flink.formats.parquet.row.ParquetRowDataWriter$ArrayWriter.writeArrayData(ParquetRowDataWriter.java:438)
          at org.apache.flink.formats.parquet.row.ParquetRowDataWriter$ArrayWriter.write(ParquetRowDataWriter.java:419)
          at org.apache.flink.formats.parquet.row.ParquetRowDataWriter$RowWriter.write(ParquetRowDataWriter.java:471)
          at org.apache.flink.formats.parquet.row.ParquetRowDataWriter.write(ParquetRowDataWriter.java:81)
          at org.apache.flink.formats.parquet.row.ParquetRowDataBuilder$ParquetWriteSupport.write(ParquetRowDataBuilder.java:89)

After reviewing the code, we found that flink-parquet does not support writing nested arrays or maps: `ArrayWriter` and `MapWriter` do not implement the `public void write(ArrayData arrayData, int ordinal)` method.
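The missing piece can be sketched in isolation. The following self-contained Java sketch is an assumption-laden illustration, not Flink's actual code: all class and interface names here (`FieldWriter`, `IntWriter`, the simplified `ArrayData`) are hypothetical stand-ins, and only the per-element `write(ArrayData, ordinal)` signature mirrors the method the report says is missing. It shows why a composite writer needs that entry point: an outer `ArrayWriter` can only recurse into a nested array if its element writer itself supports per-element dispatch.

```java
import java.util.List;

// Hypothetical sketch of the per-element dispatch pattern; names do not
// match Flink's ParquetRowDataWriter internals.
public class NestedWriteSketch {

    // Minimal stand-in for Flink's ArrayData: elements addressed by ordinal.
    interface ArrayData {
        int size();
        Object get(int ordinal);
    }

    static ArrayData of(List<Object> elems) {
        return new ArrayData() {
            public int size() { return elems.size(); }
            public Object get(int ordinal) { return elems.get(ordinal); }
        };
    }

    // Every element writer exposes per-element write(ArrayData, ordinal).
    // If a composite writer (array/map) lacks this method, a parent writer
    // cannot recurse into it -- which is the gap the bug report describes.
    interface FieldWriter {
        void write(ArrayData array, int ordinal, StringBuilder out);
    }

    static class IntWriter implements FieldWriter {
        public void write(ArrayData array, int ordinal, StringBuilder out) {
            out.append(array.get(ordinal));
        }
    }

    // ArrayWriter implements the same per-element interface, so it can
    // serve as the element writer of an outer array (nested arrays).
    static class ArrayWriter implements FieldWriter {
        private final FieldWriter elementWriter;
        ArrayWriter(FieldWriter elementWriter) { this.elementWriter = elementWriter; }
        public void write(ArrayData array, int ordinal, StringBuilder out) {
            ArrayData nested = (ArrayData) array.get(ordinal);
            out.append('[');
            for (int i = 0; i < nested.size(); i++) {
                if (i > 0) out.append(", ");
                elementWriter.write(nested, i, out);
            }
            out.append(']');
        }
    }

    // Renders an array-of-arrays, exercising the nested dispatch.
    static String render() {
        ArrayData outer = of(List.<Object>of(
                of(List.<Object>of(1, 2)),
                of(List.<Object>of(3))));
        FieldWriter writer = new ArrayWriter(new IntWriter());
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < outer.size(); i++) {
            if (i > 0) out.append(", ");
            writer.write(outer, i, out);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(render());
    }
}
```

In the real writer the recursion targets Parquet's `RecordConsumer` rather than a `StringBuilder`, but the structural point is the same: without a per-element `write` on `ArrayWriter`/`MapWriter`, the parent ends a field it never populated, producing the "empty fields are illegal" error above.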

People

  Assignee: Unassigned
  Reporter: Cai Liuyang (cailiuyang)
  Votes: 0
  Watchers: 3
