  1. Flink
  2. FLINK-24728

Batch SQL file sink forgets to close the output stream

Details

    Description

      I tried to write a large Avro file into HDFS and discovered that the file size displayed in HDFS is extremely small, although copying that file to local storage yields the correct size. If we create another Flink job that reads this Avro file from HDFS, the job finishes without outputting any record, because the file size Flink gets from HDFS is the very small, incorrect one.

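      This is because the output format created in FileSystemTableSink#createBulkWriterOutputFormat only finishes the BulkWriter and never closes the underlying output stream. According to the Javadoc of BulkWriter#finish, bulk writers should not close the output stream themselves and should leave that to the framework, so the output format is the component that has to close the stream after calling finish.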
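The contract described above can be illustrated with a minimal, self-contained sketch. It does not use Flink classes: SimpleBulkWriter, LineWriter, TrackingStream and SketchOutputFormat are hypothetical stand-ins invented for this example, and the code is not the actual fix in FileSystemTableSink. It only shows the ordering the description implies: the writer's finish flushes its data but leaves the stream open, and the owning output format closes the stream afterwards.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

public class CloseAfterFinishSketch {

    /** Hypothetical stand-in for a bulk writer; NOT Flink's real interface. */
    public interface SimpleBulkWriter<T> {
        void addElement(T element) throws IOException;
        void flush() throws IOException;
        /** Writes any remaining data but must not close the stream. */
        void finish() throws IOException;
    }

    /** In-memory stream that records whether close() was called. */
    public static class TrackingStream extends ByteArrayOutputStream {
        public boolean closed = false;

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Toy writer that buffers lines in memory until flush() or finish(). */
    public static class LineWriter implements SimpleBulkWriter<String> {
        private final OutputStream out;
        private final StringBuilder buffer = new StringBuilder();

        public LineWriter(OutputStream out) {
            this.out = out;
        }

        @Override
        public void addElement(String element) {
            buffer.append(element).append('\n');
        }

        @Override
        public void flush() throws IOException {
            out.write(buffer.toString().getBytes(StandardCharsets.UTF_8));
            buffer.setLength(0);
            out.flush();
        }

        @Override
        public void finish() throws IOException {
            flush(); // deliberately leaves the stream open
        }
    }

    /** Hypothetical output format that owns the stream and must close it. */
    public static class SketchOutputFormat<T> implements AutoCloseable {
        private final OutputStream stream;
        private final SimpleBulkWriter<T> writer;

        public SketchOutputFormat(
                OutputStream stream,
                Function<OutputStream, SimpleBulkWriter<T>> writerFactory) {
            this.stream = stream;
            this.writer = writerFactory.apply(stream);
        }

        public void writeRecord(T record) throws IOException {
            writer.addElement(record);
        }

        @Override
        public void close() throws IOException {
            try {
                writer.finish();  // step 1: the writer flushes its data
            } finally {
                stream.close();   // step 2: the owner closes the stream
            }
        }
    }

    public static void main(String[] args) throws IOException {
        TrackingStream stream = new TrackingStream();
        SketchOutputFormat<String> format =
                new SketchOutputFormat<>(stream, LineWriter::new);
        format.writeRecord("first");
        format.writeRecord("second");
        format.close();
        System.out.println("bytes written: " + stream.size());
        System.out.println("stream closed: " + stream.closed);
    }
}
```

The try/finally in close() is a design choice of this sketch: the stream is closed even if finish throws. Dropping the stream.close() call reproduces the shape of the bug reported here, where the writer is finished but the stream is never closed.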

          People

            TsReaper Caizhi Weng
            TsReaper Caizhi Weng
            Votes: 0
            Watchers: 3