Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
1.11.4, 1.14.0, 1.12.5, 1.13.3
Description
I tried to write a large Avro file into HDFS and discovered that the file size displayed in HDFS is extremely small, although copying the file to local disk yields the correct size. If we create another Flink job to read that Avro file from HDFS, the job finishes without outputting any records, because the file size Flink gets from HDFS is the incorrect, very small one.
This is because the output format created in FileSystemTableSink#createBulkWriterOutputFormat only finishes the BulkWriter and never closes the underlying output stream. According to the Javadoc of BulkWriter#finish, bulk writers should not close the output stream themselves and should leave that to the framework.
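To illustrate the contract described above, here is a minimal, self-contained sketch. It does not use the real Flink classes: SimpleBulkWriter, LineWriter, TrackingStream and SketchOutputFormat are hypothetical stand-ins invented for this example, and this is not the actual patch. It only shows the division of responsibility: the writer flushes in finish() but leaves the stream open, and the surrounding output format closes the stream afterwards.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class BulkWriterCloseSketch {

    /** Hypothetical stand-in for a bulk writer, reduced to the two calls discussed here. */
    public interface SimpleBulkWriter<T> {
        void addElement(T element) throws IOException;

        /** Flushes pending data. By contract it must NOT close the stream. */
        void finish() throws IOException;
    }

    /** In-memory stream that records whether close() was called (illustration only). */
    public static class TrackingStream extends ByteArrayOutputStream {
        private boolean closed = false;

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }

        public boolean isClosed() {
            return closed;
        }
    }

    /** Hypothetical writer that emits one line per record. */
    public static class LineWriter implements SimpleBulkWriter<String> {
        private final OutputStream out;

        public LineWriter(OutputStream out) {
            this.out = out;
        }

        @Override
        public void addElement(String element) throws IOException {
            out.write((element + "\n").getBytes(StandardCharsets.UTF_8));
        }

        @Override
        public void finish() throws IOException {
            out.flush(); // flush only; closing is left to the owner of the stream
        }
    }

    /** Hypothetical output format that owns the stream and is responsible for closing it. */
    public static class SketchOutputFormat {
        private final OutputStream stream;
        private final SimpleBulkWriter<String> writer;

        public SketchOutputFormat(OutputStream stream) {
            this.stream = stream;
            this.writer = new LineWriter(stream);
        }

        public void writeRecord(String record) throws IOException {
            writer.addElement(record);
        }

        public void close() throws IOException {
            try {
                writer.finish();
            } finally {
                // The step missing in the reported bug: close the stream after finish().
                stream.close();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        TrackingStream stream = new TrackingStream();
        SketchOutputFormat format = new SketchOutputFormat(stream);
        format.writeRecord("record-1");
        format.close();
        System.out.println("stream closed: " + stream.isClosed());
    }
}
```

If the stream.close() call in SketchOutputFormat#close is removed, the data is still flushed but the stream stays open, which mirrors the situation described in this issue.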