  1. Flink
  2. FLINK-22897

Flink SQL 1.12: sinking to Hive with a different sink parallelism produces many small files


    Details

      Description

      I am using Flink SQL in batch mode to write data into a partitioned Hive table. Here is the SQL:

       

      // code placeholder
      INSERT OVERWRITE 【targetTable】 SELECT 【field】 FROM 【sourceTable】;
      

       

       

       

      I found that when the parallelism of the sink operator differs from that of the operator immediately upstream of it, a large number of small files are generated. This does not happen when the two parallelisms are the same.
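A minimal sketch of how one might try to keep the two parallelisms equal from the Flink 1.12 SQL client. It rests on an assumption the report does not confirm: that the mismatch arises because the Hive source parallelism is inferred from the input while the sink uses the default parallelism. The table and column names (`target_table`, `source_table`, `id`, `name`, `dt`) and the value `4` are hypothetical placeholders, and batch mode with a Hive catalog is assumed to be configured already.

```sql
-- Assumption: source parallelism inference is what makes the upstream
-- operator's parallelism differ from the sink's. Disabling it lets the
-- source fall back to the default parallelism set below.
SET table.exec.hive.infer-source-parallelism=false;

-- One default parallelism for all operators (4 is an arbitrary example value).
SET table.exec.resource.default-parallelism=4;

-- Same statement shape as in the report, with hypothetical names.
INSERT OVERWRITE target_table
SELECT id, name, dt
FROM source_table;
```

The unquoted `SET key=value;` form is the one accepted by the 1.12 SQL client; later releases expect quoted keys and values. Whether this removes the small files depends on the actual cause, which the report does not establish.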


              People

              • Assignee:
                Unassigned
                Reporter:
                zhengjiewen
              • Votes:
                0
                Watchers:
                7
