  1. Flink
  2. FLINK-22897

Flink SQL 1.12: sinking to Hive with a sink parallelism different from the upstream operator produces many small files


Details

    Description

      I am using Flink SQL in batch mode to write data into a partitioned Hive table. Here is the SQL:

       

      INSERT OVERWRITE 【targetTable】 SELECT 【field】 FROM 【sourceTable】;
      

       

       

       

      I found that when the parallelism of the sink operator differs from that of the operator immediately before it, a large number of small files is generated. This does not happen when the two operators have the same parallelism.
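One possible way to picture the observation is the toy model below. It is not Flink code and does not claim to be the confirmed cause of this issue. It assumes that (a) each upstream subtask holds records for a single Hive partition, (b) records are forwarded one-to-one when the parallelisms match and distributed round-robin when they differ, and (c) every sink subtask writes one file per partition it receives. The function `count_files` and the sample partition values are hypothetical names chosen for illustration.

```python
def count_files(upstream, sink_parallelism):
    """Count distinct (sink subtask, partition) pairs, i.e. output files.

    upstream[i] lists the partition value of every record emitted by
    upstream subtask i. This is a toy model, not Flink's implementation.
    """
    files = set()
    same_parallelism = sink_parallelism == len(upstream)
    for i, records in enumerate(upstream):
        for n, partition in enumerate(records):
            if same_parallelism:
                # Assumed forward exchange: upstream subtask i feeds sink subtask i.
                sink = i
            else:
                # Assumed round-robin exchange: records are spread over all sinks.
                sink = (i + n) % sink_parallelism
            files.add((sink, partition))
    return len(files)


# Hypothetical input: 4 upstream subtasks, 100 records each, one partition per subtask.
upstream = [[f"dt=2021-06-0{i + 1}"] * 100 for i in range(4)]

same = count_files(upstream, 4)       # sink parallelism equals upstream parallelism
different = count_files(upstream, 8)  # sink parallelism differs from upstream

print(same, different)
```

Under these assumptions the file count grows from one file per partition to one file per partition per sink subtask as soon as the parallelisms differ, which matches the direction of the behavior reported above.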


            People

              Assignee: Unassigned
              Reporter: zhengjiewen
              Votes: 0
              Watchers: 8
