FLINK-35293

FLIP-445: Support dynamic parallelism inference for HiveSource


Details

      In Flink 1.20, we have introduced support for dynamic source parallelism inference in batch jobs for the Hive source connector. This allows the connector to determine source parallelism at runtime, based on the partitions that are actually read after dynamic partition pruning.
      Additionally, we have introduced a new configuration option, 'table.exec.hive.infer-source-parallelism.mode', which lets users choose how source parallelism is inferred: 'static' for static inference, 'dynamic' for dynamic inference, or 'none' to disable automatic parallelism inference altogether. The default is 'dynamic'. Note that in Flink 1.20 the previous configuration option 'table.exec.hive.infer-source-parallelism' has been marked as deprecated, but it will continue to serve as a switch for automatic parallelism inference until it is fully phased out.
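      The interaction between the deprecated switch and the new mode option can be sketched as below. This is a minimal, hypothetical model and not code from the Hive connector: the class `InferModeSketch`, the enum `InferMode`, and the method `effectiveMode` are names invented for illustration, and the precedence shown (the deprecated option disables inference when it is false, otherwise the mode option decides) is an assumption based on the note that the old option "continues to serve as a switch".

```java
import java.util.Locale;

/** Hypothetical model of the two options named in the release note; not Flink code. */
final class InferModeSketch {

    enum InferMode { STATIC, DYNAMIC, NONE }

    /**
     * @param legacySwitch value of the deprecated 'table.exec.hive.infer-source-parallelism'
     * @param mode value of 'table.exec.hive.infer-source-parallelism.mode', or null if unset
     */
    static InferMode effectiveMode(boolean legacySwitch, String mode) {
        // Assumption: the deprecated option still acts as an on/off switch.
        if (!legacySwitch) {
            return InferMode.NONE;
        }
        // The release note states that 'dynamic' is the default mode.
        if (mode == null) {
            return InferMode.DYNAMIC;
        }
        // Map 'static' / 'dynamic' / 'none' to the enum; other values throw.
        return InferMode.valueOf(mode.trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(effectiveMode(true, null));       // mode unset
        System.out.println(effectiveMode(true, "static"));   // explicit static
        System.out.println(effectiveMode(false, "dynamic")); // switch turned off
    }
}
```

      In an actual job the mode would be set through the usual table configuration mechanism, using the option key quoted in the note above.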

    Description

      FLIP-379 introduced dynamic source parallelism inference, which, compared to static inference, uses runtime information to determine the source parallelism more accurately. FileSource already supports dynamic parallelism inference. As a follow-up to FLIP-379, this FLIP implements the dynamic parallelism inference interface for HiveSource and switches the default from static parallelism inference to dynamic parallelism inference.
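      To illustrate why runtime information can give a more accurate result, here is a toy model of split-count-based inference. It is only a sketch under stated assumptions and not the HiveSource implementation: the class `ParallelismSketch`, the method `inferParallelism`, the rule "one subtask per split, capped by an upper bound", and all numbers are hypothetical.

```java
/** Toy model of split-count-based parallelism inference; not HiveSource code. */
final class ParallelismSketch {

    /**
     * Assumed rule: one subtask per split, at least 1, at most upperBound.
     *
     * @param splitCount number of splits the source expects to read
     * @param upperBound hypothetical cap on the inferred parallelism
     */
    static int inferParallelism(int splitCount, int upperBound) {
        if (upperBound < 1) {
            throw new IllegalArgumentException("upperBound must be >= 1");
        }
        return Math.max(1, Math.min(splitCount, upperBound));
    }

    public static void main(String[] args) {
        int upperBound = 100;           // hypothetical cap
        int splitsBeforePruning = 400;  // all partitions, known at planning time
        int splitsAfterPruning = 12;    // partitions left after dynamic partition pruning

        // A static estimate only sees the unpruned split count.
        System.out.println(inferParallelism(splitsBeforePruning, upperBound));
        // A dynamic estimate can use the pruned split count known at runtime.
        System.out.println(inferParallelism(splitsAfterPruning, upperBound));
    }
}
```

      With these made-up numbers the static estimate hits the cap while the dynamic estimate matches the amount of data actually read, which is the kind of over-provisioning that dynamic inference is meant to avoid.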

            People

              xiasun xingbe