Flink / FLINK-34926

Adaptive auto parallelism doesn't work for a query

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.18.1
    • Fix Version/s: None
    • Component/s: Table SQL / Planner
    • Labels: None

    Description

      We have the following query running in batch mode.

      WITH FEATURE_INCLUSION AS (
          SELECT
              insertion_id, -- Not unique
              features -- Array<Row<key, value>>
          FROM
              features_table
      ),
      TOTAL AS (
          SELECT
              COUNT(DISTINCT insertion_id) total_id
          FROM
              FEATURE_INCLUSION
      ),
      FEATURE_INCLUSION_COUNTS AS (
          SELECT
              `key`,
              COUNT(DISTINCT insertion_id) AS id_count
          FROM
              FEATURE_INCLUSION,
              UNNEST(features) as t (`key`, `value`)
          WHERE
              TRUE
          GROUP BY
              `key`
      ),
      RESULTS AS (
          SELECT
              `key`
          FROM
              FEATURE_INCLUSION_COUNTS,
              TOTAL
          WHERE
              (1.0 * id_count) / total_id > 0.1
      )
      SELECT
          JSON_ARRAYAGG(`key`) AS feature_ids
      FROM
          RESULTS

      The parallelism that Flink adaptively chose for the following operators was always 1.

      [37]:HashAggregate(isMerge=[true], groupBy=[key, insertion_id], select=[key, insertion_id])
      +- [38]:LocalHashAggregate(groupBy=[key], select=[key, Partial_COUNT(insertion_id) AS count$0])
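
      One way to look at the aggregation operators in isolation is to run `EXPLAIN` on only the `FEATURE_INCLUSION_COUNTS` part of the query. The statement below is a sketch that reuses the table and column names from the report; it is an assumption that this reduced query produces the same HashAggregate / LocalHashAggregate pair. The parallelism picked by the adaptive scheduler is decided at runtime, so it still has to be checked in the job graph after submission.

      ```sql
      -- Print the optimized plan for the distinct-count aggregation only.
      EXPLAIN PLAN FOR
      SELECT
          `key`,
          COUNT(DISTINCT insertion_id) AS id_count
      FROM
          features_table,
          UNNEST(features) AS t (`key`, `value`)
      GROUP BY
          `key`;
      ```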

      If we turn off `execution.batch.adaptive.auto-parallelism.enabled` and manually set `parallelism.default` to a value greater than one, the operator runs with the configured parallelism as expected.
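
      The workaround described above can be written as session settings in the Flink SQL client. This is only a sketch: the value `4` is an arbitrary assumption (the report says only "greater than one"), and it assumes the statements are run in the same session before the query is submitted.

      ```sql
      -- Run in batch mode, as in the report.
      SET 'execution.runtime-mode' = 'batch';
      -- Disable adaptive parallelism inference for batch jobs.
      SET 'execution.batch.adaptive.auto-parallelism.enabled' = 'false';
      -- Fixed parallelism; 4 is an assumed example value, any value > 1 applies.
      SET 'parallelism.default' = '4';
      ```

      With these settings every operator uses the fixed default parallelism, so the benefit of per-stage parallelism inference is lost for the rest of the job.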

      The screenshot of the full job graph is attached.

      Attachments

        1. image.png
          155 kB
          Xingcan Cui

          People

            Assignee: Unassigned
            Reporter: Xingcan Cui (xccui)
            Votes: 0
            Watchers: 2