FLINK-3179

Combiner is not injected if Reduce or GroupReduce input is explicitly partitioned


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.10.1
    • Fix Version/s: 1.0.1, 1.1.0
    • Component/s: API / DataSet
    • Labels: None

    Description

      The optimizer does not inject a combiner if the input of a Reduce or GroupReduce operator is explicitly partitioned, as in the following example:

      DataSet<Tuple2<String,Integer>> words = ...
      DataSet<Tuple2<String,Integer>> counts = words
        .partitionByHash(0)
        .groupBy(0)
        .sum(1);
      

      Explicit partitioning can be useful to enforce partitioning on a subset of keys or to use a different partitioning method (custom or range partitioning).

      This issue should be fixed by changing the instantiate() methods of the ReduceProperties and GroupReduceWithCombineProperties classes such that a combiner is injected in front of a PartitionPlanNode if that node is the input of a Reduce or GroupReduce operator. This should only happen if the Reducer is the only successor of the Partition operator.
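      Why the missing combiner matters: a combiner pre-aggregates records per input partition before they are shuffled, so far fewer records cross the network. The following is a minimal plain-Java sketch of that combine-then-reduce pattern (no Flink dependencies; the class and method names are illustrative, not Flink API):

```java
import java.util.*;

public class CombineSketch {
    // Local pre-aggregation (the "combine" step): count words within one
    // input partition before any data is shuffled.
    static Map<String, Integer> combine(List<String> partition) {
        Map<String, Integer> local = new HashMap<>();
        for (String word : partition) {
            local.merge(word, 1, Integer::sum);
        }
        return local;
    }

    // Final reduce: merge the pre-aggregated partial counts per key.
    static Map<String, Integer> reduce(List<Map<String, Integer>> partials) {
        Map<String, Integer> total = new HashMap<>();
        for (Map<String, Integer> partial : partials) {
            partial.forEach((k, v) -> total.merge(k, v, Integer::sum));
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> partition1 = Arrays.asList("a", "b", "a", "a");
        List<String> partition2 = Arrays.asList("b", "b", "a");

        // Without a combiner, all 7 raw records are shuffled; with it,
        // only 4 partial aggregates (2 per partition) cross the network.
        Map<String, Integer> counts =
                reduce(Arrays.asList(combine(partition1), combine(partition2)));
        System.out.println(counts);
    }
}
```

      In the reported bug, the explicit partitionByHash(0) suppresses exactly this pre-aggregation step, even though the data is already partitioned compatibly with the subsequent groupBy(0).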

      Attachments

        Activity

          People

            Assignee: ramkrishna.s.vasudevan
            Reporter: Fabian Hueske
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved: