Uploaded image for project: 'Kafka'
  1. Kafka
  2. KAFKA-7758

When Naming a Repartition Topic with Aggregations Reuse Repartition Graph Node for Multiple Operations

    XMLWordPrintableJSON

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1.0
    • Fix Version/s: 2.2.0
    • Component/s: streams
    • Labels:
      None

      Description

      When performing aggregations that require repartitioning and the repartition topic name is specified, and using the resulting KGroupedStream for multiple operations i.e.

       

      final KGroupedStream<String, String> kGroupedStream = builder.<String, String>stream("topic").selectKey((k, v) -> k).groupByKey(Grouped.as("grouping"));
      kGroupedStream.windowedBy(TimeWindows.of(Duration.ofMillis(10L))).count();
      kGroupedStream.windowedBy(TimeWindows.of(Duration.ofMillis(30L))).count();
      
      

      If optimizations aren't enabled, Streams will attempt to build two repartition topics of the same name resulting in a failure creating the topology.  

       

      However, we have enough information to re-use the existing repartition node via graph nodes used for building the intermediate representation of the topology. This ticket will make the 

      behavior of reusing a KGroupedStream consistent regardless if optimizations are turned on or not.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                bbejeck Bill Bejeck
                Reporter:
                bbejeck Bill Bejeck
              • Votes:
                0 Vote for this issue
                Watchers:
                5 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: