Uploaded image for project: 'Kafka'
  1. Kafka
  2. KAFKA-7758

When Naming a Repartition Topic with Aggregations Reuse Repartition Graph Node for Multiple Operations

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 2.1.0
    • 2.2.0
    • streams
    • None

    Description

      When performing aggregations that require repartitioning and the repartition topic name is specified, and using the resulting KGroupedStream for multiple operations i.e.

       

      final KGroupedStream<String, String> kGroupedStream = builder.<String, String>stream("topic").selectKey((k, v) -> k).groupByKey(Grouped.as("grouping"));
      kGroupedStream.windowedBy(TimeWindows.of(Duration.ofMillis(10L))).count();
      kGroupedStream.windowedBy(TimeWindows.of(Duration.ofMillis(30L))).count();
      
      

      If optimizations aren't enabled, Streams will attempt to build two repartition topics of the same name resulting in a failure creating the topology.  

       

      However, we have enough information to re-use the existing repartition node via graph nodes used for building the intermediate representation of the topology. This ticket will make the 

      behavior of reusing a KGroupedStream consistent regardless if optimizations are turned on or not.

      Attachments

        Issue Links

          Activity

            People

              bbejeck Bill Bejeck
              bbejeck Bill Bejeck
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: