KAFKA-7669

Stream topology definition is not robust to the ordering changes


Details

    • Type: Wish
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.0.0
    • Fix Version/s: None
    • Component/s: streams
    • Labels: None

    Description

      It seems that if the user does not guarantee the order in which the stream topology is defined, they may end up with multiple stream branches sharing the same internal changelog topic (and repartition topic, if one is created).

      Let's assume:

      val initialStream = new StreamsBuilder().stream(sth);
      val someStrings = (1 to 10).map(_.toString)
      val notGuaranteedOrderOfStreams: Map[String, KStream[...]] = someStrings.map(s => s -> initialStream.filter(...)).toMap

      When the user now defines common aggregation logic for each stream in notGuaranteedOrderOfStreams and runs multiple instances of the application, the KSTREAM-AGGREGATE-STATE-STORE topic names will not be unique per branch: the same topic will contain results of different streams from the notGuaranteedOrderOfStreams map.

      All of this happens without a single warning that the topology (or just the order of the topology definition) differs between instances of the Kafka Streams application.
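
The collision can be illustrated with a simplified model. This is a sketch, not the Kafka Streams implementation: it assumes only that each unnamed state store gets its name from a per-builder counter that advances in definition order. `StoreNamer` and `assignStores` are hypothetical names introduced for illustration.

```scala
object OrderingDemo {
  // Simplified model (assumption): one counter per builder, advanced once
  // for every unnamed aggregation store that is defined.
  final class StoreNamer {
    private var index = 0
    def nextStoreName(): String = {
      val name = f"KSTREAM-AGGREGATE-STATE-STORE-$index%010d"
      index += 1
      name
    }
  }

  // Hypothetical helper: maps each branch key to the store name it would
  // receive if the branches were defined in the given order.
  def assignStores(branchOrder: Seq[String]): Map[String, String] = {
    val namer = new StoreNamer
    branchOrder.map(key => key -> namer.nextStoreName()).toMap
  }

  def main(args: Array[String]): Unit = {
    val instanceA = assignStores(Seq("1", "2", "3"))
    val instanceB = assignStores(Seq("3", "1", "2")) // same branches, different order

    // Both instances end up with the same set of store names ...
    assert(instanceA.values.toSet == instanceB.values.toSet)
    // ... but branch "1" is backed by a different store on each instance.
    assert(instanceA("1") != instanceB("1"))
  }
}
```

Under this model the generated names depend only on how many stores were defined before, not on which branch a store belongs to, so two instances that visit the branches in different orders silently share topics between unrelated branches.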

      Also, I am concerned that the ids in "KSTREAM-AGGREGATE-STATE-STORE-id-changelog" match so well across the different application instances (and different topologies).
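
Until the library detects such differences, one possible user-side mitigation is to fix the definition order explicitly. The sketch below assumes the branches are keyed by sortable strings, as in the example; `definitionOrder` is a hypothetical helper that stands in for "define the aggregation for each branch" and only returns the order in which the branches would be visited.

```scala
object DeterministicOrderDemo {
  // Hypothetical helper: visit branches in sorted key order instead of
  // relying on the iteration order of the Map.
  def definitionOrder[V](branches: Map[String, V]): Seq[String] =
    branches.toSeq
      .sortBy { case (key, _) => key } // lexicographic order on the String keys
      .map { case (key, _) => key }

  def main(args: Array[String]): Unit = {
    val keys = (1 to 10).map(_.toString)
    // Two maps with the same entries, built in opposite insertion order.
    val a = keys.map(k => k -> s"branch-$k").toMap
    val b = keys.reverse.map(k => k -> s"branch-$k").toMap

    // The visiting order no longer depends on how the map was built.
    assert(definitionOrder(a) == definitionOrder(b))
  }
}
```

In a real application the aggregation for each branch would be defined inside the sorted traversal, so every instance builds the topology in the same sequence.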

       


            People

              Assignee: Unassigned
              Reporter: Mateusz Owczarek (nijo)
              Votes: 0
              Watchers: 5
