Uploaded image for project: 'Kafka'
  1. Kafka
  2. KAFKA-12775

StreamsPartitionAssignor / ClientState throws an exception when a new Task gets added to a KStreams Application in a Backwards-Compatible Topology Change

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Invalid
    • None
    • None
    • streams
    • None

    Description

      KAFKA-6145 and KAFKA-10079 added an exception if the Partition Assignor tries to look up the lag for a TaskId that seemingly does not exist.

       

      I believe this is a functional regression.

      Before, it was possible for Streams users to make backwards-compatible topology changes and roll those out, without having to do a complete restore or reload.

      For example:

      Existing sample topology:
       

      stream1 = stream(topic)
      
      stream1
        .map(...)
        .to(output)

      And doing this backwards-compatible change:

      stream1 = stream(topic)
      ++ table = stream(topic2).through(repartition-topic)/repartition().toTable()
      
      stream1
        .map(...)
      ++  .join(table)
        .to(output)

       

      This effectively creates a new subtopology with a new task for the table repartition.

      In older KStreams versions, it would have been possible to simply roll this change out.
      Since 2.6, rolling this out will crash the stream because the linked exception gets thrown when StreamsPartitionAssignor#getPreviousTasksByLag tries to look up the lag for the new table-repartition-task

       

      At this time, the only possible way to avoid this exception seems to be deleting all local state and doing a complete restore with the new topology change included.

       

      Attachments

        Activity

          People

            Unassigned Unassigned
            nhab Nico Habermann
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: