Uploaded image for project: 'Kafka'
  1. Kafka
  2. KAFKA-10086

Standby state isn't always re-used when transitioning to active

    XMLWordPrintableJSON

    Details

    • Type: Task
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.6.0
    • Fix Version/s: 2.6.0
    • Component/s: streams
    • Labels:
      None

      Description

      This ticket was initially just to write an integration test, but I escalated it to a blocker and changed the title when the integration test actually surfaced two bugs:

      1. Offset positions were not reported for in-memory stores, so tasks with in-memory stores would never be considered as "caught up" and could not take over active processing, preventing clusters from ever achieving balance. This is a regression in 2.6
      2. When the TaskAssignor decided to switch active processing from a former owner to a new one that had a standby, the lower-level cooperative rebalance protocol would first de-schedule the task completely, and only later would assign it to the new owner. For in-memory stores, this causes the standby state not to be re-used, and for persistent stores, it creates a window in which the cleanup thread might delete the state directory. In both cases, even though the instance previously had a standby, once it gets the active, it still had to restore the entire state from the changelog.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                vvcephei John Roesler
                Reporter:
                vvcephei John Roesler
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: