Kafka / KAFKA-10086

Standby state isn't always re-used when transitioning to active

Details

    • Type: Task
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.6.0
    • Fix Version/s: 2.6.0
    • Component/s: streams
    • Labels: None

    Description

      This ticket was initially just to write an integration test, but I escalated it to a blocker and changed the title when the integration test actually surfaced two bugs:

      1. Offset positions were not reported for in-memory stores, so tasks with in-memory stores were never considered "caught up" and could not take over active processing, which prevented clusters from ever achieving balance. This is a regression in 2.6.
      2. When the TaskAssignor decided to switch active processing from a former owner to a new one that had a standby, the lower-level cooperative rebalance protocol would first de-schedule the task completely and only later assign it to the new owner. For in-memory stores, this causes the standby state not to be re-used; for persistent stores, it creates a window in which the cleanup thread might delete the state directory. In both cases, even though the instance previously had a standby, it still has to restore the entire state from the changelog once it gets the active task.
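
      To illustrate the first bug, here is a minimal, simplified model of a "caught up" check. It is not Kafka Streams source code: the class name, the method, and the threshold value are assumptions made for this sketch, and treating a missing offset as "no state" is a modeling assumption. Kafka Streams exposes a comparable threshold through the `acceptable.recovery.lag` config.

```java
import java.util.OptionalLong;

/** Simplified, hypothetical model of a "caught up" check; not Kafka Streams code. */
final class CaughtUpSketch {

    // Assumed threshold for this sketch: the maximum lag at which a task
    // still counts as caught up.
    static final long ACCEPTABLE_LAG = 10_000L;

    /**
     * @param changelogEndOffset end offset of the task's changelog
     * @param reportedOffset     offset position reported for the store,
     *                           or empty if the store reports none
     * @return true if the task's lag is within ACCEPTABLE_LAG
     */
    static boolean isCaughtUp(long changelogEndOffset, OptionalLong reportedOffset) {
        // Modeling assumption: a store that reports no position is treated
        // as having no state, so its lag is the entire changelog.
        long lag = changelogEndOffset - reportedOffset.orElse(0L);
        return lag <= ACCEPTABLE_LAG;
    }
}
```

      In this model, `CaughtUpSketch.isCaughtUp(1_000_000L, OptionalLong.of(999_500L))` is true, while `CaughtUpSketch.isCaughtUp(1_000_000L, OptionalLong.empty())` is false no matter how much state the standby actually holds. That is the shape of the problem described in item 1: with no reported position, a fully restored in-memory standby looks the same as an empty one.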

            People

              Assignee: John Roesler (vvcephei)
              Reporter: John Roesler (vvcephei)
              Votes: 0
              Watchers: 4
