Kafka / KAFKA-2841

Group metadata cache loading is not safe when reloading a partition

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.9.0.0
    • Fix Version/s: 0.9.0.0
    • Component/s: None
    • Labels: None

      Description

      If the coordinator receives a LeaderAndIsr request that includes a higher leader epoch for one of the partitions it owns, it reloads the offsets and group metadata for that partition. This can happen even when leadership does not change, because the leader epoch is also incremented for ISR changes that do not result in a new leader. Currently, the coordinator blindly replaces cached metadata values when reloading, which can cause unexpected behavior such as unexpected session timeouts or request timeouts while rebalancing.

      To fix this, the coordinator should replace a cached group only when the group being loaded has a higher generation than the cached value. In addition, whenever a cached value does have to be replaced (which should happen only during loading), we need to be very careful to ensure that any active delayed operations tied to the old value cannot affect the group.
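
      The proposed check can be illustrated with a minimal sketch. This is not the actual Kafka patch: the class GroupMetadataCacheSketch, the Group type, and the methods loadGroup and onSessionTimeout are hypothetical names chosen for illustration. It assumes that a delayed operation holds a reference to the group object it was scheduled for, so that marking a replaced entry as dead is enough for the operation to skip it.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: names are illustrative and not taken from the Kafka code base.
class GroupMetadataCacheSketch {

    /** Minimal stand-in for a cached group entry. */
    static final class Group {
        final String groupId;
        final int generationId;
        // Set once this entry has been replaced in the cache.
        private volatile boolean dead = false;

        Group(String groupId, int generationId) {
            this.groupId = groupId;
            this.generationId = generationId;
        }

        boolean isDead() { return dead; }
        void markDead() { dead = true; }
    }

    private final ConcurrentMap<String, Group> cache = new ConcurrentHashMap<>();

    /**
     * Called while (re)loading a partition. The loaded group replaces the
     * cached one only if its generation is strictly higher. Returns the
     * entry that is cached after the attempt.
     */
    Group loadGroup(Group loaded) {
        return cache.compute(loaded.groupId, (id, cached) -> {
            if (cached == null) {
                return loaded;                 // nothing cached yet
            }
            if (loaded.generationId <= cached.generationId) {
                return cached;                 // keep the live in-memory state
            }
            cached.markDead();                 // let pending delayed operations detect the swap
            return loaded;
        });
    }

    Group get(String groupId) {
        return cache.get(groupId);
    }

    /**
     * Stand-in for a delayed operation firing (for example a session timeout).
     * Returns true only if the captured group is still the live cached entry.
     */
    boolean onSessionTimeout(Group captured) {
        if (captured.isDead() || cache.get(captured.groupId) != captured) {
            return false;                      // stale reference: do nothing
        }
        // A real coordinator would expire the member and trigger a rebalance here.
        return true;
    }

    public static void main(String[] args) {
        GroupMetadataCacheSketch sketch = new GroupMetadataCacheSketch();
        Group live = sketch.loadGroup(new Group("my-group", 5));
        // Reloading the same generation leaves the live entry in place.
        Group afterReload = sketch.loadGroup(new Group("my-group", 5));
        System.out.println("same entry kept: " + (afterReload == live));
    }
}
```

      In this sketch a reload triggered by an ISR-only epoch bump returns the existing entry untouched, so timers and pending rebalance state attached to it stay valid. Only a strictly newer generation displaces the entry, and the displaced object is flagged so that any delayed operation still referencing it becomes a no-op.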

            People

            • Assignee: Jason Gustafson (hachikuji)
            • Reporter: Jason Gustafson (hachikuji)
            • Votes: 0
            • Watchers: 4