KAFKA-4441

Kafka Monitoring is incorrect during rapid topic creation and deletion

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.10.0.0, 0.10.0.1
    • Fix Version/s: 0.10.2.0
    • Component/s: None
    • Labels: None

    Description

      Kafka reports several metrics derived from the state of partitions:
      UnderReplicatedPartitions
      PreferredReplicaImbalanceCount
      OfflinePartitionsCount

      All three metrics fire when topics are created and deleted rapidly in a tight loop. The only partitions responsible are those of topics that are mid-creation or mid-deletion; the cluster is otherwise stable.

      Looking through the source code, topic deletion goes through an asynchronous state machine: https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/TopicDeletionManager.scala#L35.

      However, the code that computes the metrics is unaware of this state machine's progress: https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/KafkaController.scala#L185

      I believe the fix is relatively simple: make the metrics aware that a topic is currently being created or deleted, and count only partitions of topics that are "stable".
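
      To illustrate the proposed filtering, here is a minimal Python model, not the controller's actual code. Every name in it (Partition, ClusterView, stable_partitions, and the three counting helpers) is hypothetical, and the metric definitions are simplified assumptions: a partition counts as under-replicated when its ISR is smaller than its replica set, as offline when it has no leader (modelled as -1), and as imbalanced when its leader is not the first assigned replica.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Partition:
    topic: str
    replicas: tuple  # assigned replica broker ids; first entry is the preferred leader
    isr: tuple       # in-sync replica broker ids
    leader: int      # current leader broker id, -1 when there is no leader


@dataclass
class ClusterView:
    """Hypothetical stand-in for the controller's in-memory view of the cluster."""
    partitions: list
    topics_being_created: set = field(default_factory=set)
    topics_being_deleted: set = field(default_factory=set)

    def stable_partitions(self):
        # Keep only partitions whose topic is neither being created nor deleted.
        in_flux = self.topics_being_created | self.topics_being_deleted
        return [p for p in self.partitions if p.topic not in in_flux]


def under_replicated(parts):
    # Simplified assumption: fewer in-sync replicas than assigned replicas.
    return sum(1 for p in parts if len(p.isr) < len(p.replicas))


def offline(parts):
    # Simplified assumption: a partition without a leader is offline.
    return sum(1 for p in parts if p.leader == -1)


def preferred_replica_imbalance(parts):
    # Simplified assumption: leader differs from the first assigned replica.
    return sum(1 for p in parts if p.leader != p.replicas[0])


cluster = ClusterView(
    partitions=[
        Partition("orders", replicas=(1, 2, 3), isr=(1, 2, 3), leader=1),
        # A short-lived topic caught in the middle of deletion.
        Partition("tmp-42", replicas=(2, 3, 1), isr=(2,), leader=-1),
    ],
    topics_being_deleted={"tmp-42"},
)

# Counting over every partition: the topic being deleted trips all three metrics.
naive = (
    under_replicated(cluster.partitions),
    offline(cluster.partitions),
    preferred_replica_imbalance(cluster.partitions),
)

# Counting over stable topics only: the healthy cluster reports zeros.
stable = cluster.stable_partitions()
filtered = (
    under_replicated(stable),
    offline(stable),
    preferred_replica_imbalance(stable),
)
```

      In this model the unfiltered counts are driven entirely by the topic that is being deleted, while the filtered counts reflect only the healthy topic. A real change would have to take the in-flux sets from the controller's own creation and deletion state rather than from a hand-built set.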

            People

              Assignee: ecomar Edoardo Comar
              Reporter: tcrayford-heroku Tom Crayford