Uploaded image for project: 'Kafka'
  1. Kafka
  2. KAFKA-12339

Add retry to admin client's listOffsets

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Blocker
    • Resolution: Fixed
    • 2.5.2, 2.8.0, 2.7.1, 2.6.2
    • 2.5.2, 2.8.0, 2.7.1, 2.6.2
    • None
    • None

    Description

      After upgrading our connector env to 2.9.0-SNAPSHOT, sometimes the connect cluster encounters following error.

      Uncaught exception in herder work thread, exiting:  (org.apache.kafka.connect.runtime.distributed.DistributedHerder:324)

      org.apache.kafka.connect.errors.ConnectException: Error while getting end offsets for topic 'connect-storage-topic-connect-cluster-1'

      at org.apache.kafka.connect.util.TopicAdmin.endOffsets(TopicAdmin.java:689)

      at org.apache.kafka.connect.util.KafkaBasedLog.readToLogEnd(KafkaBasedLog.java:338)

      at org.apache.kafka.connect.util.KafkaBasedLog.start(KafkaBasedLog.java:195)

      at org.apache.kafka.connect.storage.KafkaStatusBackingStore.start(KafkaStatusBackingStore.java:216)

      at org.apache.kafka.connect.runtime.AbstractHerder.startServices(AbstractHerder.java:129)

      at org.apache.kafka.connect.runtime.distributed.DistributedHerder.run(DistributedHerder.java:310)

      at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)

      at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)

      at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)

      at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)

      at java.base/java.lang.Thread.run(Thread.java:834)

      Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition.

      at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)

      at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)

      at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)

      at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)

      at org.apache.kafka.connect.util.TopicAdmin.endOffsets(TopicAdmin.java:668)

      ... 10 more

      https://github.com/apache/kafka/pull/9780 added shared admin to get end offsets. KafkaAdmin#listOffsets does not handle topic-level error, hence the UnknownTopicOrPartitionException on topic-level can obstruct worker from running when the new internal topic is NOT synced to all brokers.

      Attachments

        Issue Links

          Activity

            People

              chia7712 Chia-Ping Tsai
              chia7712 Chia-Ping Tsai
              Randall Hauch Randall Hauch
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: