KAFKA-12339: Add retry to admin client's listOffsets

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.5.2, 2.8.0, 2.7.1, 2.6.2
    • Fix Version/s: 2.5.2, 2.8.0, 2.7.1, 2.6.2
    • Component/s: None
    • Labels: None

      Description

      After upgrading our connector environment to 2.9.0-SNAPSHOT, the Connect cluster sometimes encounters the following error:

      Uncaught exception in herder work thread, exiting:  (org.apache.kafka.connect.runtime.distributed.DistributedHerder:324)
      org.apache.kafka.connect.errors.ConnectException: Error while getting end offsets for topic 'connect-storage-topic-connect-cluster-1'
          at org.apache.kafka.connect.util.TopicAdmin.endOffsets(TopicAdmin.java:689)
          at org.apache.kafka.connect.util.KafkaBasedLog.readToLogEnd(KafkaBasedLog.java:338)
          at org.apache.kafka.connect.util.KafkaBasedLog.start(KafkaBasedLog.java:195)
          at org.apache.kafka.connect.storage.KafkaStatusBackingStore.start(KafkaStatusBackingStore.java:216)
          at org.apache.kafka.connect.runtime.AbstractHerder.startServices(AbstractHerder.java:129)
          at org.apache.kafka.connect.runtime.distributed.DistributedHerder.run(DistributedHerder.java:310)
          at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
          at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
          at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
          at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
          at java.base/java.lang.Thread.run(Thread.java:834)
      Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition.
          at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
          at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
          at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
          at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
          at org.apache.kafka.connect.util.TopicAdmin.endOffsets(TopicAdmin.java:668)
          ... 10 more

      https://github.com/apache/kafka/pull/9780 added a shared admin client that is used to get end offsets. KafkaAdminClient#listOffsets does not handle topic-level errors, so a topic-level UnknownTopicOrPartitionException can prevent the worker from starting when a newly created internal topic has NOT yet been synced to all brokers.
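
      To illustrate the failure mode, here is a minimal caller-side sketch of retrying a call whose future fails with a retriable cause. It is not the change made to the admin client for this issue. The class `ListOffsetsRetrySketch`, the helper `callWithRetries`, and the exception `RetriableStandIn` are hypothetical names; `RetriableStandIn` stands in for Kafka's `RetriableException` (which `UnknownTopicOrPartitionException` extends) so that the example runs without the Kafka client jar.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: retry a call when its ExecutionException wraps a retriable cause.
final class ListOffsetsRetrySketch {

    // Stand-in (assumption) for org.apache.kafka.common.errors.RetriableException.
    static class RetriableStandIn extends RuntimeException {
        RetriableStandIn(String message) {
            super(message);
        }
    }

    // Invokes the call, retrying up to maxRetries times with a fixed backoff
    // when the failure cause is retriable. Any other failure is rethrown at once.
    static <T> T callWithRetries(Callable<T> call, int maxRetries, long backoffMs) throws Exception {
        int attempt = 0;
        while (true) {
            try {
                return call.call();
            } catch (ExecutionException e) {
                boolean retriable = e.getCause() instanceof RetriableStandIn;
                if (!retriable || attempt >= maxRetries) {
                    throw e;
                }
            }
            attempt++;
            Thread.sleep(backoffMs);
        }
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger calls = new AtomicInteger();
        // Simulates a broker that has not yet learned about the topic on the first two calls.
        long endOffset = callWithRetries(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new ExecutionException(
                    new RetriableStandIn("This server does not host this topic-partition."));
            }
            return 42L;
        }, 5, 10L);
        System.out.println("end offset = " + endOffset + " after " + calls.get() + " attempts");
    }
}
```

      In real code the `Callable` would wrap the blocking `get()` on the future returned by the admin client, and the retriable check would test for `RetriableException` instead of the stand-in.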

              People

              • Assignee: Chia-Ping Tsai (chia7712)
              • Reporter: Chia-Ping Tsai (chia7712)
              • Reviewer: Randall Hauch