Kafka / KAFKA-12339

Add retry to admin client's listOffsets


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.5.2, 2.8.0, 2.7.1, 2.6.2
    • Fix Version/s: 2.5.2, 2.8.0, 2.7.1, 2.6.2
    • Component/s: None
    • Labels: None

    Description

      After upgrading our connector environment to 2.9.0-SNAPSHOT, the Connect cluster sometimes encounters the following error:

      Uncaught exception in herder work thread, exiting:  (org.apache.kafka.connect.runtime.distributed.DistributedHerder:324)
      org.apache.kafka.connect.errors.ConnectException: Error while getting end offsets for topic 'connect-storage-topic-connect-cluster-1'
          at org.apache.kafka.connect.util.TopicAdmin.endOffsets(TopicAdmin.java:689)
          at org.apache.kafka.connect.util.KafkaBasedLog.readToLogEnd(KafkaBasedLog.java:338)
          at org.apache.kafka.connect.util.KafkaBasedLog.start(KafkaBasedLog.java:195)
          at org.apache.kafka.connect.storage.KafkaStatusBackingStore.start(KafkaStatusBackingStore.java:216)
          at org.apache.kafka.connect.runtime.AbstractHerder.startServices(AbstractHerder.java:129)
          at org.apache.kafka.connect.runtime.distributed.DistributedHerder.run(DistributedHerder.java:310)
          at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
          at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
          at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
          at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
          at java.base/java.lang.Thread.run(Thread.java:834)
      Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition.
          at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
          at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
          at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
          at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
          at org.apache.kafka.connect.util.TopicAdmin.endOffsets(TopicAdmin.java:668)
          ... 10 more

      https://github.com/apache/kafka/pull/9780 added a shared admin client that Connect uses to get end offsets. KafkaAdmin#listOffsets does not handle topic-level errors, so a topic-level UnknownTopicOrPartitionException can prevent the worker from starting when the new internal topic has NOT yet been synced to all brokers.
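
The retry idea in the title can be illustrated with a small caller-side sketch. This is not the change that was merged for this issue, and it does not use the Kafka client library: `EndOffsetsRetry`, `callWithRetry`, and `RetriableLookupException` are hypothetical names, and `RetriableLookupException` only stands in for a retriable error such as UnknownTopicOrPartitionException. The sketch assumes the error is transient (the topic metadata has not reached every broker yet), so a bounded number of attempts with a fixed backoff is enough.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.atomic.AtomicInteger;

final class EndOffsetsRetry {

    /** Hypothetical stand-in for a retriable error such as UnknownTopicOrPartitionException. */
    static class RetriableLookupException extends RuntimeException {
        RetriableLookupException(String message) {
            super(message);
        }
    }

    /**
     * Calls the lookup up to maxAttempts times, sleeping backoffMs between attempts.
     * Only failures that are (or wrap, via ExecutionException) a RetriableLookupException
     * are retried; any other failure is rethrown immediately.
     */
    static <T> T callWithRetry(Callable<T> lookup, int maxAttempts, long backoffMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return lookup.call();
            } catch (Exception e) {
                // Futures report failures wrapped in ExecutionException, so unwrap one level.
                Throwable cause = (e instanceof ExecutionException && e.getCause() != null) ? e.getCause() : e;
                if (!(cause instanceof RetriableLookupException)) {
                    throw e;
                }
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMs);
                }
            }
        }
        // All attempts failed with a retriable error; surface the last one.
        throw last;
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger calls = new AtomicInteger();
        // Simulated lookup: fails twice as if the topic were not yet known, then succeeds.
        long endOffset = callWithRetry(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new ExecutionException(
                        new RetriableLookupException("This server does not host this topic-partition."));
            }
            return 42L;
        }, 5, 10L);
        System.out.println("end offset " + endOffset + " after " + calls.get() + " attempts");
    }
}
```

In a real worker the lambda would wrap the actual end-offsets call, and the attempt limit and backoff would come from configuration rather than constants. Retrying only a known-transient error keeps genuine failures, such as authorization errors, visible on the first attempt.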


          People

            Assignee: Chia-Ping Tsai (chia7712)
            Reporter: Chia-Ping Tsai (chia7712)
            Reviewer: Randall Hauch