KAFKA-989

Race condition shutting down high-level consumer results in spinning background thread

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: 0.8.0
    • Component/s: None
    • Labels: None
    • Environment: Ubuntu Linux x64

      Description

      An application that uses the Kafka client under load can often hit this issue within a few hours.

      High-level consumers come and go over this application's lifecycle, but there are a variety of defenses that ensure each high-level consumer lasts several seconds before being shut down. Nevertheless, some race is causing this background thread to continue long after the ZkClient it is using has been disconnected. Since the thread was spawned by a consumer that has already been shut down, the application has no way to find the thread and stop it.

      Reported on the users-kafka mailing list 6/25 as "0.8 throwing exception 'Failed to find leader' and high-level consumer fails to make progress".

      The only remedy is to shut down the application and restart it. Externally detecting that this state has occurred is not pleasant: you need to grep the log for repeated occurrences of the same exception.

      Stack trace:

      Failed to find leader for Set([topic6,0]): java.lang.NullPointerException
      at org.I0Itec.zkclient.ZkClient$2.call(ZkClient.java:416)
      at org.I0Itec.zkclient.ZkClient$2.call(ZkClient.java:413)
      at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
      at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:413)
      at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:409)
      at kafka.utils.ZkUtils$.getChildrenParentMayNotExist(ZkUtils.scala:438)
      at kafka.utils.ZkUtils$.getAllBrokersInCluster(ZkUtils.scala:75)
      at kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:63)
      at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)

        Activity

        Phil Hargett added a comment -

        This patch may minimize the issue, as there does seem to be a race between startConnections / stopConnections in ConsumerFetcherManager and the doWork method of the inner LeaderFinderThread class.

        It seems that a thread could be started (from startConnections) but shut down (from stopConnections) before the leader finder thread even started to do work.
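
        For illustration, here is a minimal sketch of that kind of interleaving, using invented names (this is not the actual ConsumerFetcherManager code): if stopConnections runs in the window before startConnections publishes the new thread, the shutdown signal is lost and the thread keeps running.

        class FetcherManagerSketch {
          @volatile private var finderThread: Option[Thread] = None

          def startConnections(): Unit = {
            val t = new Thread(new Runnable {
              def run(): Unit =
                try {
                  while (!Thread.currentThread().isInterrupted)
                    Thread.sleep(200) // stand-in for doWork(), which would use the shared ZkClient
                } catch { case _: InterruptedException => () } // exit once interrupted
            }, "leader-finder-sketch")
            // window: if stopConnections() runs here it sees finderThread == None,
            // has nothing to stop, and the thread started below is never shut down
            finderThread = Some(t)
            t.start()
          }

          def stopConnections(): Unit = {
            finderThread.foreach(_.interrupt())
            finderThread = None
          }
        }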

        Phil Hargett added a comment - edited

        Here's the patch: KAFKA-989-failed-to-find-leader.patch.

        Phil Hargett added a comment -

        Not good enough. It deadlocks because ShutdownableThread.shutdown grabs another lock.

        Phil Hargett added a comment -

        When in doubt about how to fix a locking issue...add another lock.

        While the real race here involves startConnections / stopConnections in ConsumerFetcherManager, the real trigger for such races appears to be the lack of protection in the shutdown and rebalance operations on ZookeeperConsumerConnector. There is nothing to prevent a rebalance while a shutdown is in progress, and it would appear that could trigger the race in ConsumerFetcherManager.

        The patch I'm attaching (see KAFKA-989-failed-to-find-leader-patch2.patch) adds a shutdown lock that is grabbed first in both shutdown() and in the run method of the ZKRebalancerListener. This should prevent a rebalance from happening on a consumer that has already shut down, keep the fetcher and the zkclient out of intermediate states, and thus prevent the race.
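
        A rough sketch of the shape of that change, with invented names (a simplification, not the attached patch itself): both shutdown() and the rebalance path take the same lock first, so a rebalance cannot restart fetchers on a connector that is already shutting down.

        import java.util.concurrent.locks.ReentrantLock

        class ConsumerConnectorSketch {
          private val shutdownLock = new ReentrantLock()
          @volatile private var isShutdown = false

          def shutdown(): Unit = {
            shutdownLock.lock()
            try {
              isShutdown = true
              // stop fetchers, then close the zkclient
            } finally shutdownLock.unlock()
          }

          // stands in for ZKRebalancerListener.run / a triggered rebalance
          def onRebalance(): Unit = {
            shutdownLock.lock()
            try {
              if (!isShutdown) {
                // safe to (re)start fetchers: shutdown() cannot interleave here
              }
            } finally shutdownLock.unlock()
          }
        }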

        Jun Rao added a comment -

        Thanks for the patch. I think it addresses one particular issue: When the consumer connector is shut down, there could still be an outstanding rebalance that uses zkclient which is already set to null. I am not sure if it addresses the problem that you hit though. Some comments:

        1. Since syncedRebalance() is called in multiple places, the shutdown lock should be checked inside syncedRebalance(). Since there is already a rebalanceLock in syncedRebalance(), perhaps shutdown() can just synchronize on that.

        Phil Hargett added a comment -

        Thanks for the feedback! Working on a patch that incorporates your suggestions now.

        FWIW, I think this will indirectly address my original situation. I think the reason the original stack trace occurred is that a rebalance started a fetcher with a LeaderFinderThread while the consumer itself was in the process of being shut down. The LeaderFinderThread is left with a stale ZkClient and has no other way to know that it should shut down. Since the fetcher has already set its reference to the LeaderFinderThread to null, the running state of the thread will never be changed to shut the thread off.

        I'm stuck trying to find an acceptable solution in LeaderFinderThread; this change in ZookeeperConsumerConnector may have to do until then.

        Will post new patch once I have it coded and tested.
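
        As an aside, a minimal sketch of the spinning-thread scenario described above (heavily simplified; not the real ShutdownableThread or LeaderFinderThread): the loop exits only when someone flips its running flag, so once the owner has dropped its reference to the thread, nothing can ever stop it and it keeps retrying against a closed ZkClient.

        import java.util.concurrent.atomic.AtomicBoolean

        abstract class ShutdownableThreadSketch(name: String) extends Thread(name) {
          private val running = new AtomicBoolean(true)

          def doWork(): Unit // e.g. look up brokers via the shared ZkClient

          override def run(): Unit =
            while (running.get) {
              try doWork()
              catch { case t: Throwable => println(s"Failed to find leader: $t") } // log and retry
            }

          def shutdown(): Unit = running.set(false) // never called once the reference is lost
        }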

        Jun Rao added a comment -

        Hmm, in the shutdown logic of the consumer connector, we set zkclient to null last. So all fetchers and the leader finder thread should already have been stopped by the time zkclient is null.

        Phil Hargett added a comment -

        Yes, but my working hypothesis is that because there are at least two sets of races (in the consumer connector's syncedRebalance/shutdown, and in ConsumerFetcherManager's startConnections/stopConnections), it is actually possible to have a LeaderFinderThread still running that has not been shut down even though its consumer has, because a stopConnections call completed before a startConnections call finished. So there is a started leader finder thread, but its ZkClient has been closed.

        The key, I think, is that there is no guarantee that, while the consumer connector is shutting down, a rebalance event won't actually start up another leader finder thread (by starting fetchers again).

        I believe the race in ConsumerFetcherManager is unlikely to happen if the race in ZookeeperConsumerConnector is fixed instead. Thus I avoid fixing the harder race by fixing an easier one that may be its only trigger (at present).

        Phil Hargett added a comment -

        Changed to reuse the existing rebalanceLock in shutdown() rather than adding yet another lock. Modified both shutdown and syncedRebalance accordingly.
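
        A minimal sketch of that shape (simplified; apart from rebalanceLock the names are approximations, and this is not the committed patch): shutdown() and syncedRebalance() serialize on the same rebalanceLock, and the rebalance path bails out once the connector is shutting down. Reusing the existing monitor keeps the fix to a single lock rather than adding yet another one.

        class ZkConsumerConnectorSketch {
          private val rebalanceLock = new Object
          @volatile private var isShuttingDown = false

          def shutdown(): Unit = rebalanceLock.synchronized {
            isShuttingDown = true
            // stop the fetcher manager and fetchers, then close the ZkClient last
          }

          def syncedRebalance(): Unit = rebalanceLock.synchronized {
            if (!isShuttingDown) {
              // release partition ownership, recompute assignment, restart fetchers
            }
          }
        }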

        Phil Hargett added a comment -

        I also think I've convinced myself that, while the race in ConsumerFetcherManager is not ideal, the real resource to protect is not the leader finder thread but the shared ZkClient instance, which is managed by ZookeeperConsumerConnector, where these fixes are made.

        By reducing the races in the consumer connector, we're less likely to mismanage the ZkClient.

        Jun Rao added a comment -

        Thanks for patch v3. Committed to 0.8.

        Phil Hargett added a comment -

        Thank you!


          People

          • Assignee: Phil Hargett
          • Reporter: Phil Hargett
          • Votes: 0
          • Watchers: 2
