Description
With a RetryNTimes retry policy and a disconnect, the event thread can get blocked on retries from the SharedValue watcher's readValue call, which keeps other listeners from getting the SUSPENDED event until the retries complete.
It seems the watcher should limit its work and notifications to valid change events and ignore a disconnect. The ConnectionStateListener already handles disconnects.
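A minimal sketch of the filtering suggested above, not the actual SharedValue code. It relies on the fact that ZooKeeper delivers connection state changes to watchers as events of type None. The EventType enum here is a local stand-in for org.apache.zookeeper.Watcher.Event.EventType, and FilteringWatcher, countReads, and the Runnable standing in for readValue are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class SharedValueWatcherSketch {
    // Local stand-in for org.apache.zookeeper.Watcher.Event.EventType.
    // In ZooKeeper, connection state changes reach watchers as type None.
    enum EventType { None, NodeCreated, NodeDeleted, NodeDataChanged, NodeChildrenChanged }

    // Hypothetical watcher that re-reads the value only on real node events.
    static final class FilteringWatcher {
        // Stand-in for the blocking, retrying read (readValue in the stack trace).
        private final Runnable readValue;

        FilteringWatcher(Runnable readValue) {
            this.readValue = readValue;
        }

        void process(EventType type) {
            if (type == EventType.None) {
                // Connection event: do nothing on the event thread and leave
                // handling to the ConnectionStateListener.
                return;
            }
            readValue.run();
        }
    }

    // Feeds the events to a FilteringWatcher and returns how many reads ran.
    static int countReads(List<EventType> events) {
        int[] reads = {0};
        FilteringWatcher watcher = new FilteringWatcher(() -> reads[0]++);
        for (EventType event : events) {
            watcher.process(event);
        }
        return reads[0];
    }

    public static void main(String[] args) {
        List<EventType> events = Arrays.asList(
                EventType.None, EventType.NodeDataChanged, EventType.None);
        System.out.println("reads triggered: " + countReads(events));
    }
}
```

With this kind of check, a disconnect would no longer start a retry loop on the event thread, so other listeners could receive SUSPENDED without waiting for the retries to finish.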
Sample thread stack that blocks other listeners:
"main-EventThread" daemon prio=10 tid=0x00007f95d009f000 nid=0x3429 waiting on condition [0x00007f959d6d5000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at java.lang.Thread.sleep(Thread.java:340)
    at java.util.concurrent.TimeUnit.sleep(TimeUnit.java:360)
    at org.apache.curator.RetryLoop$1.sleepFor(RetryLoop.java:74)
    at org.apache.curator.retry.SleepingRetry.allowRetry(SleepingRetry.java:46)
    at org.apache.curator.retry.RetryNTimes.allowRetry(RetryNTimes.java:24)
    at org.apache.curator.RetryLoop.takeException(RetryLoop.java:188)
    at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:112)
    at org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:287)
    at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:279)
    at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:41)
    at org.apache.curator.framework.recipes.shared.SharedValue.readValue(SharedValue.java:192)
    - locked <0x000000074326fb50> (a org.apache.curator.framework.recipes.shared.SharedValue)
    at org.apache.curator.framework.recipes.shared.SharedValue.access$100(SharedValue.java:42)
    at org.apache.curator.framework.recipes.shared.SharedValue$1.process(SharedValue.java:58)
    at org.apache.curator.framework.imps.NamespaceWatcher.process(NamespaceWatcher.java:67)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:505)