Apache Curator / CURATOR-45

LeaderSelector threw exception, but still created ephemeral node, breaking everything

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0-incubating
    • Fix Version/s: 2.3.0
    • Component/s: Framework, Recipes
    • Labels: None

      Description

      ZooKeeper hiccupped, and then this happened:

      2013-06-19 02:23:35,561 DEBUG [LeaderSelector-1] com.netflix.curator.RetryLoop.takeException (RetryLoop.java:184) - Retry-able exception received
      org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /[REMOVED]/election/_c_1ccdb2b9-7f9a-4570-9555-201c91ec2dcb-lock-
          at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) ~[zookeeper-3.5.0.jar:3.5.0--1]
          at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) ~[zookeeper-3.5.0.jar:3.5.0--1]
          at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:876) ~[zookeeper-3.5.0.jar:3.5.0--1]
          at com.netflix.curator.framework.imps.CreateBuilderImpl$10.call(CreateBuilderImpl.java:625) ~[curator-framework-1.3.5-SNAPSHOT.jar:?]
          at com.netflix.curator.framework.imps.CreateBuilderImpl$10.call(CreateBuilderImpl.java:609) ~[curator-framework-1.3.5-SNAPSHOT.jar:?]
          at com.netflix.curator.RetryLoop.callWithRetry(RetryLoop.java:106) [curator-client-1.3.5-SNAPSHOT.jar:?]
          at com.netflix.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:605) [curator-framework-1.3.5-SNAPSHOT.jar:?]
          at com.netflix.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:428) [curator-framework-1.3.5-SNAPSHOT.jar:?]
          at com.netflix.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:41) [curator-framework-1.3.5-SNAPSHOT.jar:?]
          at com.netflix.curator.framework.recipes.locks.LockInternals.attemptLock(LockInternals.java:218) [curator-recipes-1.3.5-SNAPSHOT.jar:?]
          at com.netflix.curator.framework.recipes.locks.InterProcessMutex.internalLock(InterProcessMutex.java:218) [curator-recipes-1.3.5-SNAPSHOT.jar:?]
          at com.netflix.curator.framework.recipes.locks.InterProcessMutex.acquire(InterProcessMutex.java:74) [curator-recipes-1.3.5-SNAPSHOT.jar:?]
          at com.netflix.curator.framework.recipes.leader.LeaderSelector.doWork(LeaderSelector.java:314) [curator-recipes-1.3.5-SNAPSHOT.jar:?]
          at com.netflix.curator.framework.recipes.leader.LeaderSelector.doWorkLoop(LeaderSelector.java:373) [curator-recipes-1.3.5-SNAPSHOT.jar:?]
          at com.netflix.curator.framework.recipes.leader.LeaderSelector.access$100(LeaderSelector.java:46) [curator-recipes-1.3.5-SNAPSHOT.jar:?]
          at com.netflix.curator.framework.recipes.leader.LeaderSelector$2.call(LeaderSelector.java:195) [curator-recipes-1.3.5-SNAPSHOT.jar:?]
          at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) [?:1.6.0_27]
          at java.util.concurrent.FutureTask.run(FutureTask.java:166) [?:1.6.0_27]
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146) [?:1.6.0_27]
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.6.0_27]
          at java.lang.Thread.run(Thread.java:679) [?:1.6.0_27]

      Despite the exception, the ephemeral node was still created, and leader election for this path then hung.

      I'm investigating where to add an extra guaranteed delete. LockInternals already has a code path that sometimes performs this cleanup, but it did not trigger in this case.
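
As a rough illustration of the kind of cleanup in question, the sketch below looks for a leftover node by the `_c_<uuid>-` prefix that appears in the path logged above. It is not Curator's implementation. The class name `OrphanedNodeFinder`, the method `findByProtectionId`, the second child name, and the sequence suffixes are hypothetical and exist only for the example. It assumes Java 8 or later.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class OrphanedNodeFinder {
    // Prefix seen in the logged path: "_c_" + UUID + "-" + "lock-"
    static final String PROTECTED_PREFIX = "_c_";

    /**
     * Returns the first child whose name starts with "_c_<protectionId>-", if any.
     * Hypothetical helper: it only inspects names and performs no ZooKeeper calls.
     */
    public static Optional<String> findByProtectionId(List<String> children, String protectionId) {
        String marker = PROTECTED_PREFIX + protectionId + "-";
        return children.stream()
                .filter(name -> name.startsWith(marker))
                .findFirst();
    }

    public static void main(String[] args) {
        // Assumed children of the election path; the sequence suffixes are made up.
        List<String> children = Arrays.asList(
                "_c_1ccdb2b9-7f9a-4570-9555-201c91ec2dcb-lock-0000000007",
                "_c_aaaaaaaa-0000-0000-0000-000000000000-lock-0000000008");

        // The UUID this client used for the create() that failed with ConnectionLoss.
        Optional<String> orphan =
                findByProtectionId(children, "1ccdb2b9-7f9a-4570-9555-201c91ec2dcb");

        // A node found here was created server-side even though the client saw an exception.
        System.out.println(orphan.orElse("<none>"));
    }
}
```

In a real client the child list would presumably come from `getChildren().forPath(...)`, and a match would be removed with `delete().guaranteed().forPath(...)`. The sketch leaves those calls out so that it runs without a ZooKeeper ensemble.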

      You must really love our bugs by now.

        Attachments

        1. CURATOR-45.patch (25 kB, Michael Morello)

              People

              • Assignee: randgalt Jordan Zimmerman
              • Reporter: arren Shevek
              • Votes: 1
              • Watchers: 5
