  1. Apache Curator
  2. CURATOR-439

CuratorFrameworkState STARTED, but ZookeeperClient not connected

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.2.1
    • Fix Version/s: None
    • Component/s: Documentation
    • Labels: None

    Description

      I recently ran into an issue on some of our nodes caused by network problems between a service and Zookeeper. I have not been able to reproduce it yet, but I'm still trying.

      Setup
      5 services using Curator 3.2.1 to talk to a Zookeeper 3.5.3 cluster (also 5 nodes).

      Network issues caused the services to disconnect from Zookeeper.

      There's a check in our code to see if the Zookeeper connection is available before sending a request:

      public boolean isConnected() {
          return curatorFramework.getZookeeperClient().isConnected();
      }

      After the network issues were resolved, we noticed that all calls to Zookeeper from 4 of the services were still failing (the fifth was fine). Checking the logs, we saw that CuratorFramework.getState() was reporting the state as STARTED, but curatorFramework.getZookeeperClient().isConnected() was returning false. Restarting the service fixed everything, but I obviously want to avoid this issue in future.

      Problem
      I couldn't find any documentation stating which check is preferred: CuratorZookeeperClient.isConnected(), or CuratorFramework.getState() == CuratorFrameworkState.STARTED (which is what the deprecated CuratorFramework.isStarted() does). It is also unclear whether the two are meant to be equivalent, in which case there is a bug that lets one be true while the other is false.

      If my own check is wrong, and I shouldn't be using CuratorZookeeperClient.isConnected(), then I can easily fix that. I wanted to check the expected behaviour before diving too deep into this, in case this is normal and I am just using Curator incorrectly.

      Edit

      This was a misunderstanding on my part. I'm leaving it open so that I can submit a documentation/example update shortly to hopefully clarify things a bit better for others.
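
      For anyone landing here with the same question, here is a rough sketch of one way to track connectivity separately from the framework state. It rests on the fact that CuratorFrameworkState (LATENT, STARTED, STOPPED) describes the client's lifecycle rather than its connection, so STARTED while disconnected is possible. The ConnectionTracker class and its State enum are hypothetical stand-ins so that the snippet compiles without Curator on the classpath. The enum mirrors the constant names of Curator's ConnectionState, and treating READ_ONLY as "not connected" is an assumption that may not suit every service.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ConnectionTracker {
    // Hypothetical stand-in for org.apache.curator.framework.state.ConnectionState.
    public enum State { CONNECTED, SUSPENDED, RECONNECTED, LOST, READ_ONLY }

    // Last known connectivity; false until the first CONNECTED event arrives.
    private final AtomicBoolean connected = new AtomicBoolean(false);

    // Intended to be called from a ConnectionStateListener's
    // stateChanged(client, newState) callback, registered via
    // curatorFramework.getConnectionStateListenable().addListener(...).
    public void onStateChanged(State newState) {
        switch (newState) {
            case CONNECTED:
            case RECONNECTED:
                connected.set(true);
                break;
            case SUSPENDED:
            case LOST:
                connected.set(false);
                break;
            default:
                // READ_ONLY: reads may work but writes would fail.
                // Assumption: treat it as not connected.
                connected.set(false);
        }
    }

    // Cheap check to run before sending a request.
    public boolean isConnected() {
        return connected.get();
    }
}
```

      With something like this in place, the pre-request check reads the tracker instead of comparing getState() with STARTED. It stays false after a SUSPENDED or LOST event until a RECONNECTED event arrives.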

          People

            Assignee: Unassigned
            Reporter: Kenco Alex Rankin
            Votes: 0
            Watchers: 3
