ZOOKEEPER-710

permanent ZSESSIONMOVED error after client app reconnects to ZooKeeper cluster

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 3.2.2
    • Fix Version/s: 3.2.3, 3.3.0
    • Component/s: server
    • Labels:
      None
    • Environment:

      debian lenny; ia64; xen virtualization

    • Hadoop Flags:
      Reviewed

      Description

      The problem was originally described on the users mailing list, starting with this post.
      Below I restate it in more organized form.

      We occasionally (a few times a day) observe that our client application disconnects from the ZooKeeper cluster.
      The application is written in C++ and uses the libzookeeper_mt library, version 3.2.2.

      The disconnects we are observing are probably related to problems with our network infrastructure: we see periods of heavy packet loss between machines in our DC.

      Sometimes, after the client application (i.e. the ZooKeeper library) reconnects to the ZooKeeper cluster, all subsequent requests return the ZSESSIONMOVED error. Restarting the client app helps; we always pass 0 as the clientid to the zookeeper_init function, so the old session is not reused.

      On 16-03-2010 we observed a few occurrences of the problem. Examples:

      • 22:08; client IP 10.1.112.60 (app1); sessionID 0x22767e1c9630000
      • 14:21; client IP 10.1.112.61 (app2); sessionID 0x324dcc1ba580085

      I attach logs of the cluster and application nodes (only the parts concerning ZooKeeper):

      I also did some analysis of the case at 22:08:

      • The network glitch that triggered the problem occurred at about 22:08.
      • From what I can see, node2 had been the leader since 17:48 and this
        did not change later that day.
      • The client had been connected to node2 since 17:50.
      • At around 22:09 the client tried to connect to every node (1, 2, 3).
        Connections to node1 and node3 were closed with the exception
        "Exception causing close of session 0x22767e1c9630000
        due to java.io.IOException: Read error".
        The connection to node2 stayed alive.
      • All subsequent operations were refused with the ZSESSIONMOVED error,
        visible both on the client and on the server side.
      Attachments

      1. zookeeper-node1.log.2010-03-16.gz
        178 kB
        Lukasz Osipiuk
      2. zookeeper-node2.log.2010-03-16.gz
        602 kB
        Lukasz Osipiuk
      3. zookeeper-node3.log.2010-03-16.gz
        215 kB
        Lukasz Osipiuk
      4. app1.log.2010-03-16.gz
        176 kB
        Lukasz Osipiuk
      5. app2.log.2010-03-16.gz
        12 kB
        Lukasz Osipiuk
      6. ZOOKEEPER-710_3.3.patch
        8 kB
        Patrick Hunt
      7. ZOOKEEPER-710_3.2.patch
        11 kB
        Patrick Hunt

        Activity

        Hudson added a comment -

        Integrated in ZooKeeper-trunk #763 (See http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/763/)
        Forgot to run ant docs

        Mahadev konar added a comment -

        I just committed this to the 3.2 branch. Thanks, Pat!

        Lukasz Osipiuk added a comment -

        I will build a version with the patch and deploy it on Monday. But since I will also be tweaking the client timeout parameter and GC, and the bug itself is not manually reproducible, it won't be the best check of whether the fix is working.

        Patrick Hunt added a comment -

        Lukasz, feel free to try the attached 3.2 patch. It passes all tests on my system. You should be able to patch/compile against the 3.2 release (patch -p0 < patchfile) or the svn 3.2 branch.

        It would be great if you could try this out and verify that it fixes the MOVED issue you've been seeing.

        Benjamin Reed added a comment -

        Committed revision 925104.

        Patrick Hunt added a comment -

        This 3.2 patch is against the current 3.2 svn branch.

        It fixes the issue in my tests.

        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12439239/ZOOKEEPER-710_3.3.patch
        against trunk revision 924942.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/143/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/143/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/143/console

        This message is automatically generated.

        Benjamin Reed added a comment -

        +1. Great work, Pat! And thanks, Lukasz, for identifying this failure condition.

        Patrick Hunt added a comment -

        Fixes this session-moved issue by closing the invalid connection.

        Patrick Hunt added a comment -

        Marking this to be fixed in 3.3.0 and 3.2.3.

        Benjamin Reed added a comment -

        Lukasz, thank you for the logs! They have really helped pinpoint the problem. It appears that message delays are the cause: the client connects to a follower that is running very slowly, and the follower is slow to process the reconnect request and forward it to the leader. For example, the following shows a case where it takes 7 seconds for the follower to process a re-establish-connection request.

        zookeeper-node3.log.2010-03-16:2010-03-16 14:21:41,267 - INFO [NIOServerCxn.Factory:2181:NIOServerCnxn@964] - Finished init of 0x224ea5675b4007b valid:true
        zookeeper-node3.log.2010-03-16:2010-03-16 14:21:48,937 - INFO [NIOServerCxn.Factory:2181:NIOServerCnxn@612] - Renewing session 0x224ea5675b4007b

        The problem is that within those 7 seconds the client can drop the connection to the follower and establish a connection to the leader. If the renew-session request is delayed until after the client re-establishes the connection with the leader, the session gets marked as managed by the previous follower, and all requests sent to the leader get the MOVED error, which is what you are seeing.

        The underlying issue is that we treat the MOVED error as a request error rather than a connection error. The leader needs to drop the connection when it sees the MOVED error so that the client can re-establish the session properly with another machine.
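        The race and the fix can be sketched as a toy model, assuming a much-simplified leader that only tracks which connection owns each session (Leader, Conn, and Rc are illustrative names, not the actual ZooKeeper server classes):

```cpp
#include <cstdint>
#include <unordered_map>

// Toy model of leader-side session ownership; illustrative only.
enum class Rc { Ok, SessionMoved };

struct Conn {
    int id;
    bool open = true;
    void close() { open = false; }
};

class Leader {
    // sessionId -> id of the connection currently owning the session
    std::unordered_map<uint64_t, int> owner_;
public:
    // A (possibly delayed) session renew marks the session as owned by
    // whichever connection delivered it -- last writer wins, so a renew
    // that a slow follower forwards late can overwrite a newer owner.
    void renewSession(uint64_t session, const Conn& via) {
        owner_[session] = via.id;
    }

    // With the ZOOKEEPER-710 fix, MOVED is treated as a connection error:
    // the stale connection is closed so the client re-establishes the
    // session on another server instead of failing forever.
    Rc handleRequest(uint64_t session, Conn& via) {
        if (owner_[session] != via.id) {
            via.close();  // the fix: drop the invalid connection
            return Rc::SessionMoved;
        }
        return Rc::Ok;
    }
};
```

        In the logged scenario, the client renews its session directly with the leader, then the follower's delayed renew arrives and overwrites the owner; every request on the client's live connection then returns SessionMoved. Before the fix nothing ever closed that connection, which is why the error was permanent.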

        Lukasz Osipiuk added a comment -

        This is application stuff.

        Let's take the exists function from our code, which wraps the ZooKeeper library, as an example.
        Its declaration is:

        ZooKeeper::ErrorCode::Type ZooKeeper::exists(const std::string& path,
                                                     Stat &stat,
                                                     const WatchFunction& watchFunction,
                                                     WatchId& watchId,
                                                     apr_time_t timeout) 
        

        As you can see, the caller passes a high-level watchFunction to be called.
        Our wrapper maintains a map (node, operation type) -> watchFunction. The internal watch function used in the zoo_* calls
        is responsible for calling the user-provided watchFunctions stored in the map when a change concerns one of the nodes.

        When the client disconnects from ZooKeeper, all watches are invalidated (at least AFAIK), so we call all user-provided functions in that case.
        We do this as soon as the client has reconnected to ZooKeeper: we simulate a change having occurred on every node the application was interested in. That is what the fireAllWatchFunctions function does.

        Is it somewhat clear?
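        The bookkeeping described above can be sketched in a self-contained form (WatchRegistry and its members are hypothetical names; only fireAllWatchFunctions comes from the comment, and the real application code is not shown in this issue):

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical sketch of the wrapper's watch bookkeeping.
using WatchFunction = std::function<void(const std::string& path)>;

class WatchRegistry {
    // (node path, operation type) -> user-supplied watch function
    std::map<std::pair<std::string, int>, WatchFunction> watches_;
public:
    void add(const std::string& path, int opType, WatchFunction fn) {
        watches_[{path, opType}] = std::move(fn);
    }

    // The internal watcher passed to the zoo_* calls would dispatch here
    // when a change event arrives for a watched node.
    void fire(const std::string& path, int opType) {
        auto it = watches_.find({path, opType});
        if (it != watches_.end()) it->second(path);
    }

    // After a reconnect the wrapper assumes server-side watches were lost,
    // so it conservatively fires every registered watch function, as if a
    // change had occurred on every node the application watched.
    void fireAllWatchFunctions() {
        for (auto& entry : watches_) entry.second(entry.first.first);
    }
};
```

        The conservative fireAllWatchFunctions pass explains the "calling all watches" log line Mahadev asked about: on reconnect every registered watch fires, whether or not its node actually changed.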

        Mahadev konar added a comment - - edited

        Lukasz, can you elaborate on what this code is doing?

        Mar 16 14:21:49 XXXXXXXXXX-c1-mbox-2 XXXXXXXXXX_mbox[25112]: DEBUG: ZooKeeper: ZooKeeper::execWatch() executing watch
        Mar 16 14:21:49 XXXXXXXXXX-c1-mbox-2 XXXXXXXXXX_mbox[25112]: INFO: ZooKeeper: ZooKeeper::watcherFunHandleSessionEvent() state change to 3
        Mar 16 14:21:49 XXXXXXXXXX-c1-mbox-2 XXXXXXXXXX_mbox[25112]: INFO: ZooKeeper: ZooKeeper::fireAllWatchFunctions() calling all watches
        

        Which watches are you firing?

        Lukasz Osipiuk added a comment -

        Logs like

        ZooKeeper::getChildren() path='/ggmboxlocks/16314'; callback error='session moved to another server, so operation is ignored'

        are printed using log_error(logger, "ZooKeeper::getChildren() path='%s'; callback error='%s'", path.c_str(), zerror(retval)), where retval is the rc code from the callback.


          People

          • Assignee:
            Patrick Hunt
            Reporter:
            Lukasz Osipiuk
          • Votes:
            0
            Watchers:
            0
