ZooKeeper / ZOOKEEPER-1436

Add ZOO_TIMED_OUT_STATE session event to notify client about timeout during reconnection

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.4.3
    • Fix Version/s: None
    • Component/s: c client
    • Labels:

      Description

      The ZooKeeper C client knows how long its session will last and periodically pings the server to keep that session alive. However, if it loses its connection, it hops from ensemble member to ensemble member trying to re-establish the session, even after the session timeout has expired.

      This patch adds a new session event (ZOO_TIMED_OUT_STATE) that notifies the user that the session timeout has passed and the client has been unable to reconnect. The event is one-shot per disconnection and is generated by the C client library itself. The server has no knowledge of this event.
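
As a rough illustration of the one-shot behavior described above, the following self-contained sketch records when the connection was lost and reports the timeout at most once per disconnection. The `session_tracker` type and the `tracker_*` functions are hypothetical names made up for this sketch; they are not taken from the patch or from the C client, whose internal bookkeeping may well differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-handle bookkeeping (not the real zhandle_t).
 * Timestamps are non-negative milliseconds from any monotonic clock. */
typedef struct {
    int64_t session_timeout_ms;  /* session timeout known to the client */
    int64_t disconnected_at_ms;  /* when the connection was lost, -1 while connected */
    bool    timeout_reported;    /* true once the event fired for this disconnection */
} session_tracker;

void tracker_init(session_tracker *t, int64_t session_timeout_ms)
{
    assert(t != NULL && session_timeout_ms > 0);
    t->session_timeout_ms = session_timeout_ms;
    t->disconnected_at_ms = -1;
    t->timeout_reported = false;
}

/* Call when the connection drops; repeated calls while already
 * disconnected keep the original disconnection time. */
void tracker_on_disconnect(session_tracker *t, int64_t now_ms)
{
    assert(t != NULL);
    if (t->disconnected_at_ms < 0) {
        t->disconnected_at_ms = now_ms;
        t->timeout_reported = false;
    }
}

/* Call when a connection is re-established; re-arms the one-shot event. */
void tracker_on_reconnect(session_tracker *t)
{
    assert(t != NULL);
    t->disconnected_at_ms = -1;
    t->timeout_reported = false;
}

/* Returns true exactly once per disconnection, the first time it is
 * called after the session timeout has elapsed without a reconnect. */
bool tracker_should_fire_timeout(session_tracker *t, int64_t now_ms)
{
    assert(t != NULL);
    if (t->disconnected_at_ms < 0 || t->timeout_reported)
        return false;
    if (now_ms - t->disconnected_at_ms < t->session_timeout_ms)
        return false;
    t->timeout_reported = true;
    return true;
}
```

A caller would invoke `tracker_should_fire_timeout` from its reconnect loop and deliver the session event to the watcher whenever it returns true. Nothing here involves the server, which matches the statement that the event is purely client-side.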

      Example use cases:
      1. The client can try to reconnect to a different set of observers if it is unable to connect to the original set of observers.

      2. The client can quickly stop acting as an active server, since another server may already have taken over the active role while it is trying to reconnect.
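
The second use case could look roughly like the handler below. This is a minimal sketch that compiles without the ZooKeeper headers: the `EX_*` constants, `app_state`, and `on_session_event` are hypothetical names for illustration, and the numeric values are placeholders rather than the values defined by the C client or by the patch. In a real client the same logic would presumably sit inside the watcher function registered with `zookeeper_init`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Placeholder event type and states; NOT the real zookeeper.h values. */
enum ex_event_type { EX_SESSION_EVENT = 1 };
enum ex_state {
    EX_CONNECTING_STATE = 1,
    EX_CONNECTED_STATE,
    EX_TIMED_OUT_STATE,       /* stands in for the proposed ZOO_TIMED_OUT_STATE */
    EX_EXPIRED_SESSION_STATE
};

/* Hypothetical application state for an active/standby service. */
typedef struct {
    bool is_active;          /* currently serving requests as the active server */
    bool needs_new_session;  /* session is gone; a new handle must be created */
} app_state;

/* Reacts to session events only; other event types are ignored. */
void on_session_event(app_state *app, int type, int state)
{
    assert(app != NULL);
    if (type != EX_SESSION_EVENT)
        return;

    switch (state) {
    case EX_TIMED_OUT_STATE:
        /* The session timeout passed while disconnected: another server
         * may already be active, so stop serving. The client library
         * itself keeps trying to reconnect. */
        if (app->is_active) {
            app->is_active = false;
            fprintf(stderr, "session timed out locally, stepping down\n");
        }
        break;
    case EX_EXPIRED_SESSION_STATE:
        /* The server confirmed the session is gone. */
        app->is_active = false;
        app->needs_new_session = true;
        break;
    default:
        /* Connecting or connected: nothing to do in this sketch. */
        break;
    }
}
```

The handler only changes application state on a timeout; it does not close the handle or alter how reconnection proceeds.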

      1. ZOOKEEPER-1436.patch
        13 kB
        Thawan Kooburat
      2. ZOOKEEPER-1436.patch
        13 kB
        Thawan Kooburat

        Activity

        Thawan Kooburat added a comment -

        Yes, we use this as a notification only. We don't change any behavior of the client.

        We mainly use this so that the master can stop servicing requests and let the slave take over. This event gives us sufficient information, since we also set the session timeout quite low (4 seconds). I agree that your solution would allow the client to implement further optimizations to retain the session.

        Mark Gius added a comment -

        I'd like to confirm my understanding of this patch, and propose a possible useful extension that may be better off in its own ticket.

        Confirm: this is purely a notification for the caller. The ZK client code is still looping around trying to reconnect just as before. In other words, if I ignore ZOO_TIMED_OUT_STATE events, the client behavior is unchanged.

        Possible extension: For dynamic ZooKeeper server clusters, it can be very useful to know, before a possible EXPIRE event, that the client is having trouble connecting. This gives the caller of the client the chance to regenerate their ZK endpoint list and attempt to recover their session on a new set of hosts. I suppose this might be exposed as some sort of ZOO_TIMEOUT_WARNING that fires when the client has been disconnected for SESSION_EXPIRE / 2 or some such thing.

        Otherwise, I really like this as it gives the caller more information with which to make informed decisions.
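
The proposed extension can be pictured as a second, earlier threshold on the same disconnection timer. The sketch below is an illustration under assumptions only: ZOO_TIMEOUT_WARNING is the name floated in this comment, not an existing client state, and `classify_disconnect` along with the `EX_*` values are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical severity levels for an ongoing disconnection. */
enum ex_disconnect_level {
    EX_DISCONNECT_OK,    /* disconnected for less than half the session timeout */
    EX_TIMEOUT_WARNING,  /* at least SESSION_EXPIRE / 2 has elapsed */
    EX_TIMED_OUT         /* the full session timeout has elapsed */
};

/* Maps how long the client has been disconnected onto a level.
 * Both arguments are in milliseconds. */
enum ex_disconnect_level classify_disconnect(int64_t disconnected_ms,
                                             int64_t session_timeout_ms)
{
    assert(disconnected_ms >= 0 && session_timeout_ms > 0);
    if (disconnected_ms >= session_timeout_ms)
        return EX_TIMED_OUT;
    if (disconnected_ms >= session_timeout_ms / 2)
        return EX_TIMEOUT_WARNING;
    return EX_DISCONNECT_OK;
}
```

With the 4-second session timeout mentioned in this thread, the warning level would be reached after 2 seconds of disconnection, leaving the caller the remaining 2 seconds to try a different endpoint list.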

        Thawan Kooburat added a comment -

        Updated patch file

        Thawan Kooburat added a comment -

        Updated patch file

        Patrick Hunt added a comment -

        Thawan, please regenerate the patch using "--no-prefix" with git; otherwise our automated testing isn't able to apply the patch. Thanks!

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12520346/ZOOKEEPER-1436.patch
        against trunk revision 1302736.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 9 new or modified tests.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1017//console

        This message is automatically generated.


          People

          • Assignee:
            Thawan Kooburat
            Reporter:
            Thawan Kooburat
          • Votes:
            0
            Watchers:
            2
