Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.3
    • Fix Version/s: 3.5.0
    • Component/s: server

      Description

      This improvement is in the bucket of making ZooKeeper work at a large scale. We are planning on having about a 1 million clients connect to a ZooKeeper ensemble through a set of 50-100 observers. Majority of these clients are read only - ie they do not do any updates or create ephemeral nodes.

      In ZooKeeper today, the client creates a session and the session creation is handled like any other update. In the above use case, the session create/drop workload can easily overwhelm an ensemble. The following is a proposal for a "local session", to support a larger number of connections.

      1. The idea is to introduce a new type of session - "local" session. A "local" session doesn't have a full functionality of a normal session.
      2. Local sessions cannot create ephemeral nodes.
      3. Once a local session is lost, you cannot re-establish it using the session-id/password. The session and its watches are gone for good.
      4. When a local session connects, the session info is only maintained on the zookeeper server (in this case, an observer) that it is connected to. The leader is not aware of the creation of such a session and there is no state written to disk.
      5. The pings and expiration is handled by the server that the session is connected to.

      With the above changes, we can make ZooKeeper scale to a much larger number of clients without making the core ensemble a bottleneck.

      In terms of API, there are two options that are being considered
      1. Let the client specify at the connect time which kind of session do they want.
      2. All sessions connect as local sessions and automatically get promoted to global sessions when they do an operation that requires a global session (e.g. creating an ephemeral node)

      Chubby took the approach of lazily promoting all sessions to global, but I don't think that would work in our case, where we want to keep sessions which never create ephemeral nodes as always local. Option 2 would make it more broadly usable but option 1 would be easier to implement.

      We are thinking of implementing option 1 as the first cut. There would be a client flag, IsLocalSession (much like the current readOnly flag) that would be used to determine whether to create a local session or a global session.

      1. ZOOKEEPER-1147.patch
        148 kB
        Mahadev konar
      2. ZOOKEEPER-1147.patch
        143 kB
        Thawan Kooburat
      3. ZOOKEEPER-1147.patch
        142 kB
        Thawan Kooburat
      4. ZOOKEEPER-1147.patch
        134 kB
        Thawan Kooburat
      5. ZOOKEEPER-1147.patch
        135 kB
        Thawan Kooburat
      6. ZOOKEEPER-1147.patch
        135 kB
        Thawan Kooburat
      7. ZOOKEEPER-1147.patch
        135 kB
        Thawan Kooburat
      8. ZOOKEEPER-1147.patch
        135 kB
        Thawan Kooburat
      9. ZOOKEEPER-1147.patch
        136 kB
        Thawan Kooburat

        Issue Links

          Activity

          Vishal Kathuria created issue -
          Patrick Hunt made changes -
          Field Original Value New Value
          Labels api-change api-change scaling
          Hide
          Vishal Kathuria added a comment -

          I am just starting working on this for 3.5, so seeking comments.
          I am defining the proposed changes in terms of the C library, but they can easily be translated into corresponding Java library changes. The only change to the API is that you have to request this new optimization while creating your client (since I wanted to not change the behavior for existing clients). The optimization will delay the creation of the persistent (or global) session until this client creates an ephemeral node for the first time.

          1. zoo_init will take a flag indicating delayed persistent session creation.
          2. Server will look at this flag and create a session that is local to the server and not send a request to the leader.
          3. Server will expose a new operation - upgradeToPersistent - that will upgrade a local session to a persistent session. This is the first time that the leader will become aware of this session (assuming the client is connected to a follower)
          4. If there is a zoo_create with ephemeral node, the client will send a upgradeToPersistent request to the server before sending the create ephemeral node request. This request would be async, so I don't expect it to delay the creation of ephemeral node much.

          We can extend the behavior to be similar to chubby by having the client or the server execute upgradeToPersistent after a random time interval.

          Show
          Vishal Kathuria added a comment - I am just starting working on this for 3.5, so seeking comments. I am defining the proposed changes in terms of the C library, but they can easily be translated into corresponding Java library changes. The only change to the API is that you have to request this new optimization while creating your client (since I wanted to not change the behavior for existing clients). The optimization will delay the creation of the persistent (or global) session until this client creates an ephemeral node for the first time. 1. zoo_init will take a flag indicating delayed persistent session creation. 2. Server will look at this flag and create a session that is local to the server and not send a request to the leader. 3. Server will expose a new operation - upgradeToPersistent - that will upgrade a local session to a persistent session. This is the first time that the leader will become aware of this session (assuming the client is connected to a follower) 4. If there is a zoo_create with ephemeral node, the client will send a upgradeToPersistent request to the server before sending the create ephemeral node request. This request would be async, so I don't expect it to delay the creation of ephemeral node much. We can extend the behavior to be similar to chubby by having the client or the server execute upgradeToPersistent after a random time interval.
          Nicolas Liochon made changes -
          Link This issue relates to HBASE-5843 [ HBASE-5843 ]
          Jay Shrauner made changes -
          Assignee Jay Shrauner [ shrauner ]
          Jay Shrauner made changes -
          Assignee Jay Shrauner [ shrauner ] Thawan Kooburat [ thawan ]
          Hide
          Jacky007 added a comment -

          This is great, create session is costly in zookeeper.
          if there is any progress on the feature.

          Show
          Jacky007 added a comment - This is great, create session is costly in zookeeper. if there is any progress on the feature.
          Hide
          Thawan Kooburat added a comment -

          This will be the next item that we are going to push upstream after ZOOKEEPER-1504 and ZOOKEEPER-1505

          Show
          Thawan Kooburat added a comment - This will be the next item that we are going to push upstream after ZOOKEEPER-1504 and ZOOKEEPER-1505
          Hide
          Jacky007 added a comment -

          [quote]
          1. zoo_init will take a flag indicating delayed persistent session creation.
          2. Server will look at this flag and create a session that is local to the server and not send a request to the leader.
          3. Server will expose a new operation - upgradeToPersistent - that will upgrade a local session to a persistent session. This is the first time that the leader will become aware of this session (assuming the client is connected to a follower)
          4. If there is a zoo_create with ephemeral node, the client will send a upgradeToPersistent request to the server before sending the create ephemeral node request. This request would be async, so I don't expect it to delay the creation of ephemeral node much.
          [quote]

          There is a problem I can see.
          If A is a local session, now A wants to create a ephemeral node. A sends upgradeToPersistent async and create /eph_1 to server 1, then A is disconnected with server 1, and try to renew session on server 2. If leader receives the message as follows: renew, upgradeToPersistent, create /eph_1,then A will first get a session timeout, and finally /eph_1 is created which is unexpected.

          Show
          Jacky007 added a comment - [quote] 1. zoo_init will take a flag indicating delayed persistent session creation. 2. Server will look at this flag and create a session that is local to the server and not send a request to the leader. 3. Server will expose a new operation - upgradeToPersistent - that will upgrade a local session to a persistent session. This is the first time that the leader will become aware of this session (assuming the client is connected to a follower) 4. If there is a zoo_create with ephemeral node, the client will send a upgradeToPersistent request to the server before sending the create ephemeral node request. This request would be async, so I don't expect it to delay the creation of ephemeral node much. [quote] There is a problem I can see. If A is a local session, now A wants to create a ephemeral node. A sends upgradeToPersistent async and create /eph_1 to server 1, then A is disconnected with server 1, and try to renew session on server 2. If leader receives the message as follows: renew, upgradeToPersistent, create /eph_1,then A will first get a session timeout, and finally /eph_1 is created which is unexpected.
          Hide
          Thawan Kooburat added a comment -

          What we actually implemented and being used in our production is slightly different than what was described previously.

          1. There is no new client-side interface. All session are created as local session by default (no CreateSession request is sent to the leader).
          2. When client try to create an ephemeral node. The follower/leader will upgrade this session to a global session by issuing CreateSession request before issuing create ephemeral node.
          3. The client retains the same sessionId when upgrading from local to global session. Each server use serverId as sessionId prefix.

          We found only a few cases that the application really need global session by default. We either instruct the programmer to create ephemeral node on connection establish or turn off this feature on the server-side if possible.

          Jacky007, I think the system still behave correctly in the scenario that you described, since that ephemeral node will eventually get removed after the session timeout.

          Show
          Thawan Kooburat added a comment - What we actually implemented and being used in our production is slightly different than what was described previously. 1. There is no new client-side interface. All session are created as local session by default (no CreateSession request is sent to the leader). 2. When client try to create an ephemeral node. The follower/leader will upgrade this session to a global session by issuing CreateSession request before issuing create ephemeral node. 3. The client retains the same sessionId when upgrading from local to global session. Each server use serverId as sessionId prefix. We found only a few cases that the application really need global session by default. We either instruct the programmer to create ephemeral node on connection establish or turn off this feature on the server-side if possible. Jacky007, I think the system still behave correctly in the scenario that you described, since that ephemeral node will eventually get removed after the session timeout.
          Hide
          Jacky007 added a comment -

          Yeah, this is the best way to implement this.

          1. There is no new client-side interface. All session are created as local session by default (no CreateSession request is sent to the leader).
          2. When client try to create an ephemeral node. The follower/leader will upgrade this session to a global session by issuing CreateSession request before issuing create ephemeral node.
          3. The client retains the same sessionId when upgrading from local to global session. Each server use serverId as sessionId prefix.

          But I think it has the problem I have mentioned.

          since that ephemeral node will eventually get removed after the session timeout

          ephemeral node /eph_1 will get removed finally, but for A, it will get a session expire firstly. It is different form the original semantics which says when I get a session expire the ephemeral nodes have been removed already.

          Show
          Jacky007 added a comment - Yeah, this is the best way to implement this. 1. There is no new client-side interface. All session are created as local session by default (no CreateSession request is sent to the leader). 2. When client try to create an ephemeral node. The follower/leader will upgrade this session to a global session by issuing CreateSession request before issuing create ephemeral node. 3. The client retains the same sessionId when upgrading from local to global session. Each server use serverId as sessionId prefix. But I think it has the problem I have mentioned. since that ephemeral node will eventually get removed after the session timeout ephemeral node /eph_1 will get removed finally, but for A, it will get a session expire firstly. It is different form the original semantics which says when I get a session expire the ephemeral nodes have been removed already.
          Hide
          Jacky007 added a comment -

          Did I misunderstanding your description, please let me know Thawan Kooburat

          Show
          Jacky007 added a comment - Did I misunderstanding your description, please let me know Thawan Kooburat
          Hide
          Thawan Kooburat added a comment -

          Jacky007, your understanding is correct. The semantic is broken only for A due to race condition, but the rest of the system will see a valid ordering of A events.

          I don't think this impose much of a problem since the ephemeral node will eventually get removed. Any applications (eg. leader election, distributed locking) will make progress as long as that property hold.

          Let me know if I overlooked some usage scenarios. We have migrated a lot of our internal applications to use deployment with local session and we haven't seen incompatibility issue caused by the specific problem that you pointed out yet.

          Show
          Thawan Kooburat added a comment - Jacky007 , your understanding is correct. The semantic is broken only for A due to race condition, but the rest of the system will see a valid ordering of A events. I don't think this impose much of a problem since the ephemeral node will eventually get removed. Any applications (eg. leader election, distributed locking) will make progress as long as that property hold. Let me know if I overlooked some usage scenarios. We have migrated a lot of our internal applications to use deployment with local session and we haven't seen incompatibility issue caused by the specific problem that you pointed out yet.
          Thawan Kooburat made changes -
          Attachment ZOOKEEPER-1147.patch [ 12564551 ]
          Hide
          Thawan Kooburat added a comment -

          The patch is very close to our internal branch. The main different is that we fixed ZOOKEEPER-1496 is a slightly different way and a lot of synchronized method was removed in our internal branch.

          Show
          Thawan Kooburat added a comment - The patch is very close to our internal branch. The main different is that we fixed ZOOKEEPER-1496 is a slightly different way and a lot of synchronized method was removed in our internal branch.
          Thawan Kooburat made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hide
          Thawan Kooburat added a comment -
          Show
          Thawan Kooburat added a comment - Review board: https://reviews.apache.org/r/8935/
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12564551/ZOOKEEPER-1147.patch
          against trunk revision 1427034.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 39 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1332//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1332//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1332//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12564551/ZOOKEEPER-1147.patch against trunk revision 1427034. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 39 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1332//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1332//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1332//console This message is automatically generated.
          Hide
          Mahadev konar added a comment -

          I started reviewing through the patch but I think we will need to add a little more details on the design to make further progress on this. There are quite a few cases that come up when we think about this, so a little more details on the design will go a long way.

          Thawan Kooburat can we add some comments on the design (dont want to make too laborious an effort) but something which explains the whole end to end design - things like:

          • when is the session created
          • does the create of ephemeral node wait on the return for create session (at the follower)
          • what happens if the create for session is sent at server A and the client disconnects to some other server B which ends up sending it again and then disconnects and connects back to server A.
          • what happens to the local session once the global session is created?

          Would you be able to write a short design for this (couple of paragraphs should suffice as a comment on the jira)?

          Show
          Mahadev konar added a comment - I started reviewing through the patch but I think we will need to add a little more details on the design to make further progress on this. There are quite a few cases that come up when we think about this, so a little more details on the design will go a long way. Thawan Kooburat can we add some comments on the design (dont want to make too laborious an effort) but something which explains the whole end to end design - things like: when is the session created does the create of ephemeral node wait on the return for create session (at the follower) what happens if the create for session is sent at server A and the client disconnects to some other server B which ends up sending it again and then disconnects and connects back to server A. what happens to the local session once the global session is created? Would you be able to write a short design for this (couple of paragraphs should suffice as a comment on the jira)?
          Hide
          Thawan Kooburat added a comment -

          Session creation flow:
          The flow for local session creation is similar to global session flow. As part of processing ConnectionRequest, a new sessionId is generated and added to local session tracker right away. The local create session request is then send down the request pipeline. Because this is marked as local session request, it wasn’t send to the leader and it only gets processed locally. Similar to global session, when the request reaches the final request processor, the session added to session tracker and ConnectResponse is sent back to the client to finish session initialization.

          Session Upgrading:
          Follower/Leader/Observer RequestProcessor intercepts a client request before entering the pipeline. If it is a request to create an ephemeral node, it will generate a create session request and push it into the pipeline before the create node request without waiting for the create session request to complete. The session is considered a global session right away on that machine, so a subsequent create ephemeral node request won’t trigger upgrade sequence

          Session Validation:
          When client try to reconnect, if the server found that it existing in the local session tracker, the client will reconnect right away. Otherwise, the follower/observer will have to send a revalidate packet to the leader to validate against global session.

          Answer to specific question:

          • When is the session created
            In a current implementation it will try to create a local session when processing ConnectRequest and when createSession request reach FinalRequestProcessor. The later one probably result in noop, but I am not sure about the reason behind this design.
          • What happens if the create for session is sent at server A and the client disconnects to some other server B which ends up sending it again and then disconnects and connects back to server A.
            When a client reconnects to B, its sessionId won’t exist in B’s local session tracker. So B will send validation packet. If CreateSession issued by A is committed before validation packet arrive the client will be able to connect. Otherwise, the client will get session expired because the quorum hasn’t know about this session yet.
            If the client also tries to connect back to A again, the session is already removed from local session tracker. So A will need to send a validation packet to the leader. The outcome should be the same as B depending on the timing of the request.

          Other note:
          From our experience so far, application that use local session will see higher rate of session expire. This is because automatic session failover is no longer available. So the application need to handle this graceful and recreate ZooKeeper handle.

          If the application don't want to do that, it can easily create an ephemeral node in order to upgrade to global session and get the automatic session failover like before.

          Show
          Thawan Kooburat added a comment - Session creation flow: The flow for local session creation is similar to global session flow. As part of processing ConnectionRequest, a new sessionId is generated and added to local session tracker right away. The local create session request is then send down the request pipeline. Because this is marked as local session request, it wasn’t send to the leader and it only gets processed locally. Similar to global session, when the request reaches the final request processor, the session added to session tracker and ConnectResponse is sent back to the client to finish session initialization. Session Upgrading: Follower/Leader/Observer RequestProcessor intercepts a client request before entering the pipeline. If it is a request to create an ephemeral node, it will generate a create session request and push it into the pipeline before the create node request without waiting for the create session request to complete. The session is considered a global session right away on that machine, so a subsequent create ephemeral node request won’t trigger upgrade sequence Session Validation: When client try to reconnect, if the server found that it existing in the local session tracker, the client will reconnect right away. Otherwise, the follower/observer will have to send a revalidate packet to the leader to validate against global session. Answer to specific question: When is the session created In a current implementation it will try to create a local session when processing ConnectRequest and when createSession request reach FinalRequestProcessor. The later one probably result in noop, but I am not sure about the reason behind this design. What happens if the create for session is sent at server A and the client disconnects to some other server B which ends up sending it again and then disconnects and connects back to server A. When a client reconnects to B, its sessionId won’t exist in B’s local session tracker. So B will send validation packet. If CreateSession issued by A is committed before validation packet arrive the client will be able to connect. Otherwise, the client will get session expired because the quorum hasn’t know about this session yet. If the client also tries to connect back to A again, the session is already removed from local session tracker. So A will need to send a validation packet to the leader. The outcome should be the same as B depending on the timing of the request. Other note: From our experience so far, application that use local session will see higher rate of session expire. This is because automatic session failover is no longer available. So the application need to handle this graceful and recreate ZooKeeper handle. If the application don't want to do that, it can easily create an ephemeral node in order to upgrade to global session and get the automatic session failover like before.
          Hide
          Mahadev konar added a comment -

          Thawan Kooburat this helps. Thanks for the information. I still have a couple of more questions:

          • Will a read only client always get a session expiration if a disconnect happens even though its not tried all the other servers?
          • Is the local session id the same as global session id when its created (I mean as the long value)? If its the same I think we have a problem with the shifting of client between servers..

          When a client reconnects to B, its sessionId won’t exist in B’s local session tracker. So B will send validation packet. If CreateSession issued by A is committed before validation packet arrive the client will be able to connect. Otherwise, the client will get session expired because the quorum hasn’t know about this session yet. If the client also tries to connect back to A again, the session is already removed from local session tracker. So A will need to send a validation packet to the leader. The outcome should be the same as B depending on the timing of the request.

          Show
          Mahadev konar added a comment - Thawan Kooburat this helps. Thanks for the information. I still have a couple of more questions: Will a read only client always get a session expiration if a disconnect happens even though its not tried all the other servers? Is the local session id the same as global session id when its created (I mean as the long value)? If its the same I think we have a problem with the shifting of client between servers.. When a client reconnects to B, its sessionId won’t exist in B’s local session tracker. So B will send validation packet. If CreateSession issued by A is committed before validation packet arrive the client will be able to connect. Otherwise, the client will get session expired because the quorum hasn’t know about this session yet. If the client also tries to connect back to A again, the session is already removed from local session tracker. So A will need to send a validation packet to the leader. The outcome should be the same as B depending on the timing of the request.
          Hide
          Thawan Kooburat added a comment -
          • Will a read only client always get a session expiration if a disconnect happens even though its not tried all the other servers?

          This could be future enhancement. Since we don’t change client library behavior, the client will see session expired when it connect to other servers within the quorum since the quorum don't know about this session. The optimization is that the client should try to reconnect back to the original server first if it know that it is a local session.

          • Is the local session id the same as global session id when its created?

          Yes, a session retains the same ID when it is upgraded from local session to global session. I think this is desirable. Can you elaborate why this may cause problem?

          Show
          Thawan Kooburat added a comment - Will a read only client always get a session expiration if a disconnect happens even though its not tried all the other servers? This could be future enhancement. Since we don’t change client library behavior, the client will see session expired when it connect to other servers within the quorum since the quorum don't know about this session. The optimization is that the client should try to reconnect back to the original server first if it know that it is a local session. Is the local session id the same as global session id when its created? Yes, a session retains the same ID when it is upgraded from local session to global session. I think this is desirable. Can you elaborate why this may cause problem?
          Hide
          Mahadev konar added a comment - - edited

          Yes, a session retains the same ID when it is upgraded from local session to global session. I think this is desirable. Can you elaborate why this may cause problem?

          Yes its desirable. Before I comment on what I think might be wrong, when does the server who has the local sessionid remove it from its data structures? Is it when it gets a create session in final request processor? Until then the session is a local session?

          Show
          Mahadev konar added a comment - - edited Yes, a session retains the same ID when it is upgraded from local session to global session. I think this is desirable. Can you elaborate why this may cause problem? Yes its desirable. Before I comment on what I think might be wrong, when does the server who has the local sessionid remove it from its data structures? Is it when it gets a create session in final request processor? Until then the session is a local session?
          Hide
          Thawan Kooburat added a comment -

          I think I know where you are leading to. When a server try to upgrade a local session, it remove the session from local session list and add it to the global session list right away (Even before the CreateSession request is sent to the leader). So this server will know about this session while it is being upgraded. I am not sure if this design is intentional or a result of other bug fixes. From the perspective of this issue, a better way would be to add additional state when session is being upgraded (like isClosing()).

          Show
          Thawan Kooburat added a comment - I think I know where you are leading to. When a server try to upgrade a local session, it remove the session from local session list and add it to the global session list right away (Even before the CreateSession request is sent to the leader). So this server will know about this session while it is being upgraded. I am not sure if this design is intentional or a result of other bug fixes. From the perspective of this issue, a better way would be to add additional state when session is being upgraded (like isClosing()).
          Hide
          Mahadev konar added a comment -

          Thawan Kooburat I thin the above scenario is ok. The only issue I think we have is the sensitive local sessions. Since we have had too many issues with disconnects and session expiry I think this might cause more issues than we already have. Is there something we can do here? I cant seem to find a way around it without doing client side changes.

          Show
          Mahadev konar added a comment - Thawan Kooburat I thin the above scenario is ok. The only issue I think we have is the sensitive local sessions. Since we have had too many issues with disconnects and session expiry I think this might cause more issues than we already have. Is there something we can do here? I cant seem to find a way around it without doing client side changes.
          Hide
          Thawan Kooburat added a comment -

          Can you elaborate on the concern regarding disconnect and session expiry that local session may introduce?

          From my current perspective, the correctness of system is still preserve with the introduction of local sessions. The problem is that client behavior is not optimal at the moment for local session. For example, if a client know that it is only a local session, it only need to retry to connect to the original server in the event of disconnect. If it is not successful, it can deliver session expire to application right away since no other servers will know about this session.

          Without this optimization, client probably going to get session expired without be able to restore the session if the root cause of the disconnect is just a temporal network failure or operation timeout on the server side. However, for local session which doesn't have ephemeral nodes associated with the session, restoring the session isn't critical for most applications.

          After more internal discussion regarding the session creation flow. We felt the current flow is probably not the right way to do it. However, the feature is very stable for several months now ,so making a major change may introduce a new bug. So we will let you guys decide which approach we want to take.

          Show
          Thawan Kooburat added a comment - Can you elaborate on the concern regarding disconnect and session expiry that local session may introduce? From my current perspective, the correctness of system is still preserve with the introduction of local sessions. The problem is that client behavior is not optimal at the moment for local session. For example, if a client know that it is only a local session, it only need to retry to connect to the original server in the event of disconnect. If it is not successful, it can deliver session expire to application right away since no other servers will know about this session. Without this optimization, client probably going to get session expired without be able to restore the session if the root cause of the disconnect is just a temporal network failure or operation timeout on the server side. However, for local session which doesn't have ephemeral nodes associated with the session, restoring the session isn't critical for most applications. After more internal discussion regarding the session creation flow. We felt the current flow is probably not the right way to do it. However, the feature is very stable for several months now ,so making a major change may introduce a new bug. So we will let you guys decide which approach we want to take.
          Hide
          Patrick Hunt added a comment -

          My initial thoughts after a quick first pass:

          We should add docs as part of the patch.

          What's the reason for having the config option to disable local session upgrade? Did you see a specific need for this?

          Given some of the discussion above (more frequent expiry), it seems to me that we should make sessions global by default and provide an option to ZooKeeper(...) to enable local session usage. On the flip side making sessions local by default will result in everyone seeing an improvement w/o need for code changes (client side I'm saying). That said, it seems that for most people global vs local session is not an issue. For the small subset of users where it is an issue they could enable the local session support via a parameter, a very simple code change for users on the client. Thoughts?

          The timeouts in the tests are too low in my experience. Use org.apache.zookeeper.test.ClientBase.CONNECTION_TIMEOUT unless there's some reason not to? (some of the hardware we run testing on is virtualized, can make things very slow)

          The change to watchertest - move this to it's own jira. We can can get that reviewed/committed quickly. Likely we need this for more than just trunk.

          Show
          Patrick Hunt added a comment - My initial thoughts after a quick first pass: We should add docs as part of the patch. What's the reason for having the config option to disable local session upgrade? Did you see a specific need for this? Given some of the discussion above (more frequent expiry), it seems to me that we should make sessions global by default and provide an option to ZooKeeper(...) to enable local session usage. On the flip side making sessions local by default will result in everyone seeing an improvement w/o need for code changes (client side I'm saying). That said, it seems that for most people global vs local session is not an issue. For the small subset of users where it is an issue they could enable the local session support via a parameter, a very simple code change for users on the client. Thoughts? The timeouts in the tests are too low in my experience. Use org.apache.zookeeper.test.ClientBase.CONNECTION_TIMEOUT unless there's some reason not to? (some of the hardware we run testing on is virtualized, can make things very slow) The change to watchertest - move this to it's own jira. We can can get that reviewed/committed quickly. Likely we need this for more than just trunk.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12564551/ZOOKEEPER-1147.patch
          against trunk revision 1441922.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 39 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1389//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1389//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1389//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12564551/ZOOKEEPER-1147.patch against trunk revision 1441922. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 39 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1389//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1389//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1389//console This message is automatically generated.
          Hide
          Thawan Kooburat added a comment -

          What's the reason for having the config option to disable local session upgrade? Did you see a specific need for this?

          In our largest deployment where we have a very large number of clients, we know that clients connecting via the observers is supposed to be local session only. So this is more like a safeguard against someone accidentally create lots of ephemeral nodes and global sessions.

          Given some of the discussion above (more frequent expiry), it seems to me that we should make sessions global by default and provide an option to ZooKeeper(...) to enable local session usage. On the flip side making sessions local by default will result in everyone seeing an improvement w/o need for code changes (client side I'm saying). That said, it seems that for most people global vs local session is not an issue. For the small subset of users where it is an issue they could enable the local session support via a parameter, a very simple code change for users on the client. Thoughts?

          Vast majority of our applications works fine when they migrated onto our deployment that have local session enabled by default. However, explaining the subtle difference between local and global session to users is bit tricky. So I think for general use, local session can be disabled by default.

          I agree that it will be nice to have an explicit interface to request for global or local session. We didn't have immediate need to add this because the current workload is sufficient and doesn't impact performance that much. The current mechanism also backward compatible with older clients.

          The timeouts in the tests are too low in my experience. Use org.apache.zookeeper.test.ClientBase.CONNECTION_TIMEOUT unless there's some reason not to? (some of the hardware we run testing on is virtualized, can make things very slow)

          We can increase that

          The change to watchertest - move this to it's own jira. We can can get that reviewed/committed quickly. Likely we need this for more than just trunk.

          Someone told me that there is a flag to disable random test order in junit. Anyway, I can move that into a separate jira.

          Show
          Thawan Kooburat added a comment - What's the reason for having the config option to disable local session upgrade? Did you see a specific need for this? In our largest deployment where we have a very large number of clients, we know that clients connecting via the observers is supposed to be local session only. So this is more like a safeguard against someone accidentally create lots of ephemeral nodes and global sessions. Given some of the discussion above (more frequent expiry), it seems to me that we should make sessions global by default and provide an option to ZooKeeper(...) to enable local session usage. On the flip side making sessions local by default will result in everyone seeing an improvement w/o need for code changes (client side I'm saying). That said, it seems that for most people global vs local session is not an issue. For the small subset of users where it is an issue they could enable the local session support via a parameter, a very simple code change for users on the client. Thoughts? Vast majority of our applications works fine when they migrated onto our deployment that have local session enabled by default. However, explaining the subtle difference between local and global session to users is bit tricky. So I think for general use, local session can be disabled by default. I agree that it will be nice to have an explicit interface to request for global or local session. We didn't have immediate need to add this because the current workload is sufficient and doesn't impact performance that much. The current mechanism also backward compatible with older clients. The timeouts in the tests are too low in my experience. Use org.apache.zookeeper.test.ClientBase.CONNECTION_TIMEOUT unless there's some reason not to? (some of the hardware we run testing on is virtualized, can make things very slow) We can increase that The change to watchertest - move this to it's own jira. We can can get that reviewed/committed quickly. Likely we need this for more than just trunk. Someone told me that there is a flag to disable random test order in junit. Anyway, I can move that into a separate jira.
          Hide
          Thawan Kooburat added a comment -

          nit: the current workload -> the current workaround

          Show
          Thawan Kooburat added a comment - nit: the current workload -> the current workaround
          Thawan Kooburat made changes -
          Link This issue depends on ZOOKEEPER-1648 [ ZOOKEEPER-1648 ]
          Hide
          Thawan Kooburat added a comment -

          Rebase and fix comments

          • Use ConcurrentMap instead of ConccurentHashMap
          • Move WatcherTest fix to a separate patch
          Show
          Thawan Kooburat added a comment - Rebase and fix comments Use ConcurrentMap instead of ConccurentHashMap Move WatcherTest fix to a separate patch
          Thawan Kooburat made changes -
          Attachment ZOOKEEPER-1147.patch [ 12569875 ]
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12569875/ZOOKEEPER-1147.patch
          against trunk revision 1446831.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 31 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1393//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12569875/ZOOKEEPER-1147.patch against trunk revision 1446831. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 31 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1393//console This message is automatically generated.
          Hide
          Thawan Kooburat added a comment -

          Fix patch file creation

          Show
          Thawan Kooburat added a comment - Fix patch file creation
          Thawan Kooburat made changes -
          Attachment ZOOKEEPER-1147.patch [ 12569877 ]
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12569877/ZOOKEEPER-1147.patch
          against trunk revision 1446831.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 36 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1395//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1395//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1395//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12569877/ZOOKEEPER-1147.patch against trunk revision 1446831. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 36 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1395//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1395//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1395//console This message is automatically generated.
          Hide
          Thawan Kooburat added a comment -

          use ClientBase.CONNECTION_TIMEOUT for connection timeout when possible.

          Only for LocalSessionRequestTest still use low connection timeout (4 seconds). This is because we need to wait until local session is expired by the server. Long session timeout will make this test run for a much longer time (2 mins vs 30 secs)

          Show
          Thawan Kooburat added a comment - use ClientBase.CONNECTION_TIMEOUT for connection timeout when possible. Only for LocalSessionRequestTest still use low connection timeout (4 seconds). This is because we need to wait until local session is expired by the server. Long session timeout will make this test run for a much longer time (2 mins vs 30 secs)
          Thawan Kooburat made changes -
          Attachment ZOOKEEPER-1147.patch [ 12570024 ]
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12570024/ZOOKEEPER-1147.patch
          against trunk revision 1447621.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 36 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1399//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1399//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1399//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12570024/ZOOKEEPER-1147.patch against trunk revision 1447621. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 36 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1399//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1399//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1399//console This message is automatically generated.
          Hide
          Thawan Kooburat added a comment -

          Upload the correct file, and fix additional find bugs warnings

          Show
          Thawan Kooburat added a comment - Upload the correct file, and fix additional find bugs warnings
          Thawan Kooburat made changes -
          Attachment ZOOKEEPER-1147.patch [ 12570056 ]
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12570056/ZOOKEEPER-1147.patch
          against trunk revision 1447982.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 36 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1401//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1401//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1401//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12570056/ZOOKEEPER-1147.patch against trunk revision 1447982. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 36 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1401//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1401//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1401//console This message is automatically generated.
          Thawan Kooburat made changes -
          Attachment ZOOKEEPER-1147.patch [ 12571753 ]
          Hide
          Thawan Kooburat added a comment -

          Regarding checkSession() and setOwner() for LearnerSessionTracker. After read through the code again, I think the original intention is to be able to reject in-flight requests when local session expired. Currently, it is possible for in-flight request for an expired local session to go to the leader. However, this should not affect the correctness of the system since local session can only modify persistent znodes.

          So I think we should keep those logics in LearnerSessionTracker, then we can fix it in a separate jira.

          Show
          Thawan Kooburat added a comment - Regarding checkSession() and setOwner() for LearnerSessionTracker. After read through the code again, I think the original intention is to be able to reject in-flight requests when local session expired. Currently, it is possible for in-flight request for an expired local session to go to the leader. However, this should not affect the correctness of the system since local session can only modify persistent znodes. So I think we should keep those logics in LearnerSessionTracker, then we can fix it in a separate jira.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12571753/ZOOKEEPER-1147.patch
          against trunk revision 1448007.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 36 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1416//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1416//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1416//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12571753/ZOOKEEPER-1147.patch against trunk revision 1448007. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 36 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1416//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1416//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1416//console This message is automatically generated.
          Hide
          Thawan Kooburat added a comment -

          Fixed a recently found bug that cause duplicate CreateSession request to be issued when local session upgrade was made by the learner. This doesn't affect the correctness but cause the quorum to see more txns than necessary.

          The root cause is that the request from the follower enter leader pipeline via ZooKeeperServer.submitRequest(). This cause the leader to perform session upgrade again.

          The fix is to submit request from the learner directly to PrepRequestProcessor. The session upgrade logic is moved to LeaderRequestProcessor. Only the requests that the leader received directly from the clients will go through this processor.

          Show
          Thawan Kooburat added a comment - Fixed a recently found bug that cause duplicate CreateSession request to be issued when local session upgrade was made by the learner. This doesn't affect the correctness but cause the quorum to see more txns than necessary. The root cause is that the request from the follower enter leader pipeline via ZooKeeperServer.submitRequest(). This cause the leader to perform session upgrade again. The fix is to submit request from the learner directly to PrepRequestProcessor. The session upgrade logic is moved to LeaderRequestProcessor. Only the requests that the leader received directly from the clients will go through this processor.
          Thawan Kooburat made changes -
          Attachment ZOOKEEPER-1147.patch [ 12572440 ]
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12572440/ZOOKEEPER-1147.patch
          against trunk revision 1448007.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 39 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1424//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1424//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1424//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12572440/ZOOKEEPER-1147.patch against trunk revision 1448007. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 39 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1424//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1424//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1424//console This message is automatically generated.
          Hide
          Thawan Kooburat added a comment -

          Is there any additional change request for this patch other than the latest set of requests from Edward?. So I can create another revision and hopefully we can commit it. I would like to close this JIRA before moving on to other pending JIRA which is assigned to me.

          Show
          Thawan Kooburat added a comment - Is there any additional change request for this patch other than the latest set of requests from Edward?. So I can create another revision and hopefully we can commit it. I would like to close this JIRA before moving on to other pending JIRA which is assigned to me.
          Hide
          Thawan Kooburat added a comment -
          • Rebase with current trunk (merge with dynamic config patch)
          • Address comments provided by Edward
          • Change error code due to conflict with trunk
          Show
          Thawan Kooburat added a comment - Rebase with current trunk (merge with dynamic config patch) Address comments provided by Edward Change error code due to conflict with trunk
          Thawan Kooburat made changes -
          Attachment ZOOKEEPER-1147.patch [ 12580207 ]
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12580207/ZOOKEEPER-1147.patch
          against trunk revision 1463329.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 39 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1465//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1465//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1465//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12580207/ZOOKEEPER-1147.patch against trunk revision 1463329. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 39 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1465//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1465//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1465//console This message is automatically generated.
          Gavin made changes -
          Link This issue depends on ZOOKEEPER-1648 [ ZOOKEEPER-1648 ]
          Gavin made changes -
          Link This issue depends upon ZOOKEEPER-1648 [ ZOOKEEPER-1648 ]
          Hide
          Raul Gutierrez Segales added a comment -

          Thawan Kooburat: how is this meant to work with a rolling update when enabling local sessions? If the leader doesn't have local sessions enabled then all writes from local sessions will fail with SessionExpired (because they'll be unknown to the leader) - right?

          The only way I could get a rolling update to work is with (something like) this:

          http://www.itevenworks.net/~rgs/patches/0001-Add-support-to-enable-disable-sessions-validations.patch

          I.e.: adding a way to temporarily disable sessions validations whilst you are enabling local sessions on the cluster.
          We should add some documentation about the right way to this. Thoughts? It would be nice to get this merged.

          Show
          Raul Gutierrez Segales added a comment - Thawan Kooburat : how is this meant to work with a rolling update when enabling local sessions? If the leader doesn't have local sessions enabled then all writes from local sessions will fail with SessionExpired (because they'll be unknown to the leader) - right? The only way I could get a rolling update to work is with (something like) this: http://www.itevenworks.net/~rgs/patches/0001-Add-support-to-enable-disable-sessions-validations.patch I.e.: adding a way to temporarily disable sessions validations whilst you are enabling local sessions on the cluster. We should add some documentation about the right way to this. Thoughts? It would be nice to get this merged.
          Hide
          Thawan Kooburat added a comment -

          Ah... Thanks for catching this. We never have to perform upgrade on ensemble that use local session or the upgrade was short enough that we didn't noticed the problem. As you proposed, the upgrade path will require 2 rolling restarts to enable this feature.

          The patch that you proposed should work. I think it should be possible to just modify PrepRequestProcessor so that we can use local session validation path without actually turning on local session. This should make the patch smaller.

          For additional safety, it might be better to use Java property to control this parameter. So the value survive server restarts in case there is a leader election during the upgrade.

          Show
          Thawan Kooburat added a comment - Ah... Thanks for catching this. We never have to perform upgrade on ensemble that use local session or the upgrade was short enough that we didn't noticed the problem. As you proposed, the upgrade path will require 2 rolling restarts to enable this feature. The patch that you proposed should work. I think it should be possible to just modify PrepRequestProcessor so that we can use local session validation path without actually turning on local session. This should make the patch smaller. For additional safety, it might be better to use Java property to control this parameter. So the value survive server restarts in case there is a leader election during the upgrade.
          Thawan Kooburat made changes -
          Link This issue is depended upon by ZOOKEEPER-1607 [ ZOOKEEPER-1607 ]
          Hide
          Raul Gutierrez Segales added a comment -

          fwiw, we are using this patch in prod at Twitter so it would be awesome to have this merged. Besides what I mentioned in my previous comment (having a way to do rolling upgrades to enable local sessions) is there anything else that's left to get this merged?

          Show
          Raul Gutierrez Segales added a comment - fwiw, we are using this patch in prod at Twitter so it would be awesome to have this merged. Besides what I mentioned in my previous comment (having a way to do rolling upgrades to enable local sessions) is there anything else that's left to get this merged?
          Hide
          Flavio Junqueira added a comment -

          A few people have reviewed this patch in the past, so if anyone could +1 it, we can consider committing it. Otherwise, I'll review it as soon as I have a chance. I'd like to have a look at it anyway before it gets in, but one or more +1 from Patrick Hunt, Mahadev konar, and Edward Ribeiro would be helpful.

          Show
          Flavio Junqueira added a comment - A few people have reviewed this patch in the past, so if anyone could +1 it, we can consider committing it. Otherwise, I'll review it as soon as I have a chance. I'd like to have a look at it anyway before it gets in, but one or more +1 from Patrick Hunt , Mahadev konar , and Edward Ribeiro would be helpful.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12580207/ZOOKEEPER-1147.patch
          against trunk revision 1503101.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 39 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1517//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1517//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1517//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12580207/ZOOKEEPER-1147.patch against trunk revision 1503101. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 39 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1517//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1517//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1517//console This message is automatically generated.
          Hide
          Raul Gutierrez Segales added a comment -

          ping? any progress to get this merged?

          Show
          Raul Gutierrez Segales added a comment - ping? any progress to get this merged?
          Hide
          Mahadev konar added a comment -

          Thawan/others,
          Sorry abt the delay on this patch. Just spent sometime reviewing the patch. It looks good to me. There is minor issues with applying the patch - just fixed that. We can run through hudson once and commit.

          Show
          Mahadev konar added a comment - Thawan/others, Sorry abt the delay on this patch. Just spent sometime reviewing the patch. It looks good to me. There is minor issues with applying the patch - just fixed that. We can run through hudson once and commit.
          Mahadev konar made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Hide
          Mahadev konar added a comment -

          Minor conflict with the current patch fails on applying with QuorumPeerMain.java - attaching a new one which fixes the conflict.

          Show
          Mahadev konar added a comment - Minor conflict with the current patch fails on applying with QuorumPeerMain.java - attaching a new one which fixes the conflict.
          Mahadev konar made changes -
          Attachment ZOOKEEPER-1147.patch [ 12607325 ]
          Hide
          Mahadev konar added a comment -

          The difference in size with the prev patch is mostly because of using git to generate the patch.

          Show
          Mahadev konar added a comment - The difference in size with the prev patch is mostly because of using git to generate the patch.
          Mahadev konar made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hide
          Mahadev konar added a comment -

          Flavio Junqueira looks like the patch is ready to get in. You want to look through before we commit?

          Show
          Mahadev konar added a comment - Flavio Junqueira looks like the patch is ready to get in. You want to look through before we commit?
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12607325/ZOOKEEPER-1147.patch
          against trunk revision 1530166.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 33 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1655//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1655//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1655//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12607325/ZOOKEEPER-1147.patch against trunk revision 1530166. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 33 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1655//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1655//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1655//console This message is automatically generated.
          Hide
          Flavio Junqueira added a comment -

          This looks great, +1.

          Show
          Flavio Junqueira added a comment - This looks great, +1.
          Thawan Kooburat made changes -
          Link This issue is depended upon by ZOOKEEPER-1787 [ ZOOKEEPER-1787 ]
          Hide
          Thawan Kooburat added a comment -

          Committed to trunk. Thanks a lot everyone.

          Show
          Thawan Kooburat added a comment - Committed to trunk. Thanks a lot everyone.
          Hide
          Flavio Junqueira added a comment -

          Committed revision: 1530781

          Show
          Flavio Junqueira added a comment - Committed revision: 1530781
          Hide
          Mahadev konar added a comment -

          Nice!! Thanks Thawan Kooburat Incredible work!

          Show
          Mahadev konar added a comment - Nice!! Thanks Thawan Kooburat Incredible work!
          Mahadev konar made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Hide
          Patrick Hunt added a comment -

          Kudos! Great feature addition.

          Show
          Patrick Hunt added a comment - Kudos! Great feature addition.
          Hide
          Hudson added a comment -

          SUCCESS: Integrated in ZooKeeper-trunk #2084 (See https://builds.apache.org/job/ZooKeeper-trunk/2084/)
          ZOOKEEPER-1147. Add support for local sessions (Jay Shrauner, thawan via thawan) (thawan: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1530781)

          • /zookeeper/trunk/CHANGES.txt
          • /zookeeper/trunk/src/c/include/zookeeper.h
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/KeeperException.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/cli/CreateCommand.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/FinalRequestProcessor.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/PrepRequestProcessor.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/Request.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/SessionTracker.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/SessionTrackerImpl.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/TraceFormatter.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/CommitProcessor.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/FollowerRequestProcessor.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/LeaderRequestProcessor.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/LeaderSessionTracker.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/LeaderZooKeeperServer.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/Learner.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/LearnerSessionTracker.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/LearnerZooKeeperServer.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/LocalSessionTracker.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/ObserverRequestProcessor.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/ProposalRequestProcessor.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeerConfig.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeerMain.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/QuorumZooKeeperServer.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/ReadOnlyZooKeeperServer.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/UpgradeableSessionTracker.java
          • /zookeeper/trunk/src/java/test/org/apache/zookeeper/server/PrepRequestProcessorTest.java
          • /zookeeper/trunk/src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java
          • /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/DuplicateLocalSessionUpgradeTest.java
          • /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/LeaderSessionTrackerTest.java
          • /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/LocalSessionRequestTest.java
          • /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/LocalSessionsOnlyTest.java
          • /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/QuorumBase.java
          • /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/QuorumTest.java
          • /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/QuorumUtil.java
          • /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/ReadOnlyModeTest.java
          • /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/SessionTrackerCheckTest.java
          • /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/SessionUpgradeTest.java
          Show
          Hudson added a comment - SUCCESS: Integrated in ZooKeeper-trunk #2084 (See https://builds.apache.org/job/ZooKeeper-trunk/2084/ ) ZOOKEEPER-1147 . Add support for local sessions (Jay Shrauner, thawan via thawan) (thawan: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1530781 ) /zookeeper/trunk/CHANGES.txt /zookeeper/trunk/src/c/include/zookeeper.h /zookeeper/trunk/src/java/main/org/apache/zookeeper/KeeperException.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/cli/CreateCommand.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/FinalRequestProcessor.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/PrepRequestProcessor.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/Request.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/SessionTracker.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/SessionTrackerImpl.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/TraceFormatter.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/CommitProcessor.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/FollowerRequestProcessor.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/LeaderRequestProcessor.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/LeaderSessionTracker.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/LeaderZooKeeperServer.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/Learner.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/LearnerSessionTracker.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/LearnerZooKeeperServer.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/LocalSessionTracker.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/ObserverRequestProcessor.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/ProposalRequestProcessor.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeerConfig.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeerMain.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/QuorumZooKeeperServer.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/ReadOnlyZooKeeperServer.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/UpgradeableSessionTracker.java /zookeeper/trunk/src/java/test/org/apache/zookeeper/server/PrepRequestProcessorTest.java /zookeeper/trunk/src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/DuplicateLocalSessionUpgradeTest.java /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/LeaderSessionTrackerTest.java /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/LocalSessionRequestTest.java /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/LocalSessionsOnlyTest.java /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/QuorumBase.java /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/QuorumTest.java /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/QuorumUtil.java /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/ReadOnlyModeTest.java /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/SessionTrackerCheckTest.java /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/SessionUpgradeTest.java
          Raul Gutierrez Segales made changes -
          Link This issue relates to ZOOKEEPER-1851 [ ZOOKEEPER-1851 ]

            People

            • Assignee:
              Thawan Kooburat
              Reporter:
              Vishal Kathuria
            • Votes:
              3 Vote for this issue
              Watchers:
              15 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Estimated:
                Original Estimate - 840h
                840h
                Remaining:
                Remaining Estimate - 840h
                840h
                Logged:
                Time Spent - Not Specified
                Not Specified

                  Development