ZOOKEEPER-1740

Zookeeper 3.3.4 loses ephemeral nodes under stress

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Not a Problem
    • Affects Version/s: 3.3.4
    • Fix Version/s: None
    • Component/s: server
    • Labels: None

      Description

      In the current ZooKeeper behavior, session expiration and deletion of that session's ephemeral nodes are not performed as a single atomic operation.

      In Kafka, the side effect of this behavior in certain corner cases is that ephemeral nodes can be lost even though the client's current session has not expired. The sequence of events that leads to the lost ephemeral nodes is as follows:

      1. The session expires on the client. The client assumes the ephemeral nodes have been deleted, so it establishes a new session with ZooKeeper and tries to re-create them.
      2. When it tries to re-create the ephemeral node, ZooKeeper returns a NodeExists error code. This error is legitimate during a session disconnect event (since zkclient automatically retries the operation and then raises a NodeExists error). Also, by design, the Kafka server never has multiple ZooKeeper clients create the same ephemeral node, so it assumes the NodeExists is normal.
      3. A few seconds later, ZooKeeper deletes that ephemeral node. From the client's perspective, its ephemeral node is gone even though it holds a new, valid session.

      This behavior is triggered by very long fsync operations on the ZooKeeper leader. When the leader wakes up from such a long fsync, it has several sessions to expire, and the time between session expiration and ephemeral node deletion is magnified. Between these two operations, a ZooKeeper client can issue an ephemeral node creation that appears to have succeeded, but the leader later deletes the ephemeral node, leading to permanent ephemeral node loss from the client's perspective.
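
      The window between expiration and deletion can be illustrated with a toy model. The sketch below is not ZooKeeper code and contacts no server; it only replays the order of events from the list above using plain shell variables. The session ids s1 and s2 and all function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Toy model of the event order; no ZooKeeper involved.
# All names (s1, s2, node_owner, ...) are hypothetical.

node_owner=""       # session id owning the ephemeral node ("" = node absent)
client_session=""   # session id the client currently holds

log() { echo "$*"; }

# Attempt to create the ephemeral node on behalf of session $1.
create_ephemeral() {
  if [ -n "$node_owner" ]; then
    log "create by $1: NodeExists (owner=$node_owner)"
    return 1
  fi
  node_owner="$1"
  log "create by $1: OK"
}

# Server-side cleanup of the node owned by expired session $1.
delete_nodes_of() {
  if [ "$node_owner" = "$1" ]; then
    node_owner=""
    log "server: deleted node of expired session $1"
  fi
}

node_state() {
  if [ -n "$node_owner" ]; then
    echo "present(owner=$node_owner)"
  else
    echo "absent"
  fi
}

run_timeline() {
  node_owner=""
  create_ephemeral s1                      # initial registration
  client_session=s1
  log "server: session s1 expired (node not yet deleted)"
  client_session=s2                        # client opens a new session
  create_ephemeral s2 || log "client: treats NodeExists as normal"
  delete_nodes_of s1                       # delayed cleanup of the old session
  log "final: session=$client_session node=$(node_state)"
}
```

      Calling run_timeline prints the five events in order and ends with "final: session=s2 node=absent": the client holds a valid session, yet its node is gone.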

      Thread from zookeeper mailing list: http://zookeeper.markmail.org/search/?q=Zookeeper+3.3.4#query:Zookeeper%203.3.4%20date%3A201307%20+page:1+mid:zma242a2qgp6gxvx+state:results

      The way to reproduce this behavior is as follows:

      1. Bring up a ZooKeeper 3.3.4 cluster and use zkclient to create several sessions with ephemeral nodes on it. Make sure the session expiration callback is implemented and that it re-registers the ephemeral node.
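      2. Run the following script on the ZooKeeper leader, passing the process ID of the ZooKeeper server as the first argument:
      # Repeatedly suspend the server for 8 seconds, then let it run for 60 seconds.
      while true
      do
        kill -STOP "$1"
        sleep 8
        kill -CONT "$1"
        sleep 60
      done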
      3. Run another script to check for existence of ephemeral nodes.

      The checking script shows that ZooKeeper loses the ephemeral nodes while the clients still have a valid session.
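
      The checking script from step 3 is not included in this report. A minimal sketch of one possible checker follows. It assumes that zkCli.sh is on the PATH, that its stat command prints a line of the form "ephemeralOwner = 0x...", and that the server address and node paths are supplied by the user. ZKCLI, ZK_SERVER and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: ZKCLI, ZK_SERVER and all function names are assumptions.

ZKCLI="${ZKCLI:-zkCli.sh}"                 # assumed to be on PATH
ZK_SERVER="${ZK_SERVER:-localhost:2181}"   # assumed server address

# Fetch the stat output for one znode. Override this function to test offline.
zk_stat() {
  "$ZKCLI" -server "$ZK_SERVER" stat "$1" 2>&1
}

# Read stat output on stdin and print the ephemeralOwner value,
# or nothing if the field is absent (for example, the node is missing).
ephemeral_owner() {
  sed -n 's/^ephemeralOwner = \(0x[0-9a-fA-F]*\).*$/\1/p' | head -n 1
}

# Report each path as PRESENT, PERSISTENT or MISSING.
# Returns 1 if at least one path is missing, 0 otherwise.
check_nodes() {
  local missing=0 path owner
  for path in "$@"; do
    owner="$(zk_stat "$path" | ephemeral_owner)"
    if [ -z "$owner" ]; then
      echo "MISSING $path"
      missing=1
    elif [ "$owner" = "0x0" ]; then
      echo "PERSISTENT $path"              # owner 0x0: not an ephemeral node
    else
      echo "PRESENT $path owner=$owner"
    fi
  done
  return "$missing"
}
```

      Running check_nodes periodically with the paths registered in step 1 would flag a node as MISSING once it has been deleted. Comparing the reported owner with the client's current session id would additionally show whether a surviving node still belongs to the old session.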

          Activity

          Saulius Zemaitaitis added a comment -

          Has anyone come up with a workaround for this while it is being fixed? Also, what about 3.4.x?

          Patrick Hunt added a comment -

          Neha Narkhede, any updates on this? The last I see on the mailing list thread is a comment from you that seems to indicate you have a fix:

          I haven't had the chance to reproduce this on zookeeper 3.4, but I was able
          to reproduce this on 3.3.4. Filed
          https://issues.apache.org/jira/browse/ZOOKEEPER-1740. I will try to see if
          the fix works.
          
          Flavio Junqueira added a comment -

          I'd like to know if this same problem happens in the 3.4 branch.

          Germán Blanco added a comment -

          Shouldn't this be a Blocker?

          Flavio Junqueira added a comment -

          This issue has been created for the 3.3 branch and there is no further evidence that it happens in the 3.4 branch. In fact, Neha was not able to reproduce it in the 3.4 branch according to the e-mail thread posted above. I'm actually thinking about closing this issue, since users should actually consider upgrading to the 3.4 branch. Short answer is that I don't think this is a blocker.

          Flavio Junqueira added a comment -

          I'm marking this jira as "not a problem" based on my previous comment. If anyone disagrees, please reopen it.

          Germán Blanco added a comment -

          I think ZOOKEEPER-1809 is related, or maybe even a duplicate of ZOOKEEPER-1740. Could you please check?

          Shaun Senecal added a comment -

          It's possible that ZOOKEEPER-1809 is a duplicate of ZOOKEEPER-1740, but Flavio Junqueira indicated the issue wasn't reproducible on the 3.4 branch (1809 was reproduced on 3.4.5).

          This ticket indicates that the fix version is 3.4.6, so I will see if I can build that locally and reproduce the problem.

          Germán Blanco added a comment -

          This was resolved as "Not a Problem" and there is no solution implemented in branch 3.4.
          You might have been able to reproduce it in the 3.4 branch with your test.

          Shaun Senecal added a comment -

          My understanding is that this issue was closed as "Not a problem" because it wasn't reproducible under 3.4 at the time. However, the behaviour seems to be the same, and I am able to consistently reproduce it under 3.4.5 using the app in ZOOKEEPER-1809. Perhaps this ticket can remain closed, as it pertains to 3.3, and any further investigation can happen on ZOOKEEPER-1809?


            People

            • Assignee: Neha Narkhede
            • Reporter: Neha Narkhede
            • Votes: 2
            • Watchers: 6
