Solr / SOLR-3582

Our ZooKeeper watchers respond to session events as if they are change events, creating undesirable side effects.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0-BETA, 6.0
    • Component/s: None
    • Labels: None

      Description

      As brought up by Trym R. Møller on the mailing list, we are responding to watcher events about connection/disconnection as if they were notifications about node changes.

      http://www.lucidimagination.com/search/document/e13ef390b88eeee2
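
      For context, a minimal sketch of the problematic pattern (hypothetical class name and znode path, not the actual Solr code). ZooKeeper delivers session events such as Disconnected/SyncConnected with type EventType.None to the same Watcher.process() callback as node-change events, so a watcher that reacts unconditionally treats every reconnect as a node change:

      import org.apache.zookeeper.WatchedEvent;
      import org.apache.zookeeper.Watcher;
      import org.apache.zookeeper.ZooKeeper;

      // Hypothetical illustration of the bug: every event, including
      // session connect/disconnect notifications, is handled as if a
      // node had changed.
      class NaiveElectionWatcher implements Watcher {
        private final ZooKeeper zk;

        NaiveElectionWatcher(ZooKeeper zk) {
          this.zk = zk;
        }

        @Override
        public void process(WatchedEvent event) {
          // BUG: session events (EventType.None) land here too, so each
          // disconnect/reconnect re-runs the "node changed" logic and
          // registers yet another watcher.
          try {
            zk.exists("/some/watched/znode", this); // re-register the watch
            // ... react as if the node changed ...
          } catch (Exception e) {
            // error handling elided in this sketch
          }
        }
      }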

        Activity

        Mark Miller added a comment -

        I'm unsure of the proposed solution on the mailing list.

        On a connection event, the watch will fire and we will skip doing anything; but since watches are one-time triggers, won't we then have no watch in place?

        Mark Miller added a comment -

        Never mind - found confirmation elsewhere that session events do not remove the watcher. The ZooKeeper programming guide does not appear very clear on this when it talks about watches being one time triggers.
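
        In other words (a sketch of the semantics, not Solr code): in the Java client, session events arrive with type EventType.None and only carry a KeeperState; they do not consume the watch, whereas a real node event does:

        import org.apache.zookeeper.WatchedEvent;
        import org.apache.zookeeper.Watcher;
        import org.apache.zookeeper.Watcher.Event.EventType;

        // Sketch of how the two kinds of events differ for a registered watch.
        class WatchSemantics implements Watcher {
          @Override
          public void process(WatchedEvent event) {
            if (event.getType() == EventType.None) {
              // Session event (Disconnected, SyncConnected, Expired, ...).
              // The watch is NOT consumed; nothing to re-register.
              return;
            }
            // Node event (NodeCreated, NodeDeleted, NodeDataChanged, ...).
            // The one-time trigger has now fired; re-register if still interested.
          }
        }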

        Trym Møller added a comment - edited

        Debugging the provided test shows this behaviour as well: the watch is kept even though it is notified about disconnection and sync-connection, and the watch only "stops" after it has been notified about a node change.

        As Mark writes on the mailing list, there might be other ZooKeeper watchers in Solr that add new watchers on reconnect.

        If we agree on this ZooKeeper watcher behaviour, then I think the provided bug fix solves the problem in the LeaderElector, and it can be committed to svn independently of problems with other watchers. Is there anything else I can do to show that the provided solution is the right one?

        Best regards Trym
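
        A sketch of the shape of such a fix (hypothetical names; the actual patch is the one provided on this issue): the election watcher ignores session events, relying on the watch surviving them, and only reacts and re-registers on real node events:

        import org.apache.zookeeper.WatchedEvent;
        import org.apache.zookeeper.Watcher;
        import org.apache.zookeeper.Watcher.Event.EventType;
        import org.apache.zookeeper.ZooKeeper;

        // Hypothetical fixed watcher: session events are ignored, so
        // duplicate watchers no longer pile up across reconnect cycles.
        class GuardedElectionWatcher implements Watcher {
          private final ZooKeeper zk;
          private final String watchedPath; // znode the election watches

          GuardedElectionWatcher(ZooKeeper zk, String watchedPath) {
            this.zk = zk;
            this.watchedPath = watchedPath;
          }

          @Override
          public void process(WatchedEvent event) {
            if (event.getType() == EventType.None) {
              // Connection state change only; the watch is still in place.
              return;
            }
            try {
              // A real node event consumed the watch: handle it, re-register.
              zk.exists(watchedPath, this);
              // ... run the leader-election check ...
            } catch (Exception e) {
              // error handling elided in this sketch
            }
          }
        }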

        Per Steffensen added a comment -

        Trym didn't mention it, but this is not just a negligible problem that will never surface in real-world usage. We actually discovered it during one of our performance/endurance tests of our real-world application, in a real-world setup and with a real-world (high) workload. We run numerous Solr instances in a SolrCloud cluster, with numerous collections, each having about 25 slices, each slice with 2 shards (one replica per slice).

        During the test, Solr instances lose their ZK connection (probably due to overly long GC pauses) and reconnect, resulting in more watchers. The next time a dis-/re-connect to ZK happens, they receive many watcher events, resulting in even more watchers for the time after that. Seen from the outside, this breaks our performance/endurance test: at first things start to slow down, and eventually the JVMs break down with OOM errors. The problem is self-reinforcing, because in every iteration the garbage collector has to spend more time collecting watchers (twice as many as last time), increasing the probability of new ZK timeouts, and more time has to be spent creating new watchers (again twice as many as last time).

        I think you should commit the fix, basically because it makes a (our) real-world application able to run for a long time; it wasn't able to before. Commit the fix not so much for our sake, because we are using our own build of Solr (incl. this fix, other fixes, and a nice implementation of optimistic locking etc. (SOLR-3173, SOLR-3178, etc.)) anyway, but to save others (who might also be among the "first movers" using Solr 4.0 for high-scale real-world applications) from having to spend weeks tracking down the essence of this issue and making a fix.

        If you think this observation/fix should lead to a walk-through of the code, to check whether watchers are used undesirably in other places, and maybe even to a more generic fix, I would endorse such a task. But for now I urge you to commit, to save others from weeks of debugging. If/when you come to a better or more generic solution, you can always refactor.

        Regards, Per Steffensen

        Mark Miller added a comment -

        Thanks Trym!

        Hoss Man added a comment -

        hoss20120711-manual-post-40alpha-change

        Jessica Cheng Mallet added a comment -

        "Never mind - found confirmation elsewhere that session events do not remove the watcher. The ZooKeeper programming guide does not appear very clear on this when it talks about watches being one time triggers."

        Mark, is this true even if the event is "Expired"? I don't know how ZooKeeper works in detail, but I'd be surprised if it kept watchers across sessions (how could it figure out which re-established session is related to the expired session?).

        Mark Miller added a comment -

        If the session is expired you have to create all new watchers - you also have to create a new ZooKeeper client instance. Watchers do not span sessions.
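
        For illustration, a hedged sketch (hypothetical helper and connect string, not SolrZkClient's actual reconnect logic): on Expired the old handle is dead, so a new ZooKeeper instance is created and every watch must be re-established against the new session:

        import java.io.IOException;
        import org.apache.zookeeper.WatchedEvent;
        import org.apache.zookeeper.Watcher;
        import org.apache.zookeeper.Watcher.Event.EventType;
        import org.apache.zookeeper.Watcher.Event.KeeperState;
        import org.apache.zookeeper.ZooKeeper;

        // Hypothetical connection watcher: watches never span sessions,
        // so an expired session needs a brand-new client and fresh watches.
        class SessionExpiryWatcher implements Watcher {
          private volatile ZooKeeper zk;
          private final String connectString = "zk1:2181,zk2:2181"; // assumed

          @Override
          public void process(WatchedEvent event) {
            if (event.getType() != EventType.None) {
              return; // only session events are handled here
            }
            if (event.getState() == KeeperState.Expired) {
              try {
                // The expired handle is unusable: build a new client ...
                zk = new ZooKeeper(connectString, 30000, this);
                // ... and re-create every watch against the new session.
                reRegisterAllWatches(zk); // hypothetical helper
              } catch (IOException e) {
                // retry/backoff elided in this sketch
              }
            }
          }

          private void reRegisterAllWatches(ZooKeeper newZk) {
            // placeholder: each component re-establishes its own watches
          }
        }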

        Jessica Cheng Mallet added a comment -

        Thanks Mark, that's what I thought. I think we might need to do some kind of isLeader clean-up if we get an Expired event here (in the patch). I suspect this is related to some sort of recovery race/error I'm seeing, but I'm going to open a new jira with more details. Thanks.

        Jessica Cheng Mallet added a comment -

        Sorry, "if we get an Expired here with certain conditions". That was really unclear. I'll just open a new jira with more info...


          People

          • Assignee: Mark Miller
          • Reporter: Mark Miller
          • Votes: 1
          • Watchers: 5
