ZooKeeper / ZOOKEEPER-1091

When the chrootPath of ClientCnxn is not null, the ZooKeeper client has watches set, and ClientCnxn.primeConnection(SelectionKey k) runs again for some reason, the wrong watcher clientPath is sent to the server

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: 3.3.3
    • Fix Version/s: 3.4.0
    • Component/s: java client
    • Labels:
      None
    • Environment:

      Linux version 2.6.18-194.el5 (mockbuild@builder10.centos.org) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)) #1 SMP Fri Apr 2 14:58:14 EDT 2010

      Description

      If the chrootPath of ClientCnxn is not null and the ZooKeeper client has watches registered, then whenever the client has to call primeConnection again (for example, after the ZooKeeper server is stopped and restarted), it re-sends the watcher paths to the server. The paths it sends are wrong: they should be server paths, but client paths are sent instead. Once the wrong watcher clientPath has been sent to the server, the following exception occurs:

      2011-06-10 04:33:16,935 [pool-2-thread-30-SendThread(DB1-6:2181)] WARN org.apache.zookeeper.ClientCnxn - Session 0x5302c4403a30232 for server DB1-6/192.168.1.6:2181, unexpected error, closing socket connection and attempting reconnect
      java.lang.StringIndexOutOfBoundsException: String index out of range: -6
      at java.lang.String.substring(String.java:1937)
      at java.lang.String.substring(String.java:1904)
      at org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:794)
      at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:881)
      at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1130)
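The failure mode described above can be illustrated with a small standalone sketch. This is a simplified model and not the actual ClientCnxn source: the class name `ChrootStripDemo` and the helper `stripChroot` are hypothetical, and the example assumes the client turns a server path into a client path by cutting off the chroot prefix without checking lengths. If a watch was re-registered under the client path, the event path that comes back can be shorter than the chroot, and `substring` throws.

```java
// Simplified model of chroot stripping on the client; NOT the actual ClientCnxn source.
final class ChrootStripDemo {

    // Hypothetical helper: converts a server-side path into a client-side path
    // by cutting off the chroot prefix, with no length check.
    static String stripChroot(String chrootPath, String serverPath) {
        if (serverPath.equals(chrootPath)) {
            return "/";
        }
        // Throws StringIndexOutOfBoundsException when serverPath is shorter than chrootPath.
        return serverPath.substring(chrootPath.length());
    }

    public static void main(String[] args) {
        String chroot = "/blah/version1";

        // Normal case: the event carries a full server path.
        System.out.println(stripChroot(chroot, "/blah/version1/node")); // prints /node

        // Faulty case: the watch was registered with the client path, so the
        // event path is shorter than the chroot.
        try {
            stripChroot(chroot, "/node");
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

The exact exception message depends on the Java version, so the sketch prints it rather than assuming a particular text.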

        Activity

        Daniel Lord added a comment -

        I am running into this issue now as well. It appears to me that this is only a problem when running more than one zookeeper client per application instance, where the clients have a chroot path that is multiple znodes deep. Normally that is not a problem, since I share the zookeeper client across many threads/services in the application. However, I do have one use case where an administrative task needs to span multiple zookeeper "clusters" that are partitioned by chroot.

        For example, I have two clusters of application nodes running under different zookeeper chroots, say "/blah/version1" and "/blah/version2". There are many application nodes in the "version1" cluster and many in the "version2" cluster. Each cluster uses chroot to logically partition itself, and it works great. Each application node holds a single zookeeper client.

        I also have an administrative node that is responsible for monitoring both of these clusters. This node has two zookeeper clients, one for each chroot'd cluster. This works perfectly so long as I am not disconnected from the ensemble. As soon as I get disconnected, I start getting a flood of this StringIndexOutOfBoundsException.

        I can easily cause this to happen by having more than one zookeeper client in a single process, where both zookeeper clients use a chroot path that is multiple levels deep. If I connect to a locally running standalone zookeeper server, I get this exception as soon as I stop and restart the server. I have no problems with this test if I run only a single zookeeper client, or if I run with chroot paths that are only a single znode deep.

        Thomas Koch added a comment -

        This sounds like a duplicate of ZOOKEEPER-961.

        Daniel Lord added a comment -

        Ooof, it's been a long day. The fact that there are multiple znodes in the chroot is inconsequential. The StringIndexOutOfBoundsException went away because my chroot string became shorter than the full path returned in the event.

        In any case, I believe the rest of my test is valid. If a single instance has multiple zookeeper clients connected to the same ensemble and they are completely disconnected, then the paths in events can be messed up.

        Matthias Spycher added a comment -

        Thomas is right. One of the unit tests in the patch to ZOOKEEPER-961 verifies that the StringIndexOutOfBoundsException no longer occurs. Also, if any path from the server is too short for the current chroot, we no longer throw the exception. Instead, a warning is issued and the raw path is passed in the event.
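The behavior described in this comment can be sketched as follows. This is an illustration only, not the ZOOKEEPER-961 patch itself: the class name `SafeChrootStrip`, the helper `stripChroot`, and the use of `System.err` for the warning are all assumptions made for the example. The idea is to check the path length before stripping the chroot and to fall back to the raw path with a warning when the path is too short.

```java
// Illustrative sketch of length-checked chroot stripping; NOT the ZOOKEEPER-961 patch.
final class SafeChrootStrip {

    // Hypothetical helper: returns the client-side path for an event, or the
    // raw server path (with a warning) when it is too short for the chroot.
    static String stripChroot(String chrootPath, String serverPath) {
        if (chrootPath == null) {
            return serverPath; // no chroot configured, nothing to strip
        }
        if (serverPath.equals(chrootPath)) {
            return "/";
        }
        if (serverPath.length() > chrootPath.length()) {
            return serverPath.substring(chrootPath.length());
        }
        // Path is too short for the chroot: warn and pass the raw path through.
        System.err.println("WARN: path " + serverPath
                + " is too short for chroot " + chrootPath);
        return serverPath;
    }

    public static void main(String[] args) {
        String chroot = "/blah/version1";
        System.out.println(stripChroot(chroot, "/blah/version1/node")); // prints /node
        System.out.println(stripChroot(chroot, "/node"));               // prints /node (raw path, after a warning)
    }
}
```

Passing the raw path through keeps the event deliverable, at the cost that the application may see a path that is not relative to its chroot.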


          People

          • Assignee: Unassigned
          • Reporter: zhangyouming
          • Votes: 0
          • Watchers: 4

          Time Tracking

          • Original Estimate: 1h
          • Remaining Estimate: 1h
          • Time Spent: Not Specified