Bookkeeper
  1. Bookkeeper
  2. BOOKKEEPER-242

Bookkeeper not able to connect other zookeeper when shutdown the zookeeper server where the BK has connected.

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: 4.0.0
    • Fix Version/s: 4.1.0
    • Component/s: bookkeeper-server
    • Labels:
      None

      Description

      Scenario :
      1. Start three zookeeper cluster.
      2. start three bookKeeper.
      3. shutdown zookeeper server where bookkeeper has connected.

      Expected Result:
      bookkeeper should be able to connect other zookeeper

      Actual Result :
      All three bookkeepers retry to connect zookeeper node which has been shutdown.

      BookKeeper log for Retry :

       
      2012-04-25 13:35:15,319 - WARN  [main-EventThread:Bookie$2@456] - ZK client has been disconnected to the ZK server!
      2012-04-25 13:35:17,194 - INFO  [main-SendThread(HOST-10-18-40-91:2181):ClientCnxn$SendThread@933] - Opening socket connection to server HOST-10-18-40-91/10.18.40.91:2181
      2012-04-25 13:35:17,196 - WARN  [main-SendThread(HOST-10-18-40-91:2181):ClientCnxn$SendThread@1063] - Session 0x136e87b50ce0002 for server null, unexpected error, closing socket connection and attempting reconnect
      java.net.ConnectException: Connection refused
      	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
      	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
      	at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:264)
      	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1041)
      2012-04-25 13:35:19,125 - INFO  [main-SendThread(HOST-10-18-40-91:2181):ClientCnxn$SendThread@933] - Opening socket connection to server HOST-10-18-40-91/10.18.40.91:2181
      2012-04-25 13:35:19,126 - WARN  [main-SendThread(HOST-10-18-40-91:2181):ClientCnxn$SendThread@1063] - Session 0x136e87b50ce0002 for server null, unexpected error, closing socket connection and attempting reconnect
      java.net.ConnectException: Connection refused
      	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
      	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
      	at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:264)
      	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1041)
      2012-04-25 13:35:20,276 - INFO  [main-SendThread(HOST-10-18-40-91:2181):ClientCnxn$SendThread@933] - Opening socket connection to server HOST-10-18-40-91/10.18.40.91:2181
      2012-04-25 13:35:20,277 - WARN  [main-SendThread(HOST-10-18-40-91:2181):ClientCnxn$SendThread@1063] - Session 0x136e87b50ce0002 for server null, unexpected error, closing socket connection and attempting reconnect
      java.net.ConnectException: Connection refused
      	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
      	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
      	at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:264)
      	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1041)
      
      
      1. BK-242.diff_v2
        3 kB
        Sijie Guo
      2. BK-242.diff
        3 kB
        Sijie Guo

        Activity

        Hide
        Hudson added a comment -

        Integrated in bookkeeper-trunk #497 (See https://builds.apache.org/job/bookkeeper-trunk/497/)
        BOOKKEEPER-242: Bookkeeper not able to connect other zookeeper when shutdown the zookeeper server where the BK has connected. (sijie & rakeshr via ivank) (Revision 1335973)

        Result = SUCCESS
        ivank :
        Files :

        • /zookeeper/bookkeeper/trunk/CHANGES.txt
        • /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/conf/ClientConfiguration.java
        • /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/conf/ServerConfiguration.java
        • /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/test/ConfigurationTest.java
        Show
        Hudson added a comment - Integrated in bookkeeper-trunk #497 (See https://builds.apache.org/job/bookkeeper-trunk/497/ ) BOOKKEEPER-242 : Bookkeeper not able to connect other zookeeper when shutdown the zookeeper server where the BK has connected. (sijie & rakeshr via ivank) (Revision 1335973) Result = SUCCESS ivank : Files : /zookeeper/bookkeeper/trunk/CHANGES.txt /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/conf/ClientConfiguration.java /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/conf/ServerConfiguration.java /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/test/ConfigurationTest.java
        Hide
        Ivan Kelly added a comment -

        Committed as r1335973. Thanks Rakesh & Sijie.

        Show
        Ivan Kelly added a comment - Committed as r1335973. Thanks Rakesh & Sijie.
        Hide
        Sijie Guo added a comment -

        attach a new patch addressed Ivan and Rakesh's suggestion. thanks all.

        Show
        Sijie Guo added a comment - attach a new patch addressed Ivan and Rakesh's suggestion. thanks all.
        Hide
        Rakesh R added a comment -

        Yeah Ivan, StringUtils.join(servers, ',') would do the stuff and pretty simple.

        Also, its good practise(improve readability) to provide message in test assertions like, assertEquals( message, expected, actual)

        Show
        Rakesh R added a comment - Yeah Ivan, StringUtils.join(servers, ',') would do the stuff and pretty simple. Also, its good practise(improve readability) to provide message in test assertions like, assertEquals( message, expected, actual)
        Hide
        Ivan Kelly added a comment -

        Ah, i hadn't spotted there was already a patch. Could you change it to use StringUtils#join. I'll push it in then.

        Show
        Ivan Kelly added a comment - Ah, i hadn't spotted there was already a patch. Could you change it to use StringUtils#join. I'll push it in then.
        Hide
        Ivan Kelly added a comment -

        Making this a blocker as it seriously affects availability.

        Show
        Ivan Kelly added a comment - Making this a blocker as it seriously affects availability.
        Hide
        Ivan Kelly added a comment -

        This is very similar to BOOKKEEPER-171[1].

        I suggest we use the first Rakesh is suggesting, but instead of manually doing the string stuff, use StringUtils#join [2]. Internally the client configuration should store the list as a String[]

        [1] https://github.com/apache/bookkeeper/commit/3defe9e5110472da995789b5ae14c286f58f1293
        [2] http://commons.apache.org/lang/api-2.5/org/apache/commons/lang/StringUtils.html

        Show
        Ivan Kelly added a comment - This is very similar to BOOKKEEPER-171 [1] . I suggest we use the first Rakesh is suggesting, but instead of manually doing the string stuff, use StringUtils#join [2] . Internally the client configuration should store the list as a String[] [1] https://github.com/apache/bookkeeper/commit/3defe9e5110472da995789b5ae14c286f58f1293 [2] http://commons.apache.org/lang/api-2.5/org/apache/commons/lang/StringUtils.html
        Hide
        Rakesh R added a comment -

        Hi Sijie, I have few suggestions:

        comment-1) getStringArray() internally using getList() and preparing array of string. Instead can we directly use getList()?

        comment-2)

        +            sb.append(servers[i]);
        +            if (i < servers.length - 1) {
        +                sb.append(",");
        +            }
        

        instead of this can we use like,

          if( sb.length() > 0 ){
             sb.append(",");
          }
          sb.append(servers[i]);
        

        -Rakesh

        Show
        Rakesh R added a comment - Hi Sijie, I have few suggestions: comment-1) getStringArray() internally using getList() and preparing array of string. Instead can we directly use getList()? comment-2) + sb.append(servers[i]); + if (i < servers.length - 1) { + sb.append(","); + } instead of this can we use like, if( sb.length() > 0 ){ sb.append(","); } sb.append(servers[i]); -Rakesh
        Hide
        Sijie Guo added a comment -

        attach a patch to fix getZkServers issue. thanks Rakesh for help.

        Show
        Sijie Guo added a comment - attach a patch to fix getZkServers issue. thanks Rakesh for help.
        Hide
        Sijie Guo added a comment -

        @Rakesh

        Got it. Thanks for pointing out this issue.

        Show
        Sijie Guo added a comment - @Rakesh Got it. Thanks for pointing out this issue.
        Hide
        surendra singh lilhore added a comment -

        Thanks for looking into this.

        @Sijie
        I have 3 ZK servers and configured like:
        zkServers=10.18.40.155:2181,10.18.40.155:2182,10.18.40.155:2183

        -Surendra

        Show
        surendra singh lilhore added a comment - Thanks for looking into this. @Sijie I have 3 ZK servers and configured like: zkServers=10.18.40.155:2181,10.18.40.155:2182,10.18.40.155:2183 -Surendra
        Hide
        Rakesh R added a comment -

        @Sijie
        getString(ZK_SERVERS, null); is returning only the first server ip:port and ignoring values after the comma

        Say, configured as zkServers=zk1:2181,zk2:2181,zk3:2181
        In ServerConfiguration, it is using getString(ZK_SERVERS, null) and returning only the first zk1:2181 and the same is passing to the ZK client. Here it is missing the other two 'zk2:2181,zk3:2181' values.

        Show
        Rakesh R added a comment - @Sijie getString(ZK_SERVERS, null); is returning only the first server ip:port and ignoring values after the comma Say, configured as zkServers=zk1:2181,zk2:2181,zk3:2181 In ServerConfiguration, it is using getString(ZK_SERVERS, null) and returning only the first zk1:2181 and the same is passing to the ZK client. Here it is missing the other two 'zk2:2181,zk3:2181' values.
        Hide
        Sijie Guo added a comment -

        @Rakesh

        I don't think we need to parse the list of comma separated values. because, ZooKeeper client would handle it when constructing a ZooKeeper object.

        http://zookeeper.apache.org/doc/r3.4.3/api/org/apache/zookeeper/ZooKeeper.html#ZooKeeper(java.lang.String, int, org.apache.zookeeper.Watcher)

        @surendra

        could you tell more about your configuration about ZK_SERVERS ?

        Show
        Sijie Guo added a comment - @Rakesh I don't think we need to parse the list of comma separated values. because, ZooKeeper client would handle it when constructing a ZooKeeper object. http://zookeeper.apache.org/doc/r3.4.3/api/org/apache/zookeeper/ZooKeeper.html#ZooKeeper(java.lang.String , int, org.apache.zookeeper.Watcher) @surendra could you tell more about your configuration about ZK_SERVERS ?
        Hide
        Rakesh R added a comment -

        ServerConfiguration is a type of 'org.apache.commons.configuration.CompositeConfiguration' and for parsing the list of comma separated values, we need to use config.getStringArray() or config.getList().

        But it is using getString() and I think that is the reason not retrying to the cluster when the connected ZK is down. From the logs, its trying only HOST-10-18-40-91/10.18.40.91:2181

            public String getZkServers() {
                return getString(ZK_SERVERS, null);
            }
        

        -Rakesh

        Show
        Rakesh R added a comment - ServerConfiguration is a type of 'org.apache.commons.configuration.CompositeConfiguration' and for parsing the list of comma separated values, we need to use config.getStringArray() or config.getList(). But it is using getString() and I think that is the reason not retrying to the cluster when the connected ZK is down. From the logs, its trying only HOST-10-18-40-91/10.18.40.91:2181 public String getZkServers() { return getString(ZK_SERVERS, null); } -Rakesh

          People

          • Assignee:
            Sijie Guo
            Reporter:
            surendra singh lilhore
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development