  1. ZooKeeper
  2. ZOOKEEPER-2447

ZooKeeper adds a considerable delay when one of the quorum hosts is not reachable


Details

    • Bug
    • Status: Patch Available
    • Major
    • Resolution: Unresolved
    • 3.4.9
    • 3.4.9
    • None
    • None

    Description

      StaticHostProvider.resolveAndShuffle adds every quorum address that resolves successfully to a list, shuffles the list, and returns it to the client connection class. If, after shuffling, the first node happens to be one that is not reachable, ClientCnxn.SendThread.run keeps trying to connect to the failed node until a timeout expires and only then moves on to a different node. This adds a random delay to ZooKeeper connection setup whenever a host is down. Instead, we could check whether each host is reachable in StaticHostProvider and skip it when isReachable returns false, the same way we already skip hosts that throw UnknownHostException.
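
      To make the cost described above concrete, here is a minimal sketch of the delay as a simplified model. It is not ZooKeeper code: the class ConnectDelayModel, its method, and the per-host timeout parameter are all hypothetical, and the model assumes each failed attempt costs exactly one fixed timeout.

```java
import java.util.List;

// Hypothetical, simplified model of the delay; not part of ZooKeeper.
final class ConnectDelayModel {

    // Returns the milliseconds spent on failed attempts before the first
    // reachable host in the (already shuffled) list is tried, or -1 if no
    // host in the list is reachable.
    static long delayBeforeFirstReachable(List<Boolean> reachableInOrder,
                                          long perHostTimeoutMs) {
        long waited = 0;
        for (boolean reachable : reachableInOrder) {
            if (reachable) {
                return waited;
            }
            // Assumption: every unreachable host costs one full timeout.
            waited += perHostTimeoutMs;
        }
        return -1;
    }
}
```

      In this model the delay depends only on how many unreachable hosts the shuffle happens to place before the first reachable one, which is why the observed delay looks random from run to run.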

      This can be tested with the following test code by providing a valid host that is not reachable. For a quick test, comment out Collections.shuffle(tmpList, sourceOfRandomness); in StaticHostProvider.resolveAndShuffle.

        @Test
        public void test() throws Exception {
          EventsWatcher watcher = new EventsWatcher();
          // Start a one-server quorum to provide the reachable host.
          QuorumUtil qu = new QuorumUtil(1);
          qu.startAll();

          // Put the valid but unreachable host first in the connect string.
          ZooKeeper zk =
              new ZooKeeper("<hostname>:2181," + qu.getConnString(), 180 * 1000, watcher);

          watcher.waitForConnected(CONNECTION_TIMEOUT * 5);
          Assert.assertTrue("connection Established", watcher.isConnected());
          zk.close();
        }
      

      The following fix can be added to StaticHostProvider.resolveAndShuffle:

       // Add the address only if it answers within the timeout (4000 ms here; can be some other value).
       if (taddr.isReachable(4000)) {
         tmpList.add(new InetSocketAddress(taddr, address.getPort()));
       }
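
      For context, the proposed check could sit in a resolve-and-shuffle loop roughly as sketched below. This is a standalone illustration, not the actual StaticHostProvider code or any of the attached patches: the class ReachableHostFilter, the probe helper, and the injectable Predicate are hypothetical names introduced here so the filter can be exercised without a real network.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.function.Predicate;

// Hypothetical helper, not part of ZooKeeper: illustrates the proposed filter.
final class ReachableHostFilter {

    // Reachability probe based on InetAddress.isReachable(timeoutMs).
    static Predicate<InetAddress> probe(int timeoutMs) {
        return addr -> {
            try {
                return addr.isReachable(timeoutMs);
            } catch (IOException e) {
                return false; // treat a network error as "not reachable"
            }
        };
    }

    // Resolves each server, keeps only addresses the predicate accepts,
    // then shuffles the survivors.
    static List<InetSocketAddress> resolveAndShuffle(
            Collection<InetSocketAddress> servers,
            Predicate<InetAddress> reachable,
            Random sourceOfRandomness) {
        List<InetSocketAddress> tmpList = new ArrayList<>(servers.size());
        for (InetSocketAddress address : servers) {
            try {
                for (InetAddress taddr
                        : InetAddress.getAllByName(address.getHostString())) {
                    if (reachable.test(taddr)) {
                        tmpList.add(new InetSocketAddress(taddr, address.getPort()));
                    }
                }
            } catch (UnknownHostException e) {
                // Unresolvable hosts are skipped, as the description notes.
            }
        }
        Collections.shuffle(tmpList, sourceOfRandomness);
        return tmpList;
    }
}
```

      A caller would pass ReachableHostFilter.probe(4000) as the predicate. Two caveats apply to this sketch: InetAddress.isReachable typically relies on ICMP echo or a TCP connection to port 7, so a firewalled but healthy server may be reported as unreachable, and the resulting list can be empty if every host fails the probe, so a real fix would need to decide how to handle that case.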
      

      Attachments

        1. ZOOKEEPER-2447-MinConnectTimeoutOnly.patch
          2 kB
          Dan Benediktson
        2. ZOOKEEPER-2447.branch-3.4.01.patch
          8 kB
          Vishal Khandelwal
        3. ZOOKEEPER-2447.branch-3.4.00.patch
          8 kB
          Vishal Khandelwal
        4. ZOOKEEPER-2447.3.5.patch
          2 kB
          Vishal Khandelwal
        5. withoutFix.txt
          53 kB
          Vishal Khandelwal
        6. withfix.txt
          52 kB
          Vishal Khandelwal

          People

            vishk Vishal Khandelwal
            vishk Vishal Khandelwal
