Uploaded image for project: 'ZooKeeper'
  1. ZooKeeper
  2. ZOOKEEPER-4419

The potential exception in Learner$LeaderConnector.connectToLeader may cause unnecessary re-election or service delay

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 3.6.2
    • None
    • server

    Description

      We were doing some testing in ZooKeeper 3.6.2, and found that sometimes a follower may take longer time (about 10s) to join the quorum. In our cluster, usually we expect that the leader and followers are ready to provide service in 1 or 2 seconds.

      Analysis

      We did some investigations and found that the root cause is the exception thrown from `Learner.createSocket` (https://github.com/apache/zookeeper/blob/release-3.6.2/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Learner.java#L410-L423).

      `Learner.createSocket` is used in `Learner$LeaderConnector.connectToLeader`.

      The code snippet (https://github.com/apache/zookeeper/blob/release-3.6.2/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Learner.java#L344-L407) is as follows:

              private Socket connectToLeader() throws IOException, X509Exception, InterruptedException {
                  Socket sock = createSocket();               // line 345
                  // ...
                  int remainingTimeout;
                  long startNanoTime = nanoTime();
                  for (int tries = 0; tries < 5 && socket.get() == null; tries++) {
                      try {
                          remainingTimeout = connectTimeout - (int) ((nanoTime() - startNanoTime) / 1_000_000);
                          if (remainingTimeout <= 0) {
                              LOG.error(...);
                              throw new IOException("connectToLeader exceeded on retries.");
                          }
                          sockConnect(sock, address, Math.min(connectTimeout, remainingTimeout));
                          // ...
                          break;
                      } catch (IOException e) {
                          remainingTimeout = connectTimeout - (int) ((nanoTime() - startNanoTime) / 1_000_000);
                          if (remainingTimeout <= leaderConnectDelayDuringRetryMs) {
                              LOG.error(...);
                              throw e;
                          } else if (tries >= 4) {
                              LOG.error(...);
                              throw e;
                          } else {
                              LOG.warn(...);
                              sock = createSocket();          // line 400
                          }
                      }
                      Thread.sleep(leaderConnectDelayDuringRetryMs);
                  }
                  return sock;
              }
      

      The retry (5 times) in this code implies that sometimes exceptions are expected to appear, and the program should proceed to try again. However, `Learner.createSocket` is invoked in either line 345 or line 400. And it may also throw exceptions. But it’s not within the try block. Therefore, this exception will turn to `Follower.followLeader` (the caller of `Learner$LeaderConnector.connectToLeader`). And then this server will change from FOLLOWING state to LOOKING state.

      `Learner.createSocket` may throw exceptions in line 417 or 421:

          /**
           * Creating a simple or and SSL socket.
           * This can be overridden in tests to fake already connected sockets for connectToLeader.
           */
          protected Socket createSocket() throws X509Exception, IOException {
              Socket sock;
              if (self.isSslQuorum()) {
                  sock = self.getX509Util().createSSLSocket();    // line 417
              } else {
                  sock = new Socket();
              }
              sock.setSoTimeout(self.tickTime * self.initLimit);  // line 421
              return sock;
          } 

      The reason for throwing the exception may be https://stackoverflow.com/questions/22423063/java-exception-on-sslsocket-creation or https://stackoverflow.com/questions/7064788/when-does-java-net-socket-setsotimeout-throws-socketexception.

      We reproduced this bug by intentionally injecting an IOException right before the invocation of `Learner.createSocket` in line 345. We only grant one single injection in a testing run of a ZooKeeper cluster (3 nodes). Our analysis proves to be true. In terms of the symptom, sometimes the affected follower takes more time to join the quorum; sometimes the leader election is also affected, and the whole cluster does the re-election, meaning that every node is affected.

      Fix

      We propose that the `Learner.createSocket` should be moved to the try-catch block. We have created a pull request.

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              functioner Haoze Wu
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 20m
                  20m