ZooKeeper / ZOOKEEPER-251

NullPointerException stopping and starting Zookeeper servers

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 3.0.0, 3.0.1
    • Fix Version/s: 3.1.0
    • Component/s: server
    • Labels:
      None
    • Environment:

      Tested with JDK 1.5, Solaris, but I suspect it is not relevant in this case.

    • Hadoop Flags:
      Reviewed

      Description

      See the following thread for the original report:
      http://mail-archives.apache.org/mod_mbox/hadoop-zookeeper-user/200812.mbox/browser
      Steps to reproduce:
      1) Start a replicated ZooKeeper service consisting of 3 ZooKeeper (3.0.1) servers, all running on the same host (each using its own ports and log directories).
      2) Create one znode in this ensemble (using the ZooKeeper client console, I issued 'create /node1 node1data').
      3) Stop, then restart a single ZooKeeper server, moving on to the next one a few seconds later.
      4) Go back to step 3. After 4-5 iterations, the following should occur, with the failing server exiting:
      java.lang.NullPointerException
          at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:447)
          at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:358)
          at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.<init>(FileTxnLog.java:333)
          at org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:250)
          at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:102)
          at org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:183)
          at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:245)
          at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:421)
      2008-12-08 14:14:24,880 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2183:Leader@336] - Shutdown called
      java.lang.Exception: shutdown Leader! reason: Forcing shutdown
          at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:336)
          at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:427)
      Exception in thread "QuorumPeer:/0:0:0:0:0:0:0:0:2183" java.lang.NullPointerException
          at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:339)
          at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:427)

      The inputStream field is null, apparently because next() is being called
      at line 358 even after next() has returned false. Having very little
      knowledge of the implementation, I don't know whether the existence of a
      transaction with hdr.getZxid() >= zxid is supposed to be an invariant
      across all invocations of the server; however, the following change to
      FileTxnLog.java seems to make the problem go away.
      diff FileTxnLog.java /tmp/FileTxnLog.java
      358c358,359
      < next();
      ---
      > if (!next())
      >     return;
      447c448,450
      < inputStream.close();
      ---
      > if (inputStream != null) {
      >     inputStream.close();
      > }
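
      The two guards in the diff above can be illustrated with a small, self-contained sketch. This is not ZooKeeper's FileTxnLog: the class TxnIteratorSketch, its LogIterator, the helper methods, and the "log" format (a plain sequence of 8-byte zxids) are all hypothetical and exist only to show the pattern. The sketch checks the return value of next() inside init(), and makes next() safe to call again once the stream has been closed and set to null. It places the null check at the top of next() rather than around close(), which is a slightly different placement from the reporter's diff.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical, simplified stand-in for a transaction-log iterator.
// It is NOT ZooKeeper's FileTxnLog; the "log" is just a sequence of 8-byte zxids.
class TxnIteratorSketch {

    static final class LogIterator {
        private DataInputStream inputStream; // set to null once the log is exhausted
        private long currentZxid = -1;       // -1 means "no current record"

        LogIterator(byte[] log, long startZxid) throws IOException {
            inputStream = new DataInputStream(new ByteArrayInputStream(log));
            init(startZxid);
        }

        // Skip forward to the first record whose zxid is >= startZxid.
        // Every call to next() is checked, so we stop as soon as the log runs out.
        private void init(long startZxid) throws IOException {
            if (!next()) {
                return;
            }
            while (currentZxid < startZxid) {
                if (!next()) {
                    return;
                }
            }
        }

        // Advance to the next record; returns false when no record is left.
        boolean next() throws IOException {
            if (inputStream == null) {
                return false; // already exhausted: nothing to read or close
            }
            if (inputStream.available() < Long.BYTES) {
                inputStream.close();
                inputStream = null;
                currentZxid = -1;
                return false;
            }
            currentZxid = inputStream.readLong();
            return true;
        }

        long currentZxid() {
            return currentZxid;
        }
    }

    // Encode zxids as consecutive big-endian longs (the sketch's toy log format).
    static byte[] encode(long... zxids) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (long zxid : zxids) {
                out.writeLong(zxid);
            }
        }
        return bytes.toByteArray();
    }

    // Returns the first zxid >= startZxid, or -1 if the log holds none.
    static long firstAtOrAfter(long[] zxids, long startZxid) {
        try {
            LogIterator it = new LogIterator(encode(zxids), startZxid);
            long found = it.currentZxid();
            if (found == -1) {
                it.next(); // calling next() on an exhausted iterator is harmless
            }
            return found;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstAtOrAfter(new long[] {1, 2, 3}, 2)); // prints 2
        System.out.println(firstAtOrAfter(new long[] {1, 2, 3}, 5)); // prints -1
    }
}
```

      The second call in main() corresponds to the situation described in the report: no record satisfies zxid >= startZxid, so the iterator runs off the end of the log. Without the checks, the extra next() call would dereference the null stream.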

        Attachments

        1. ZOOKEEPER-251.patch
          5 kB
          Mahadev Konar
        2. ZOOKEEPER-251.patch
          0.9 kB
          Mahadev Konar

            People

            • Assignee: Mahadev Konar (mahadev)
            • Reporter: Thomas Vinod Johnson (vinodjohnson)
            • Votes: 0
            • Watchers: 2
