ZooKeeper / ZOOKEEPER-1612

ZooKeeper unable to recover and start once dataDir disk is full and disk space is cleared


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 3.4.3
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

      Once the disk holding the ZooKeeper data directory becomes full, the server process shuts down with the following error:

      2012-12-14 13:22:26,959 [myid:2] - ERROR [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@276] - Severe unrecoverable error, exiting
      java.io.IOException: No space left on device
      	at java.io.FileOutputStream.writeBytes(Native Method)
      	at java.io.FileOutputStream.write(FileOutputStream.java:282)
      	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
      	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
      	at java.util.zip.CheckedOutputStream.write(CheckedOutputStream.java:56)
      	at java.io.DataOutputStream.write(DataOutputStream.java:90)
      	at java.io.FilterOutputStream.write(FilterOutputStream.java:80)
      	at org.apache.jute.BinaryOutputArchive.writeBuffer(BinaryOutputArchive.java:119)
      	at org.apache.zookeeper.server.DataNode.serialize(DataNode.java:168)
      	at org.apache.jute.BinaryOutputArchive.writeRecord(BinaryOutputArchive.java:123)
      	at org.apache.zookeeper.server.DataTree.serializeNode(DataTree.java:1115)
      	at org.apache.zookeeper.server.DataTree.serializeNode(DataTree.java:1130)
      	at org.apache.zookeeper.server.DataTree.serializeNode(DataTree.java:1130)
      	at org.apache.zookeeper.server.DataTree.serialize(DataTree.java:1179)
      	at org.apache.zookeeper.server.util.SerializeUtils.serializeSnapshot(SerializeUtils.java:138)
      	at org.apache.zookeeper.server.persistence.FileSnap.serialize(FileSnap.java:213)
      	at org.apache.zookeeper.server.persistence.FileSnap.serialize(FileSnap.java:230)
      	at org.apache.zookeeper.server.persistence.FileTxnSnapLog.save(FileTxnSnapLog.java:242)
      	at org.apache.zookeeper.server.ZooKeeperServer.takeSnapshot(ZooKeeperServer.java:274)
      	at org.apache.zookeeper.server.quorum.Learner.syncWithLeader(Learner.java:407)
      	at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:82)
      	at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:759)
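
The trace above shows the write failing partway through snapshot serialization, which can leave a partial file on disk. One general way to avoid exposing half-written files is to write to a temporary file and rename it into place only after the write succeeds. The following is a minimal sketch of that technique, not how ZooKeeper 3.4.3 writes snapshots; the class and method names (`AtomicFileWriter`, `writeAtomically`) are hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, for illustration only.
public class AtomicFileWriter {

    // Writes data to a temporary file next to the target, forces it to disk,
    // then renames it over the target. If any step fails (for example with
    // "No space left on device"), the target path is never created half-written.
    public static void writeAtomically(Path target, byte[] data) throws IOException {
        Path parent = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(parent, target.getFileName().toString(), ".tmp");
        try {
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
                ByteBuffer buf = ByteBuffer.wrap(data);
                while (buf.hasRemaining()) {
                    ch.write(buf);
                }
                ch.force(true); // flush file content and metadata before the rename
            }
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; removes the leftover on failure.
            Files.deleteIfExists(tmp);
        }
    }
}
```

Whether `ATOMIC_MOVE` replaces an existing target is implementation specific, so this sketch assumes the target does not exist yet, as is the case for a newly named snapshot file.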
      

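      Later, disk space was cleared and the ZooKeeper server was started again. Startup fails because the server cannot load its database from disk: after reading the snapshot, it hits an EOFException while reading a transaction log file header. Because the load from disk fails, the server cannot join its peers in the quorum and get a snapshot diff.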

      2012-12-14 16:20:31,489 [myid:2] - INFO  [main:FileSnap@83] - Reading snapshot ../dataDir/version-2/snapshot.1000000042
      2012-12-14 16:20:31,564 [myid:2] - ERROR [main:QuorumPeer@472] - Unable to load database on disk
      java.io.EOFException
      	at java.io.DataInputStream.readInt(DataInputStream.java:375)
      	at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
      	at org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:64)
      	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:558)
      	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:577)
      	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:543)
      	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:625)
      	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:529)
      	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.<init>(FileTxnLog.java:504)
      	at org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:341)
      	at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:132)
      	at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
      	at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:436)
      	at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:428)
      	at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:152)
      	at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
      	at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
      2012-12-14 16:20:31,566 [myid:2] - ERROR [main:QuorumPeerMain@89] - Unexpected exception, exiting abnormally
      java.lang.RuntimeException: Unable to run quorum server 
      	at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:473)
      	at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:428)
      	at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:152)
      	at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
      	at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
      Caused by: java.io.EOFException
      	at java.io.DataInputStream.readInt(DataInputStream.java:375)
      	at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
      	at org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:64)
      	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:558)
      	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:577)
      	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:543)
      	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:625)
      	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:529)
      	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.<init>(FileTxnLog.java:504)
      	at org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:341)
      	at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:132)
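
The `EOFException` in `FileHeader.deserialize` suggests that a transaction log file ends before its header does. As a diagnostic aid, the sketch below lists files in a directory that are too short to hold a header. It is not part of ZooKeeper: `TruncatedLogFinder` and `findTruncated` are hypothetical names, and the 16-byte header size (int magic, int version, long dbid) and the `log.` file prefix are assumptions that should be checked against the version in use.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper, for illustration only.
public class TruncatedLogFinder {

    // Assumption: the file header is int magic + int version + long dbid.
    public static final long HEADER_BYTES = 4 + 4 + 8;

    // Returns the regular files in dir whose names start with prefix and
    // whose size is below minBytes, sorted by path.
    public static List<Path> findTruncated(Path dir, String prefix, long minBytes)
            throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, prefix + "*")) {
            for (Path f : files) {
                if (Files.isRegularFile(f) && Files.size(f) < minBytes) {
                    result.add(f);
                }
            }
        }
        Collections.sort(result);
        return result;
    }
}
```

A call such as `TruncatedLogFinder.findTruncated(Paths.get("../dataDir/version-2"), "log.", TruncatedLogFinder.HEADER_BYTES)` would report candidate files for inspection. The sketch only reports them; what to do with a truncated file is left to the operator.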
      
       

          People

            Assignee: Unassigned
            Reporter: suja s