Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.1
    • Fix Version/s: 3.2.1, 3.3.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      ZOOKEEPER-508 includes the fix.

      Description

      Here is the part of the log from when my ZooKeeper instance crashed. Three of the five servers went down, which broke the quorum for all clients:

      2009-07-23 12:29:06,769 WARN org.apache.zookeeper.server.NIOServerCnxn: Exception causing close of session 0x52276d1d5161350 due to java.io.IOException: Read error
      2009-07-23 12:29:00,756 WARN org.apache.zookeeper.server.quorum.Follower: Exception when following the leader
      java.io.EOFException
      at java.io.DataInputStream.readInt(DataInputStream.java:375)
      at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
      at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:65)
      at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
      at org.apache.zookeeper.server.quorum.Follower.readPacket(Follower.java:114)
      at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:243)
      at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:494)
      2009-07-23 12:29:06,770 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x52276d1d5161350 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.168:39489]
      2009-07-23 12:29:06,770 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x12276d15dfb0578 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.159:46797]
      2009-07-23 12:29:06,771 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x42276d1d3fa013e NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.153:33998]
      2009-07-23 12:29:06,771 WARN org.apache.zookeeper.server.NIOServerCnxn: Exception causing close of session 0x52276d1d5160593 due to java.io.IOException: Read error
      2009-07-23 12:29:06,808 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x32276d15d2e02bb NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.158:53758]
      2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x42276d1d3fa13e4 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.154:58681]
      2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x22276d15e691382 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.162:59967]
      2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x12276d15dfb1354 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.163:49957]
      2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x42276d1d3fa13cd NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.150:34212]
      2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x22276d15e691383 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.159:46813]
      2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x12276d15dfb0350 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.162:59956]
      2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x32276d15d2e139b NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.156:55138]
      2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x32276d15d2e1398 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.167:41257]
      2009-07-23 12:29:06,810 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x52276d1d5161355 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.153:34032]
      2009-07-23 12:29:06,810 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x52276d1d516011c NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.155:56314]
      2009-07-23 12:29:06,810 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x32276d15d2e056b NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.155:56322]
      2009-07-23 12:29:06,810 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x52276d1d516011f NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.157:49618]
      2009-07-23 12:29:06,810 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x32276d15d2e11ea NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.10.20.42:55483]
      2009-07-23 12:29:06,810 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x32276d15d2e02ba NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.157:49632]
      2009-07-23 12:29:06,810 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x12276d15dfb1355 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.169:58824]
      2009-07-23 12:29:06,810 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x22276d15e691378 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.161:40973]
      2009-07-23 12:29:06,811 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x22276d15e691380 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.162:59944]
      2009-07-23 12:29:06,811 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x32276d15d2e0311 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.160:56167]
      2009-07-23 12:29:06,811 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x22276d15e690374 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.169:58815]
      2009-07-23 12:29:06,811 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x32276d15d2e139f NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.151:51396]
      2009-07-23 12:29:06,811 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x32276d15d2e139c NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.155:56315]
      2009-07-23 12:29:06,811 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x22276d15e69137b NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.162:59859]
      2009-07-23 12:29:06,811 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x52276d1d5160594 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.151:51370]
      2009-07-23 12:29:06,811 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x22276d15e69137a NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.159:46682]
      2009-07-23 12:29:06,812 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x52276d1d5160347 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.165:35722]
      2009-07-23 12:29:06,812 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x22276d15e69137f NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.159:46754]
      2009-07-23 12:29:06,812 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x52276d1d5160121 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.155:56307]
      2009-07-23 12:29:06,812 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x12276d15dfb0126 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.154:58688]
      2009-07-23 12:29:06,812 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x42276d1d3fa05fc NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.152:45067]
      2009-07-23 12:29:06,812 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x32276d15d2e0316 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.169:58800]
      2009-07-23 12:29:06,812 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x22276d15e69137e NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.159:46737]
      2009-07-23 12:29:06,813 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x22276d15e69137d NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.159:46733]
      2009-07-23 12:29:06,813 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x42276d1d3fa13df NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.156:55137]
      2009-07-23 12:29:06,813 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x12276d15dfb134e NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.166:40443]
      2009-07-23 12:29:06,813 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x22276d15e691381 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.161:41086]
      2009-07-23 12:29:06,813 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x52276d1d5161356 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.165:35719]
      2009-07-23 12:29:06,813 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x12276d15dfb1349 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.158:53770]
      2009-07-23 12:29:06,813 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x12276d15dfb0352 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.165:35718]
      2009-07-23 12:29:06,813 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x22276d15e691379 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.162:59823]
      2009-07-23 12:29:06,814 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x52276d1d516000e NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.150:34216]
      2009-07-23 12:29:06,814 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x32276d15d2e1397 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.169:58829]
      2009-07-23 12:29:06,814 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x22276d15e69137c NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.162:59862]
      2009-07-23 12:29:06,814 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x42276d1d3fa0140 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.155:56271]
      2009-07-23 12:29:06,814 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x42276d1d3fa13e1 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.157:49608]
      2009-07-23 12:29:06,814 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x22276d15e691377 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.162:59789]
      2009-07-23 12:29:06,814 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x52276d1d5160593 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.165:35703]
      2009-07-23 12:29:06,814 INFO org.apache.zookeeper.server.FinalRequestProcessor: shutdown of request processor complete
      2009-07-23 12:29:06,814 INFO org.apache.zookeeper.server.quorum.FollowerRequestProcessor: FollowerRequestProcessor exited loop!
      2009-07-23 12:29:06,814 INFO org.apache.zookeeper.server.quorum.CommitProcessor: CommitProcessor exited loop!
      2009-07-23 12:29:06,815 INFO org.apache.zookeeper.server.quorum.Follower: shutdown called
      java.lang.Exception: shutdown Follower
      at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:427)
      at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:498)
      2009-07-23 12:29:06,815 WARN org.apache.zookeeper.server.NIOServerCnxn: Ignoring exception
      java.nio.channels.CancelledKeyException
      at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
      at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:69)
      at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:201)
      2009-07-23 12:29:06,815 INFO org.apache.zookeeper.server.quorum.QuorumPeer: LOOKING
      2009-07-23 12:29:06,817 WARN org.apache.zookeeper.server.NIOServerCnxn: Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
      2009-07-23 12:29:06,817 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x0 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.156:55206]
      2009-07-23 12:29:06,818 WARN org.apache.zookeeper.server.NIOServerCnxn: Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
      2009-07-23 12:29:06,818 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x0 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.155:56331]
      [elided lots of the same]
      2009-07-23 12:29:33,008 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x0 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 remote=/10.20.20.152:59458]
      2009-07-23 12:29:33,011 FATAL org.apache.zookeeper.server.SyncRequestProcessor: Severe unrecoverable error, exiting
      java.net.SocketException: Socket closed
      at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
      at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
      at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
      at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
      at org.apache.zookeeper.server.quorum.Follower.writePacket(Follower.java:100)
      at org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:52)
      at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:131)
      at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:76)

      The good news is that when I restarted the downed ZooKeeper servers, everything returned to normal.
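
      For readers wondering why losing 3 of 5 servers takes the whole ensemble offline: a minimal sketch of the arithmetic, assuming the default majority quorum. The class and method names (QuorumMath, majority, hasQuorum) are hypothetical and are not part of the ZooKeeper code base.

```java
// Minimal sketch of majority-quorum arithmetic for an ensemble of servers.
// QuorumMath and its methods are illustrative names, not ZooKeeper APIs.
final class QuorumMath {
    /** Smallest number of servers that forms a strict majority of the ensemble. */
    static int majority(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    /** True if the servers still running can form a majority quorum. */
    static boolean hasQuorum(int ensembleSize, int serversDown) {
        return ensembleSize - serversDown >= majority(ensembleSize);
    }

    public static void main(String[] args) {
        int ensemble = 5; // ensemble size from this report
        for (int down = 0; down <= ensemble; down++) {
            System.out.println(down + " down -> quorum: " + hasQuorum(ensemble, down));
        }
    }
}
```

      With 5 servers the majority is 3, so the ensemble tolerates 2 failures. With 3 down, only 2 remain and no quorum can form, which matches the client-visible outage described above.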

      1. QuorumTest.log
        533 kB
        Mahadev konar
      2. QuorumTest.log.gz
        3.16 MB
        Mahadev konar
      3. zklogs.tar.gz
        4.44 MB
        ryan rawson
      4. ZOOKEEPER-483.patch
        9 kB
        Benjamin Reed
      5. ZOOKEEPER-483.patch
        4 kB
        Benjamin Reed
      6. ZOOKEEPER-483.patch
        4 kB
        Benjamin Reed
      7. ZOOKEEPER-483.patch
        4 kB
        Benjamin Reed

        Issue Links

          Activity

          ryan rawson created issue -
          Patrick Hunt made changes -
          Field Original Value New Value
          Link This issue is related to ZOOKEEPER-485 [ ZOOKEEPER-485 ]
          ryan rawson made changes -
          Attachment zklogs.tar.gz [ 12414385 ]
          Mahadev konar made changes -
          Fix Version/s 3.2.1 [ 12314068 ]
          Benjamin Reed made changes -
          Attachment ZOOKEEPER-483.patch [ 12414405 ]
          Mahadev konar made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Fix Version/s 3.3.0 [ 12313976 ]
          Patrick Hunt made changes -
          Assignee Benjamin Reed [ breed ]
          Patrick Hunt made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Benjamin Reed made changes -
          Attachment ZOOKEEPER-483.patch [ 12415695 ]
          Benjamin Reed made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Benjamin Reed made changes -
          Attachment ZOOKEEPER-483.patch [ 12415974 ]
          Benjamin Reed made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Benjamin Reed made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Patrick Hunt made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Patrick Hunt made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Mahadev konar made changes -
          Attachment QuorumTest.log [ 12416251 ]
          Mahadev konar made changes -
          Attachment QuorumTest.log.gz [ 12416352 ]
          Benjamin Reed made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Benjamin Reed made changes -
          Attachment ZOOKEEPER-483.patch [ 12416645 ]
          Benjamin Reed made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Mahadev konar made changes -
          Link This issue relates to ZOOKEEPER-508 [ ZOOKEEPER-508 ]
          Mahadev konar made changes -
          Link This issue incorporates ZOOKEEPER-509 [ ZOOKEEPER-509 ]
          Mahadev konar made changes -
          Link This issue relates to ZOOKEEPER-508 [ ZOOKEEPER-508 ]
          Mahadev konar made changes -
          Link This issue is part of ZOOKEEPER-508 [ ZOOKEEPER-508 ]
          Mahadev konar made changes -
          Link This issue is blocked by ZOOKEEPER-508 [ ZOOKEEPER-508 ]
          Mahadev konar made changes -
          Link This issue is blocked by ZOOKEEPER-508 [ ZOOKEEPER-508 ]
          Mahadev konar made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Mahadev konar made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Hadoop Flags [Reviewed]
          Release Note ZOOKEEPER-508 includes the fix.
          Resolution Fixed [ 1 ]
          Patrick Hunt made changes -
          Status Resolved [ 5 ] Closed [ 6 ]

            People

            • Assignee:
              Benjamin Reed
              Reporter:
              ryan rawson
            • Votes:
              0
              Watchers:
              2
