  1. ZooKeeper
  2. ZOOKEEPER-1109

Zookeeper service is down when SyncRequestProcessor meets any exception.


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 3.3.0, 3.3.1, 3.3.2, 3.3.3
    • Fix Version/s: 3.4.0
    • Component/s: quorum
    • Labels: None
    • Hadoop Flags: Reviewed
    • Tags: quorum, leader, disk full, shutdown

    Description

      Problem
      ZooKeeper does not shut down completely when the dataDir disk is full, and the ZK cluster is left in an unserviceable state.

      Scenario
      If the disk on the leader ZooKeeper server fills up, the server tries to shut down, but it waits indefinitely while shutting down the SyncRequestProcessor thread.

      Root Cause
      this.join() is invoked against the same thread that triggered System.exit(11), so the join can never return.

      When the disk fills up, the SyncRequestProcessor thread gets a 'No space left on device' exception and invokes System.exit(11) (the logs from the failure show the same). Before the JVM exits, ZK executes the ShutdownHook of QuorumPeerMain, and the flow reaches SyncRequestProcessor.shutdown(). There, this.join() waits for the very thread that invoked System.exit(11); that thread is itself blocked inside System.exit(11) until the shutdown hook finishes, so neither side can make progress.
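
      The cycle described above can be reproduced in isolation. The following is a minimal sketch, not ZooKeeper code and not the attached patch: the class name ShutdownJoinSketch, the method hookJoinTimesOut and the thread names are hypothetical. A plain hook.join() stands in for System.exit blocking until shutdown hooks finish, and the hook uses a bounded join so the program terminates instead of hanging. The bounded join timing out with the thread still alive is what an unbounded this.join() would turn into an indefinite wait.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical model of the reported hang; none of these names exist in ZooKeeper.
class ShutdownJoinSketch {

    /**
     * Returns true if the hook's bounded join timed out while the "exiting"
     * thread was still alive. timeoutMillis must be greater than 0, because
     * Thread.join(0) waits forever.
     */
    static boolean hookJoinTimesOut(long timeoutMillis) {
        AtomicBoolean timedOut = new AtomicBoolean(false);
        Thread[] exiting = new Thread[1];

        // Stand-in for the shutdown hook that ends up calling join() on the
        // thread that requested the exit.
        Thread hook = new Thread(() -> {
            try {
                exiting[0].join(timeoutMillis);
                timedOut.set(exiting[0].isAlive());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "hook");

        // Stand-in for the thread that hit the I/O error and requested the exit.
        exiting[0] = new Thread(() -> {
            hook.start();
            try {
                // Models System.exit: block until the hook has finished.
                hook.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "sync-processor");

        exiting[0].start();
        try {
            exiting[0].join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return timedOut.get();
    }

    public static void main(String[] args) {
        System.out.println("hook join timed out: " + hookJoinTimesOut(200));
    }
}
```

      In this model the only thing that breaks the cycle is the timeout. Other ways out would be to skip the join when the thread being joined is the one that requested the exit, or to request the exit from a different thread; which approach the actual fix takes is not stated in this description.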

      Attachments

        1. ZOOKEEPER-1109.patch
          1 kB
          Laxman
        2. ZOOKEEPER-1109.1.patch
          1 kB
          Laxman

            People

              Assignee: Laxman
              Reporter: Laxman
              Votes: 0
              Watchers: 2

                Time Tracking

                  Original Estimate: 72h
                  Remaining Estimate: 72h
                  Time Spent: Not Specified