PHOENIX-2496

ShutdownHook preventing JVM from exiting after SIGTERM


Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 4.7.0

    Description

      Carter Shanklin reported a case where sending SIGTERM to the Phoenix QueryServer (PQS) did not cause it to exit. I've been able to reproduce this:

      1. Start HBase and PQS
      2. Stop HBase master
      3. Try to run a query through PQS
      4. kill -15 <pqs_pid>

      At this point, the thread from step 3 is still running in PQS, trying to connect to HBase (following the normal HBase retry policy, which retries for a period on the order of minutes). The ShutdownHook, which attempts to clean up gracefully, blocks while trying to close the instance because the read lock is still held by the step 3 query. Because the JVM keeps running until all shutdown hooks return, the outward effect is that PQS stays up until HBase becomes available or the HBase retries time out.
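
      The blocking described above can be illustrated with a small standalone sketch. It is not Phoenix code: it assumes the close path needs the write side of a read/write lock whose read side the in-flight query holds, and the class and method names (BlockedCloseDemo, closeAcquiresLock) are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo, not Phoenix code: a "query" thread holds the read lock
// while a "close" attempt tries to take the write lock.
public class BlockedCloseDemo {

    /**
     * Returns true if the close side obtained the write lock within
     * waitMillis while the query side was still holding the read lock.
     */
    public static boolean closeAcquiresLock(long waitMillis) {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        CountDownLatch readLockHeld = new CountDownLatch(1);
        CountDownLatch releaseQuery = new CountDownLatch(1);

        Thread query = new Thread(() -> {
            lock.readLock().lock();
            try {
                readLockHeld.countDown();
                // Stands in for the long HBase connection retry loop.
                releaseQuery.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.readLock().unlock();
            }
        }, "query");
        query.setDaemon(true);
        query.start();

        try {
            // Wait until the query thread really holds the read lock.
            readLockHeld.await();
            // The close side: an untimed lock() here would block until the
            // query finishes; tryLock gives up after waitMillis instead.
            boolean acquired =
                lock.writeLock().tryLock(waitMillis, TimeUnit.MILLISECONDS);
            if (acquired) {
                lock.writeLock().unlock();
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            // Let the query thread finish so the demo cleans up after itself.
            releaseQuery.countDown();
        }
    }
}
```

      With an untimed lock acquisition in place of tryLock, the calling thread (here, the shutdown hook) would wait for as long as the query keeps retrying.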

      While the system will eventually recover on its own, it's awkward to send SIGTERM to a process and not have it die within a few seconds. The code around the shutdown hook registration also suggests that the blocking is unintentional.

      A simple fix is to wrap the PhoenixDriver closing in a timeout so that we don't rely on the HBase timeout to exit the JVM.
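
      One way the proposed fix could look is sketched below. This is a minimal illustration, not the committed patch: the helper closeWithTimeout, the class TimedShutdown, the 5 second limit, and the no-op closeDriver action standing in for the PhoenixDriver close are all assumptions.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: bound how long a shutdown hook waits for a close action.
public class TimedShutdown {

    /**
     * Runs closeAction on a separate thread and waits at most timeoutMillis.
     * Returns true only if the action finished normally within the limit.
     */
    public static boolean closeWithTimeout(Runnable closeAction, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "driver-close");
            // Daemon, so a stuck close never keeps the JVM alive by itself
            // if this helper is also called outside a shutdown hook.
            t.setDaemon(true);
            return t;
        });
        Future<?> future = executor.submit(closeAction);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Give up waiting; interrupt the close attempt if it is interruptible.
            future.cancel(true);
            return false;
        } catch (ExecutionException e) {
            // The close action itself threw an exception.
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for closing the PhoenixDriver instance.
        Runnable closeDriver = () -> { };

        // The hook returns after at most about 5 seconds, so the JVM can exit
        // even if the close is blocked behind a long-running query.
        Runtime.getRuntime().addShutdownHook(new Thread(
            () -> closeWithTimeout(closeDriver, 5000), "shutdown-hook"));
    }
}
```

      The trade-off in this sketch is that a timed-out close skips whatever cleanup was still pending, which seems acceptable when the process has already been asked to terminate.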

          People

            elserj Josh Elser