  Apache Storm / STORM-1879

Supervisor may not shut down workers cleanly

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.1
    • Fix Version/s: 2.0.0, 1.0.2, 1.1.0
    • Component/s: storm-core
    • Labels: None

      Description

      We've run into a strange issue with a zombie worker process. It looks like the worker pid file somehow got deleted without the worker process shutting down. As a result, the supervisor repeatedly tries and fails to kill the worker, and multiple workers may end up assigned to the same port. The worker root folder sticks around because the worker is still heartbeating to it.
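
      To make the failure mode concrete, here is a minimal diagnostic sketch. It is not Storm's supervisor code. It assumes each worker has a directory whose file names are the worker's pids, and that the caller can supply the time of the worker's last heartbeat; the class name, method names, and the timeout parameter are all hypothetical. A worker with no pid file left but a fresh heartbeat matches the zombie state described above: nothing is left to tell the supervisor which process to kill.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

/** Hypothetical diagnostic for the state in this report; not part of Storm. */
final class ZombieWorkerCheck {

    /**
     * Lists the file names in pidsDir. Assumption: one file per worker pid,
     * named after the pid. A missing directory yields an empty list.
     */
    static List<String> recordedPids(Path pidsDir) throws IOException {
        List<String> pids = new ArrayList<>();
        if (!Files.isDirectory(pidsDir)) {
            return pids;
        }
        try (Stream<Path> entries = Files.list(pidsDir)) {
            entries.forEach(p -> pids.add(p.getFileName().toString()));
        }
        return pids;
    }

    /**
     * A worker looks like a zombie when no pid is recorded for it,
     * yet its last heartbeat is still within the timeout.
     */
    static boolean looksLikeZombie(List<String> pids, long lastHeartbeatMillis,
                                   long nowMillis, long timeoutMillis) {
        boolean heartbeatFresh = nowMillis - lastHeartbeatMillis <= timeoutMillis;
        return pids.isEmpty() && heartbeatFresh;
    }

    public static void main(String[] args) throws IOException {
        // The pid directory path is supplied by the caller; "pids" is a placeholder.
        Path pidsDir = Paths.get(args.length > 0 ? args[0] : "pids");
        System.out.println("Recorded pids: " + recordedPids(pidsDir));
    }
}
```

      A check like this only flags the suspicious state; finding and stopping the orphaned process would still have to be done by other means, for example by looking up which process holds the worker port.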

      It may or may not be related, but we've also seen Nimbus occasionally enter an infinite loop of printing logs like the following:

      2016-05-19 14:55:14.196 o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with keyZendeskTicketTopology-5-1463647641-stormconf.ser
      2016-05-19 14:55:14.210 o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with keyZendeskTicketTopology-5-1463647641-stormcode.ser
      2016-05-19 14:55:14.218 o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with keyZendeskTicketTopology-5-1463647641-stormconf.ser
      2016-05-19 14:55:14.256 o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with keyZendeskTicketTopology-5-1463647641-stormcode.ser
      2016-05-19 14:55:14.273 o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with keyZendeskTicketTopology-5-1463647641-stormcode.ser
      2016-05-19 14:55:14.316 o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with keyZendeskTicketTopology-5-1463647641-stormconf.ser
      

      This continues until Nimbus is rebooted. We also see repeating blocks similar to the logs below.

      2016-06-02 07:45:03.656 o.a.s.d.nimbus [INFO] Cleaning up ZendeskTicketTopology-127-1464780171
      2016-06-02 07:45:04.132 o.a.s.d.nimbus [INFO] ExceptionKeyNotFoundException(msg:ZendeskTicketTopology-127-1464780171-stormjar.jar)
      2016-06-02 07:45:04.144 o.a.s.d.nimbus [INFO] ExceptionKeyNotFoundException(msg:ZendeskTicketTopology-127-1464780171-stormconf.ser)
      2016-06-02 07:45:04.155 o.a.s.d.nimbus [INFO] ExceptionKeyNotFoundException(msg:ZendeskTicketTopology-127-1464780171-stormcode.ser)
      

        Attachments

        1. nimbus-supervisor.zip
          6.42 MB
          Stig Rohde Døssing
        2. supervisor.log
          1.10 MB
          Nico Meyer
        3. fix_missing_worker_pid.patch
          1 kB
          Nico Meyer

              People

              • Assignee: Jungtaek Lim (kabhwan)
              • Reporter: Stig Rohde Døssing (srdo)
              • Votes: 5
              • Watchers: 9

                Dates

                • Created:
                  Updated:
                  Resolved: