Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Data Collection
    • Labels:
      None

      Description

      Adaptor shutdown is currently squirrely and needs to be fixed.

      1. CHUKWA-420-restart.patch
        6 kB
        Ari Rabkin
      2. CHUKWA-420.patch
        15 kB
        Ari Rabkin

        Issue Links

          Activity

          Hide
          Ari Rabkin added a comment -

          We're currently halfway between two different models for adaptor shutdown.

          1) separate shutdown() and hardStop() methods for "graceful" and "abrupt" stops
          2) a single shutdown method with a flag specifying how graceful to be.

          This is on my mind because the generalized buffer support (in CHUKWA-395) needs to know the difference between "stop with the potential for restarting" and "stop for good".

          Thoughts and suggestions?

          Show
          Ari Rabkin added a comment - We're currently halfway between two different models for adaptor shutdown. 1) separate shutdown() and hardStop() methods for "graceful" and "abrupt" stops 2) a single shutdown method with a flag specifying how graceful to be. This is on my mind because the generalized buffer support (in CHUKWA-395 ) needs to know the difference between "stop with the potential for restarting" and "stop for good". Thoughts and suggestions?
          Hide
          Eric Yang added a comment -

          My recommendation is having shutdown(OPTION) function and internally, it calls shutdown() or hardStop().
          The existing adaptors needs to be cleaned up to use shutdown(OPTION).

          Show
          Eric Yang added a comment - My recommendation is having shutdown(OPTION) function and internally, it calls shutdown() or hardStop(). The existing adaptors needs to be cleaned up to use shutdown(OPTION).
          Hide
          Ari Rabkin added a comment -

          I agree that using shutdown(OPTION) makes sense. But which options do we need to support, and when should each be invoked?

          The three I see are:

          1) Stop now, expect to be restarted. Used to in the presence of a transient error.
          2) Stop now, permanently. In response to user request.
          3) Stop once all available data has been sent. In particular, keep tailing the file until all data has been sent.

          Show
          Ari Rabkin added a comment - I agree that using shutdown(OPTION) makes sense. But which options do we need to support, and when should each be invoked? The three I see are: 1) Stop now, expect to be restarted. Used to in the presence of a transient error. 2) Stop now, permanently. In response to user request. 3) Stop once all available data has been sent. In particular, keep tailing the file until all data has been sent.
          Hide
          Eric Yang added a comment -

          I think 3) has a limit of n seconds before forceful shutdown 2) is executed.

          Show
          Eric Yang added a comment - I think 3) has a limit of n seconds before forceful shutdown 2) is executed.
          Hide
          Ari Rabkin added a comment -

          Eric – why the limit? And how do you pick n? Why not keep trying until it's done?

          Show
          Ari Rabkin added a comment - Eric – why the limit? And how do you pick n? Why not keep trying until it's done?
          Hide
          Eric Yang added a comment -

          Adaptor will not finish tailing the file, if collector is down. Hence, graceful should have a timeout value to prevent infinite wait.

          Show
          Eric Yang added a comment - Adaptor will not finish tailing the file, if collector is down. Hence, graceful should have a timeout value to prevent infinite wait.
          Hide
          Ari Rabkin added a comment -

          Infinite wait, where? Presumably either the collector will eventually reboot, or else the agent itself will give up and exit. I think given that we have provision for the agent to bail out if there's no collector, there's no need for a separate adaptor-level version of that.

          Show
          Ari Rabkin added a comment - Infinite wait, where? Presumably either the collector will eventually reboot, or else the agent itself will give up and exit. I think given that we have provision for the agent to bail out if there's no collector, there's no need for a separate adaptor-level version of that.
          Hide
          Eric Yang added a comment -

          You are right. Provisioning is in Agent. I was confused. Thanks for clarifying this.

          Show
          Eric Yang added a comment - You are right. Provisioning is in Agent. I was confused. Thanks for clarifying this.
          Hide
          Ari Rabkin added a comment -

          Fairly modest refactoring. Switch to using shutdown(OPT) everywhere; remove deprecated methods.

          Show
          Ari Rabkin added a comment - Fairly modest refactoring. Switch to using shutdown(OPT) everywhere; remove deprecated methods.
          Hide
          Ari Rabkin added a comment -

          Lacking objections, I'm committing this.

          Show
          Ari Rabkin added a comment - Lacking objections, I'm committing this.
          Hide
          Hudson added a comment -

          Integrated in Chukwa-trunk #248 (See http://hudson.zones.apache.org/hudson/job/Chukwa-trunk/248/)
          . More refactoring
          . Clean up adaptor stop methods.

          Show
          Hudson added a comment - Integrated in Chukwa-trunk #248 (See http://hudson.zones.apache.org/hudson/job/Chukwa-trunk/248/ ) . More refactoring . Clean up adaptor stop methods.
          Hide
          Ari Rabkin added a comment -

          it turns out I also want a "RESTART" option indicating to an adaptor that it's likely to be restarted.

          Show
          Ari Rabkin added a comment - it turns out I also want a "RESTART" option indicating to an adaptor that it's likely to be restarted.
          Hide
          Ari Rabkin added a comment -

          passes unit tests and I will commit tomorrow barring objection. It's a bunch of quite minor refactorings to remove unnecessary dependences on shutdown options – adding default cases – plus adding a new option.

          Show
          Ari Rabkin added a comment - passes unit tests and I will commit tomorrow barring objection. It's a bunch of quite minor refactorings to remove unnecessary dependences on shutdown options – adding default cases – plus adding a new option.
          Hide
          Eric Yang added a comment -

          +1 looks good.

          Show
          Eric Yang added a comment - +1 looks good.
          Hide
          Ari Rabkin added a comment -

          I just committed this, adding the reset flag to shutdown.

          Show
          Ari Rabkin added a comment - I just committed this, adding the reset flag to shutdown.
          Hide
          Hudson added a comment -
          Show
          Hudson added a comment - Integrated in Chukwa-trunk #330 (See http://hudson.zones.apache.org/hudson/job/Chukwa-trunk/330/ )

            People

            • Assignee:
              Ari Rabkin
              Reporter:
              Ari Rabkin
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development