Sling / SLING-5965

Metrics and a Health-Check for Scheduler to detect long-running Quartz-Jobs

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Commons Scheduler 2.5.0
    • Fix Version/s: Commons Scheduler 2.7.0
    • Component/s: Commons
    • Labels: None

    Description

      Sling Scheduler jobs (aka Quartz jobs) should typically be fast-running. They are served from a thread pool and should occupy a thread only for a short amount of time.

      If there are 'misbehaving' Quartz jobs that run for a very long time, they occupy threads from that thread pool and thus degrade the performance of other scheduled Quartz jobs.

      We should have metrics (using sling.commons.metrics) that expose internals of the Sling Scheduler, such as the average and maximum duration of scheduled jobs, how many jobs are currently running, and how long the oldest running job has been running.
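
      The bookkeeping behind such metrics could look roughly like the following. This is a minimal plain-JDK sketch, not the actual patch and not the sling.commons.metrics API: the class RunningJobTracker, its method names, and the injectable clock are all hypothetical and serve only to illustrate which values would be fed into gauges and timers.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical tracker; not the Sling Scheduler's actual implementation. */
class RunningJobTracker {
    private final LongSupplier clockMillis; // injectable clock, eases testing
    private final Map<Long, Long> startTimes = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong();
    private final AtomicLong finishedCount = new AtomicLong();
    private final AtomicLong totalDuration = new AtomicLong();
    private final AtomicLong maxDuration = new AtomicLong();

    RunningJobTracker(LongSupplier clockMillis) {
        this.clockMillis = clockMillis;
    }

    /** Call when a job starts; returns a token to pass to jobFinished. */
    long jobStarted() {
        long id = nextId.incrementAndGet();
        startTimes.put(id, clockMillis.getAsLong());
        return id;
    }

    /** Call when a job ends, ideally from a finally block. */
    void jobFinished(long id) {
        Long start = startTimes.remove(id);
        if (start == null) {
            return; // unknown or already finished token
        }
        long duration = clockMillis.getAsLong() - start;
        finishedCount.incrementAndGet();
        totalDuration.addAndGet(duration);
        maxDuration.accumulateAndGet(duration, Math::max);
    }

    /** Number of jobs currently running. */
    int runningJobs() {
        return startTimes.size();
    }

    /** Age in ms of the oldest running job, or 0 if none is running. */
    long oldestRunningJobAgeMillis() {
        long now = clockMillis.getAsLong();
        long oldestStart = now;
        for (long start : startTimes.values()) {
            oldestStart = Math.min(oldestStart, start);
        }
        return now - oldestStart;
    }

    /** Average duration in ms of finished jobs, or 0 if none finished yet. */
    double averageDurationMillis() {
        long n = finishedCount.get();
        return n == 0 ? 0.0 : (double) totalDuration.get() / n;
    }

    /** Longest duration in ms among finished jobs. */
    long maxDurationMillis() {
        return maxDuration.get();
    }
}
```

      In a real integration the scheduler would presumably call jobStarted/jobFinished around each job execution and publish the getters through the metrics library rather than hand-rolled counters.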

      Based on this, a Health-Check can monitor the 'oldest job running' metric and flag critical when, e.g., the oldest job has been running for more than 60,000 ms (the default, configurable).
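
      The threshold rule itself is simple enough to sketch in plain Java. The class OldestJobHealthCheck and its Status enum below are hypothetical and stand in for whatever the Sling Health Check API would provide; only the 60,000 ms default and the "older than threshold means critical" rule come from the description above.

```java
/** Hypothetical health check; a real one would use the Sling Health Check API. */
class OldestJobHealthCheck {
    enum Status { OK, CRITICAL }

    /** Default threshold taken from the issue description. */
    static final long DEFAULT_MAX_AGE_MILLIS = 60_000L;

    private final long maxAgeMillis;

    OldestJobHealthCheck() {
        this(DEFAULT_MAX_AGE_MILLIS);
    }

    /** Allows the threshold to be configured. */
    OldestJobHealthCheck(long maxAgeMillis) {
        if (maxAgeMillis <= 0) {
            throw new IllegalArgumentException("maxAgeMillis must be positive");
        }
        this.maxAgeMillis = maxAgeMillis;
    }

    /** CRITICAL when the oldest running job is older than the threshold. */
    Status check(long oldestRunningJobAgeMillis) {
        return oldestRunningJobAgeMillis > maxAgeMillis ? Status.CRITICAL : Status.OK;
    }
}
```

      The input would be the current value of the 'oldest job running' metric, so the check stays a pure function of that value and the configured threshold.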

      Attachments

        1. SLING-5965.patch (18 kB, Stefan Egli)
        2. SLING-5965.v2.patch.txt (10 kB, Stefan Egli)
        3. SLING-5965.v3.patch.txt (34 kB, Stefan Egli)
        4. SchedulerHealthCheck.jpg (91 kB, Stefan Egli)
        5. oldestRunningJob.jpg (102 kB, Stefan Egli)
        6. numRunningJobs.jpg (59 kB, Stefan Egli)
        7. timers.jpg (101 kB, Stefan Egli)
        8. SLING-5965.v4.patch.txt (35 kB, Stefan Egli)
        9. SLING-5965.v5.patch.txt (56 kB, Stefan Egli)
        10. patch.txt (20 kB, Carsten Ziegeler)

          People

            Assignee: Stefan Egli
            Reporter: Stefan Egli
            Votes: 0
            Watchers: 4