OFBiz / OFBIZ-3583

Resolve two issues with scheduled jobs related to clean-up

    Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: framework
    • Labels:
      None
    • Sprint:
      Bug Crush Event - 21/2/2015

      Description

      Encountered two problems:

      1) If a semaphore service (see purgeOldJobs) is executing when the application server goes down, reloadCrashedJobs takes over and marks the job as CRASHED. However, it does not clean up the ServiceSemaphore record, which causes all future jobs to either fail immediately or wait (until the semaphore timeout) and then fail.

      2) When ServiceUtil.purgeOldJobs is invoked, it blindly attempts to delete runtimeData and then rolls back if the delete fails (which it always does when other jobs reference the same runtimeData). This produces a service error log message for what is really typical behavior.

      Solutions:

      1) When reloading crashed jobs (on start-up), we look for a rogue ServiceSemaphore for the crashed job's service name and purge it. This works with multiple application servers because any crashed job would leave the semaphore behind, and no other application server running the JobManager could have created it (they would have been blocked from executing).

      2) In purgeOldJobs I changed the collection of runtimeDataIds from a List to a Set (this removes the redundant delete requests). Before attempting the delete, I do a "quick" count on the JobSandbox table to see whether any jobs still reference the particular RuntimeData instance, and only attempt the delete when no jobs remain. There is an existing index on JobSandbox for runtimeDataId, so this count should perform relatively quickly.
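
      The two solutions above can be sketched roughly as follows. This is a minimal illustration, not the attached patch: the class JobCleanupSketch, its in-memory maps and sets, and the plain "RUNNING" / "CRASHED" status strings are hypothetical stand-ins for the JobSandbox, RuntimeData and ServiceSemaphore entities, and no real Delegator calls are used. It also assumes every job still marked RUNNING at start-up belongs to the restarted instance.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model; none of this is the OFBiz entity API.
class JobCleanupSketch {

    // One row of the JobSandbox stand-in.
    static final class Job {
        final String jobId;
        final String serviceName;
        final String runtimeDataId;
        String status;

        Job(String jobId, String serviceName, String runtimeDataId, String status) {
            this.jobId = jobId;
            this.serviceName = serviceName;
            this.runtimeDataId = runtimeDataId;
            this.status = status;
        }
    }

    final Map<String, Job> jobSandbox = new HashMap<>();  // jobId -> job
    final Set<String> runtimeData = new HashSet<>();      // existing runtimeDataIds
    final Set<String> serviceSemaphore = new HashSet<>(); // service names holding a lock

    // Solution 1: mark jobs left RUNNING by a crash as CRASHED and purge
    // the semaphore their service left behind. Returns the number purged.
    int reloadCrashedJobs() {
        int purged = 0;
        for (Job job : jobSandbox.values()) {
            if ("RUNNING".equals(job.status)) {
                job.status = "CRASHED";
                if (serviceSemaphore.remove(job.serviceName)) {
                    purged++;
                }
            }
        }
        return purged;
    }

    // Count of jobs still pointing at a RuntimeData row (the "quick" count).
    long countJobsReferencing(String runtimeDataId) {
        return jobSandbox.values().stream()
                .filter(job -> runtimeDataId.equals(job.runtimeDataId))
                .count();
    }

    // Solution 2: collect runtimeDataIds in a Set so each is considered once,
    // then delete only those that no remaining job references.
    List<String> purgeOldJobs(List<String> jobIds) {
        Set<String> runtimeDataIds = new LinkedHashSet<>();
        for (String jobId : jobIds) {
            Job removed = jobSandbox.remove(jobId);
            if (removed != null && removed.runtimeDataId != null) {
                runtimeDataIds.add(removed.runtimeDataId);
            }
        }
        List<String> deleted = new ArrayList<>();
        for (String runtimeDataId : runtimeDataIds) {
            if (countJobsReferencing(runtimeDataId) == 0 && runtimeData.remove(runtimeDataId)) {
                deleted.add(runtimeDataId);
            }
        }
        return deleted;
    }
}
```

      In this sketch a RuntimeData row shared with a surviving job is simply skipped, so the purge never attempts a delete that is expected to fail and nothing needs to be rolled back or logged as an error.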

        Activity

        Jacques Le Roux added a comment (edited)

        Sorry, my question was not precise enough. Do you mean recently, after the overhaul done in this area, that is, in the trunk and R14.12?

        I ask because I also encountered such issues with R11.04, but have not yet had an opportunity to reproduce them.

        Nicolas Malin added a comment

        I confirm: I have already had some errors with the semaphore and JobSandbox, but without investigating the problem.

        Jacques Le Roux added a comment

        This patch no longer applies, but I wonder if, after the changes in this area, the ideas in this patch are still relevant.


          People

          • Assignee:
            Nicolas Malin
            Reporter:
            Bob Morley
          • Votes:
            0
            Watchers:
            3

            Dates

            • Created:
              Updated:
