  1. Apache Gobblin
  2. GOBBLIN-1692

Make GobblinHelixJobScheduler stop Helix workflow asynchronously

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: gobblin-cluster
    • Labels: None

    Description

      When a new job config is posted and handleUpdateJobConfigArrival is invoked, GobblinHelixJobScheduler first stops and deletes the old job, then tries to spin up the updated Helix workflow.
      The job scheduler tries to perform the stop synchronously, with a default timeout of 10 seconds. However, the stop consistently takes longer than that timeout on the Helix side, so the job state is not correctly updated to stopped. As a result, when GobblinHelixJobLauncher is constructed, the previous job is still in the wrong state because jobRunningMap has not been updated yet, and the new job is never launched. We therefore always see this log: Job {} will not be executed because other jobs are still running

      We can make the job deletion asynchronous and let the waitForJobCompletion method ensure that the job status is eventually updated correctly.
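
The proposal can be illustrated with a minimal, self-contained Java sketch. It is not Gobblin or Helix code: the class AsyncStopSketch and the methods stopAsync, updateJob and launchIfIdle are hypothetical names, and a CompletableFuture stands in for whatever mechanism (such as waitForJobCompletion) eventually confirms that the old job has stopped. Only the jobRunningMap field name is borrowed from the description above, and its type here is an assumption.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; none of these methods are Gobblin or Helix APIs.
class AsyncStopSketch {
    // Plays the role of jobRunningMap: job name -> whether the job is still running.
    final Map<String, Boolean> jobRunningMap = new ConcurrentHashMap<>();

    // Stand-in for a slow workflow stop that needs stopMillis to finish.
    CompletableFuture<Void> stopAsync(String jobName, long stopMillis) {
        return CompletableFuture.runAsync(() -> {
            try {
                TimeUnit.MILLISECONDS.sleep(stopMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("stop interrupted", e);
            }
            // Mark the job as stopped only once the stop has really finished.
            jobRunningMap.put(jobName, Boolean.FALSE);
        });
    }

    // Mirrors the launcher-side guard: refuse to start while the old job is running.
    // put() returns the previous value atomically, so a running job stays marked
    // as running and the launch is rejected.
    boolean launchIfIdle(String jobName) {
        return !Boolean.TRUE.equals(jobRunningMap.put(jobName, Boolean.TRUE));
    }

    // Request the stop without blocking on a fixed timeout, and launch the
    // updated job only after the stop has completed.
    CompletableFuture<Boolean> updateJob(String jobName, long stopMillis) {
        return stopAsync(jobName, stopMillis).thenApply(ignored -> launchIfIdle(jobName));
    }
}
```

In this sketch, calling launchIfIdle while the old job is still marked as running is rejected, which corresponds to the "will not be executed" log line. Calling updateJob instead defers the launch until the stop has been recorded, however long the stop takes.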

          People

            hutran Hung Tran
            hanghangliu Hanghang Liu
            Votes: 0
            Watchers: 1

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 20m