Uploaded image for project: 'Apache Airflow'
  1. Apache Airflow
  2. AIRFLOW-921 1.8.0rc Issues
  3. AIRFLOW-931

LocalExecutor fails to run queued task with race condition

    XMLWordPrintableJSON

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.8.0, 1.8.0rc4
    • Fix Version/s: 1.8.0
    • Component/s: None
    • Labels:
      None

      Description

      https://gist.github.com/vijaykramesh/707262c83429ab2a3f5ee701879813e3 provides a small example that consistently hits this problem with LocalExecutor.

      Basically when the dag run kicks off (with concurrency > 1) and a LocalExecutor with parallelism > 2 the scheduler marks more than concurrency tasks as queued (https://github.com/apache/incubator-airflow/blob/master/airflow/jobs.py#L1095)

      There is a second check before actually running the task (https://github.com/apache/incubator-airflow/blob/master/airflow/models.py#L1291) that leaves the task in the QUEUED state but then the scheduler never picks it back up. This causes the DAG to get stuck (as the queued tasks never run) until the scheduler is restarted (at which point the enqueued tasks are considered orphaned, the status is set to NONE, and then they are picked up by the scheduler again and run.

        Attachments

          Activity

            People

            • Assignee:
              bolke Bolke de Bruin
              Reporter:
              v_krishna Vijay Krishna Ramesh
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: