  1. Apache Airflow
  2. AIRFLOW-6934

max_active_runs from different dag in dagbag stopping any task from running



    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.10.7
    • Fix Version/s: None
    • Component/s: scheduler
    • Labels:


      I have one .py file that creates multiple DAG ids (it is a dynamic DAG generator, so 25 different DAG ids are created, including dagA and dagB). I have max_active_runs_per_dag = 5 in the .cfg. I then used the airflow CLI trigger_dag command to trigger dagA for 7 different execution dates in parallel and dagB for 4 different execution dates in parallel. In the UI, dagA showed red in the schedule column. There were tasks in the scheduled and queued states in both dagA and dagB, but no tasks in the running state (even over the last 3 hours). The scheduler was still up, though, and was running tasks from dagC (which is created from a different .py than the one that creates dagA and dagB). I noticed this message printed frequently in the scheduler logs: "Number of active dag runs reached max_active_run."
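
      For anyone trying to reproduce this, the trigger step can be sketched roughly as follows. This only builds the command lines and does not run them. The helper name trigger_commands and the start date are hypothetical; the 7 and 4 run counts come from the description above.

```python
from datetime import datetime, timedelta
from typing import List


def trigger_commands(dag_id: str, n_runs: int, start: datetime) -> List[List[str]]:
    """Build one `airflow trigger_dag` command per execution date (not executed here)."""
    return [
        ["airflow", "trigger_dag", "-e", (start + timedelta(days=i)).isoformat(), dag_id]
        for i in range(n_runs)
    ]


start = datetime(2020, 1, 1)  # hypothetical start date, chosen only for illustration
commands = trigger_commands("dagA", 7, start) + trigger_commands("dagB", 4, start)
for cmd in commands:
    print(" ".join(cmd))
```

      Each command could then be launched in parallel (for example with subprocess.Popen) against a test environment where max_active_runs_per_dag is set to 5.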

      From tracing the code, I think this is what happens:
      _process_file (https://github.com/apache/airflow/blob/1.10.7/airflow/jobs/scheduler_job.py#L1512-L1588) runs at the level of a .py file (so across many different DAG ids).
      It calls _process_dags.
      For each DAG id from that .py, _process_dags calls _process_task_instances.
      _process_task_instances has a counter (active_dag_runs) which is appended to for each DAG being iterated over. It breaks out of the loop (the loop which appends ids to a list) if the counter > max_active_runs_per_dag (from the .cfg). I couldn't see where task_instances_list gets used, though.
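
      The capped loop described above can be modelled with a small standalone sketch. This is a simplified model and not the Airflow source: FakeDagRun and select_active_runs are hypothetical names, the execution dates are made up, and the >= comparison is an assumption about the exact check.

```python
from dataclasses import dataclass
from typing import List

MAX_ACTIVE_RUNS = 5  # stands in for max_active_runs_per_dag from the .cfg


@dataclass
class FakeDagRun:
    """Hypothetical stand-in for a DAG run; not an Airflow class."""
    dag_id: str
    execution_date: str
    state: str = "running"


def select_active_runs(dag_runs: List[FakeDagRun], max_active_runs: int) -> List[FakeDagRun]:
    """Collect running DAG runs for one DAG, stopping once the cap is reached."""
    active_dag_runs: List[FakeDagRun] = []
    for run in dag_runs:
        if len(active_dag_runs) >= max_active_runs:  # assumed form of the cap check
            print("Number of active dag runs reached max_active_run.")
            break
        if run.state == "running":
            active_dag_runs.append(run)
    return active_dag_runs


# 7 runs for dagA and 4 for dagB, as in the report (dates are illustrative)
dag_a_runs = [FakeDagRun("dagA", f"2020-01-0{d}") for d in range(1, 8)]
dag_b_runs = [FakeDagRun("dagB", f"2020-01-0{d}") for d in range(1, 5)]

selected_a = select_active_runs(dag_a_runs, MAX_ACTIVE_RUNS)
selected_b = select_active_runs(dag_b_runs, MAX_ACTIVE_RUNS)
print(len(selected_a), len(selected_b))
```

      In this model the list is rebuilt for every DAG, so dagA is capped at 5 runs and logs the message while dagB keeps all 4. If the real scheduler behaves the same way, the cap alone would not account for dagB's tasks never reaching the running state.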

      I'm using LocalExecutor on v1.10.7.




            • Assignee:
              toopt4 t oo
            • Votes:
              0
            • Watchers:
              1


              • Created: