Apache Airflow / AIRFLOW-47

ExternalTaskSensor causes scheduling dead lock

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Abandoned
    • Affects Version/s: 1.7.0
    • Fix Version/s: None
    • Component/s: operators, scheduler
    • Labels:
      None
    • Environment:
      CentOS 6.5
      Airflow 1.7.0 with SequentialExecutor

      Description

      We are trying to use 'ExternalTaskSensor' to coordinate between a daily DAG and an hourly DAG (the daily DAG depends on the hourly one).
      Relevant code:

      Daily DAG definition:

      2_daily_dag.py
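      from datetime import datetime

      from airflow import DAG
      from airflow.operators.sensors import ExternalTaskSensor

      default_args = {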
          …
          'start_date': datetime(2016, 4, 2),
          …
      }
      dag = DAG(dag_id='2_daily_agg', default_args=default_args, schedule_interval="@daily")
      
      ext_dep = ExternalTaskSensor(
          external_dag_id='1_hourly_agg',
          external_task_id='print_hourly1',
          task_id='evening_hours_sensor',
          dag=dag)
      

      Hourly DAG definition:

      1_hourly_dag.py
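      from datetime import datetime

      from airflow import DAG
      from airflow.operators.bash_operator import BashOperator

      default_args = {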
          …
          'start_date': datetime(2016, 4, 1),
          …
      }
      dag = DAG(dag_id='1_hourly_agg', default_args=default_args, schedule_interval="@hourly")
      
      t1 = BashOperator(
          task_id='print_hourly1',
          bash_command='echo hourly job1',
          dag=dag)
      

      The hourly dag was executed twice for the following execution dates:
      04-01T00:00:00
      04-01T01:00:00

      Then the daily DAG was executed, and it is still running.
      According to the logs, the daily DAG is waiting for the hourly DAG to complete:

      [2016-05-04 06:01:20,978] {models.py:1041} INFO - Executing<Task(ExternalTaskSensor): evening_hours_sensor> on 2016-04-03 00:00:00
      [2016-05-04 06:01:20,984] {sensors.py:188} INFO - Poking for 1_hourly_agg.print_hourly1 on 2016-04-02 00:00:00 ... 
      [2016-05-04 06:02:21,053] {sensors.py:188} INFO - Poking for 1_hourly_agg.print_hourly1 on 2016-04-02 00:00:00 ...
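
      The log lines above can be read as plain date arithmetic. The following is a minimal sketch using only the standard library, not Airflow itself. It assumes that the sensor pokes for one specific execution date of the hourly task, and that an `execution_delta` argument, if one is passed to the sensor, is subtracted from the sensor's own execution date. The helper names `poked_execution_date` and `hourly_runs_behind` are hypothetical and exist only for illustration.

```python
from datetime import datetime, timedelta


def poked_execution_date(sensor_execution_date, execution_delta=None):
    """Hypothetical helper: the execution date the sensor would look for.

    Assumption: with no delta the sensor looks for the same execution date;
    with a delta it looks for (execution date - delta).
    """
    if execution_delta is None:
        return sensor_execution_date
    return sensor_execution_date - execution_delta


def hourly_runs_behind(latest_completed, target):
    """Hypothetical helper: hourly runs still missing before `target` exists."""
    return int((target - latest_completed) / timedelta(hours=1))


# Hourly runs reported as completed in this issue.
completed_hourly_runs = {
    datetime(2016, 4, 1, 0),
    datetime(2016, 4, 1, 1),
}

# Execution date the sensor is poking for, taken from the log lines.
poked = datetime(2016, 4, 2, 0)

# The poke cannot succeed yet: that hourly run has not happened.
is_satisfied = poked in completed_hourly_runs

# How far the hourly DAG still has to advance before the poke can succeed.
runs_behind = hourly_runs_behind(max(completed_hourly_runs), poked)

# Under the subtraction assumption, pointing a daily run at the 23:00 hourly
# run of the same day would need a negative delta of 23 hours.
last_hour = poked_execution_date(datetime(2016, 4, 2), timedelta(hours=-23))

print(is_satisfied, runs_behind, last_hour)
```

      Under these assumptions the sensor is waiting for an hourly run that is still 23 runs away. Because SequentialExecutor runs one task at a time, the hourly DAG may not be able to advance while the sensor keeps poking, which would explain the apparent deadlock.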
      

      How can I resolve this deadlock?

      In addition, I don't understand whether this means the daily DAG depends only on the "last" hourly run of the same day (23:00-24:00).
      What happens if the hourly run for some other hour fails?

      Thanks a lot!

        Attachments

        1. screenshot-1.png
          153 kB
          Hila Visan

              People

              • Assignee:
                hilaviz Hila Visan
                Reporter:
                hilaviz Hila Visan