Apache Airflow / AIRFLOW-3542

next_ds semantics broken for manually triggered runs

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.10.2
    • Fix Version/s: 1.10.0
    • Component/s: scheduler
    • Labels:
      None

      Description

      next_ds is useful when you need cron-style scheduling: a task that runs for date "X" uses that date in its logic, for example to send users an email saying that the run scheduled for date "X" has completed. However, it does not behave as expected for manually triggered runs, as the diagrams below illustrate.
       
      Using execution_date in a task
      Scheduled Run (works as expected)
      execution_date1                start_date1
      \/                             \/
       |-----------------------------|
      /\                             /\
       \_____________________________/
             scheduling_interval
       
      Manual Run (works as expected)

      triggered_date + execution_date + start_date
      \/
      |
       
      Using next_ds in a Task
      Scheduled Run (works as expected)
      next_ds1 + start_date1                          next_ds2 + start_date2
      \/                                              \/
       |----------------------------------------------|
      /\                                              /\
       \______________________________________________/
                     scheduling_interval
       
      Manual Run (broken: next_ds1 is expected to match triggered_date, just as execution_date does in the manually triggered run above)
      triggered_date                                  next_ds1 + start_date
      \/                                              \/
       |----------------------------------------------|
      /\                                              /\
       \______________________________________________/
             0 to scheduling_interval (depending on when the next execution date is)
       
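      The 0-to-scheduling_interval gap in the last diagram can be reproduced with a small model. This is only a sketch: it assumes a fixed timedelta interval aligned to an arbitrary anchor rather than a cron expression, and the following_schedule helper below is a hypothetical stand-in written for illustration, not Airflow's implementation.

```python
from datetime import datetime, timedelta


def following_schedule(dt, interval, anchor):
    """First interval-aligned instant strictly after dt (hypothetical stand-in)."""
    periods_elapsed = (dt - anchor) // interval  # timedelta // timedelta -> int
    return anchor + (periods_elapsed + 1) * interval


interval = timedelta(days=1)       # assumed daily schedule
anchor = datetime(2018, 1, 1)      # assumed alignment point of the schedule

# Scheduled run: execution_date is already aligned, so the next aligned
# date is exactly one interval later, which is also when the run starts.
scheduled_execution_date = datetime(2018, 12, 17)
scheduled_gap = (
    following_schedule(scheduled_execution_date, interval, anchor)
    - scheduled_execution_date
)

# Manual run: triggered_date is arbitrary, so the next aligned date lands
# anywhere between 0 and one interval after it.
triggered_date = datetime(2018, 12, 18, 15, 30)
manual_next = following_schedule(triggered_date, interval, anchor)
manual_gap = manual_next - triggered_date

print("scheduled gap:", scheduled_gap)
print("manual next_ds:", manual_next.strftime("%Y-%m-%d"), "gap:", manual_gap)
```

      In this model the manual run triggered on 2018-12-18 sees the following day as its next_ds, which is the mismatch described above.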
      Proposal
      Always set next_ds to execution_date for manually triggered runs, instead of to the next schedule-interval-aligned execution date.
       
      This might break backwards compatibility for some users, but it can be argued that the current behavior is a bug. If the old behavior is really desired, we could instead add new aliases that behave logically, although I am against this.
       
      prev_ds should probably also be made consistent with this logic.
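
      A minimal sketch of the proposed rule follows. It assumes a fixed timedelta interval instead of a cron schedule, and the names template_dates, aligned_neighbours and external_trigger are hypothetical, chosen for illustration only. Collapsing prev_ds to execution_date is one possible reading of "consistent with this logic", not something the proposal spells out.

```python
from datetime import datetime, timedelta

DS_FORMAT = "%Y-%m-%d"


def aligned_neighbours(execution_date, interval, anchor):
    """Return the aligned instants strictly before and strictly after execution_date."""
    periods_elapsed = (execution_date - anchor) // interval
    floor = anchor + periods_elapsed * interval
    # If execution_date is itself aligned, step back one full interval.
    previous = floor - interval if floor == execution_date else floor
    following = floor + interval
    return previous, following


def template_dates(execution_date, interval, anchor, external_trigger):
    """Compute prev_ds / next_ds under the proposed rule (illustrative only)."""
    if external_trigger:
        # Proposed: a manual run is not tied to a schedule interval,
        # so both values collapse to the run's own execution_date.
        prev_dt = next_dt = execution_date
    else:
        # Scheduled runs keep the interval-aligned semantics.
        prev_dt, next_dt = aligned_neighbours(execution_date, interval, anchor)
    return {
        "prev_ds": prev_dt.strftime(DS_FORMAT),
        "next_ds": next_dt.strftime(DS_FORMAT),
    }


daily = timedelta(days=1)
anchor = datetime(2018, 1, 1)  # assumed alignment point of the schedule

manual = template_dates(datetime(2018, 12, 18, 15, 30), daily, anchor, True)
scheduled = template_dates(datetime(2018, 12, 17), daily, anchor, False)
print(manual)
print(scheduled)
```

      Branching on a single flag keeps scheduled runs untouched, so only manually triggered runs would see a behavior change.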

            People

            • Assignee: Dan Davydov (aoen)
            • Reporter: Dan Davydov (aoen)
