  1. Apache Airflow
  2. AIRFLOW-5639

DagFileProcessor: DAG files are parsed every time, which consumes a lot of resources and is unnecessary


    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 1.10.5
    • Fix Version/s: None
    • Component/s: scheduler
    • Labels:
      None

      Description

       

      Code

      https://github.com/apache/airflow/blob/v1-10-stable/airflow/models/dagbag.py#L166-L170

      Problem description 

      self.file_last_changed doesn't work, because it is set to empty when the DagBag is initialized. So even if no changes were made to a file, the file is still imported. I confirmed this by printing logs.

      Generating DagBags from files takes about 50% of the DAG file processing time. If DagBags were generated only when files have changed, a lot of resources would be saved.
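
      The behavior described above can be illustrated with a minimal sketch. This is not Airflow's code: SketchBag, count_parses, and demo are hypothetical names, and the skip check only imitates the kind of modification-time comparison the linked lines perform. The assumption is that a new container is created for every processing run, so its cache always starts empty.

```python
import os
import tempfile
from datetime import datetime


class SketchBag:
    """Hypothetical stand-in for a DAG container; NOT Airflow's DagBag."""

    def __init__(self, file_last_changed=None):
        # Assumption: with no cache passed in, each instance starts with an
        # empty dict, which is the situation the report describes.
        self.file_last_changed = {} if file_last_changed is None else file_last_changed

    def process_file(self, filepath, only_if_updated=True):
        """Return True if the file was (re)parsed, False if it was skipped."""
        mtime = datetime.fromtimestamp(os.path.getmtime(filepath))
        if (only_if_updated
                and filepath in self.file_last_changed
                and mtime == self.file_last_changed[filepath]):
            return False  # unchanged since the last recorded parse: skip
        # A real implementation would import the file here.
        self.file_last_changed[filepath] = mtime
        return True


def count_parses(filepath, runs, shared_cache=None):
    """Create a new SketchBag per run and count how often the file is parsed."""
    parsed = 0
    for _ in range(runs):
        bag = SketchBag(file_last_changed=shared_cache)
        if bag.process_file(filepath):
            parsed += 1
    return parsed


def demo():
    """Compare a per-instance cache with a cache shared across instances."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write("x = 1\n")
        path = f.name
    try:
        fresh = count_parses(path, 3)                    # cache reset each run
        shared = count_parses(path, 3, shared_cache={})  # cache survives runs
    finally:
        os.unlink(path)
    return fresh, shared
```

      In this sketch, the file is parsed on every run when each instance gets its own empty cache, and only once when the same dict is reused across instances. Whether sharing such a cache is appropriate for the real scheduler is a separate design question that the sketch does not answer.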

            People

            • Assignee:
              Unassigned
              Reporter:
              chiven chen xianxin
            • Votes:
              0
              Watchers:
              1

              Dates

              • Created:
                Updated:
                Resolved: