Uploaded image for project: 'Apache Airflow'
  1. Apache Airflow
  2. AIRFLOW-3405

Task instance fail intermittently due to MySQL error

    XMLWordPrintableJSON

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: database
    • Environment:
      MySQL, Redhat Linux

      Description

      Dags are getting failed intermittently due to below error. 

      OperationalError: (_mysql_exceptions.OperationalError) (1040, 'Too many connections')

      [2018-11-25 12:24:16,952] - Heartbeat time limited exceeded!

      We have max_connections defined as 2000 in DB. 

      Below are the setting in cfg.

      sql_alchemy_pool_size = 1980

      sql_alchemy_pool_recycle = 3600

      As per DBA, The airflow scheduler keeps opening connections to the database, these connections are mostly idle, they get reset whenever the scheduler restarts but with max_connections at 2000 and scheduler holding on to 1600 of these, other apps trying to connect might start running out of connections.

      How do we remediate these idle connections. What should be the optimal value for these configs and max_connections that to be set at DB. Consider we need to build a large environment serving 500+ definitions with 10000+ runs per day. Need suggestions...  

       

        Attachments

          Activity

            People

            • Assignee:
              Unassigned
              Reporter:
              connect2uv Yuva
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated: