Dags are getting failed intermittently due to below error.
OperationalError: (_mysql_exceptions.OperationalError) (1040, 'Too many connections')
[2018-11-25 12:24:16,952] - Heartbeat time limited exceeded!
We have max_connections defined as 2000 in DB.
Below are the setting in cfg.
sql_alchemy_pool_size = 1980
sql_alchemy_pool_recycle = 3600
As per DBA, The airflow scheduler keeps opening connections to the database, these connections are mostly idle, they get reset whenever the scheduler restarts but with max_connections at 2000 and scheduler holding on to 1600 of these, other apps trying to connect might start running out of connections.
How do we remediate these idle connections. What should be the optimal value for these configs and max_connections that to be set at DB. Consider we need to build a large environment serving 500+ definitions with 10000+ runs per day. Need suggestions...