Parsing of the connection URL (from AIRFLOW_CONN_ environment variables) does not fully URL-decode the URL. It only handles a hard-coded %2f, to support "/" in the hostname. However, there are valid cases where the hostname, login, password, and query parameters contain URL-encoded values. For example, with cloud-sql-proxy the generated socket path contains ":" (for example /cloudsql/myProject:us-central1:myInstance).
We need to URL-encode ":" because otherwise urlparse treats the ":" as the separator before the port number. Similarly, the user and password can contain URL-encoded characters.
I think we should fully URL-decode all relevant URL fields (query parameters, user, password, hostname, path). However, this is a potentially breaking change (if someone has a user, password, or hostname containing a literal "%"), so maybe we should compromise (for example, not decode the password, which is the field most likely to contain "%" characters), although that would violate the URL encoding/decoding specification.
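As a rough sketch of what full decoding could look like (not the fix itself): `parse_conn_uri` and the keys of the dictionary it returns are hypothetical names chosen for this example, and bracketed IPv6 hosts are deliberately left out.

```python
from urllib.parse import parse_qsl, unquote, urlparse


def parse_conn_uri(uri):
    """Hypothetical helper: split a connection URI and URL-decode each field."""
    parts = urlparse(uri)
    # Work on .netloc directly because .hostname lower-cases the host.
    userinfo, _, hostinfo = parts.netloc.rpartition("@")
    login, _, password = userinfo.partition(":")
    # An encoded ":" (%3A) survives this split; only a literal ":" marks a port.
    # Bracketed IPv6 literals are not handled in this sketch.
    host, _, port = hostinfo.partition(":")
    return {
        "conn_type": parts.scheme,
        "login": unquote(login) or None,
        "password": unquote(password) or None,
        "host": unquote(host) or None,
        "port": int(port) if port else None,
        "schema": unquote(parts.path[1:]) or None,
        # parse_qsl percent-decodes keys and values (and turns "+" into " ").
        "extra": dict(parse_qsl(parts.query)),
    }


conn = parse_conn_uri(
    "mysql://us%40er:p%25ss@%2Fcloudsql%2FmyProject%3Aus-central1%3AmyInstance"
    ":3306/mydb?ssl_ca=%2Ftmp%2Fca.pem"
)
print(conn)
```

The breaking-change risk shows up with a password stored as the literal text `ab%41c`: under full decoding it would be read as `abAc`, so existing connections with an unencoded "%" in a field could change meaning.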
I will provide a proposed fix shortly.