SQL configurations must be propagated to the executors in order to take effect.
Unfortunately, in some cases we fail to propagate them, so they have no effect.
The problem occurs whenever rdd or queryExecution.toRdd is used, which is quite frequent in the codebase.
Please note that there are two parts to this issue:
- when a user directly uses those APIs
- when Spark invokes them internally (e.g. throughout MLlib and in other places, such as the describe method of the Dataset class)
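
The failure mode described above can be illustrated with a toy model. This is a minimal sketch in plain Python, not Spark code: a thread-local dictionary stands in for the session's SQL configuration, and a thread pool stands in for the executors. All names here (get_conf, with_conf, run, the caseSensitive key) are hypothetical and chosen only for illustration. A setting made on the "driver" thread is invisible to tasks unless it is explicitly captured and re-installed where the task runs.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Toy stand-in for session-scoped SQL configuration (hypothetical, not a Spark API).
_local = threading.local()
DEFAULTS = {"caseSensitive": "false"}


def get_conf(key):
    """Return the value visible on the current thread, falling back to the default."""
    return getattr(_local, "conf", DEFAULTS).get(key, DEFAULTS[key])


def task(_row):
    # Runs on a worker thread, which plays the role of an executor.
    return get_conf("caseSensitive")


def with_conf(conf, fn):
    """Wrap fn so the captured conf is installed on whichever thread runs it."""
    def wrapped(row):
        previous = getattr(_local, "conf", None)
        _local.conf = conf
        try:
            return fn(row)
        finally:
            # Restore the worker thread's previous state so pooled threads stay clean.
            if previous is None:
                del _local.conf
            else:
                _local.conf = previous
    return wrapped


def run(fn, rows):
    """Execute fn over rows on a small pool of worker threads."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        return list(pool.map(fn, rows))


# The "driver" overrides a setting on its own thread only.
_local.conf = {"caseSensitive": "true"}

# Without propagation, the tasks fall back to the default value.
not_propagated = run(task, [1, 2, 3])

# With the conf captured on the driver and re-installed around each task,
# the tasks observe the overridden value.
propagated = run(with_conf(dict(_local.conf), task), [1, 2, 3])
```

In this model, the first run corresponds to a code path that goes through rdd or queryExecution.toRdd without carrying the configuration along, and the second run corresponds to a code path that propagates it. How Spark actually transports the values to executors is not modelled here.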