Currently, when converting from Pandas to Arrow for Pandas UDF return values or in createDataFrame(), PySpark catches all ArrowExceptions and displays info on how to disable the safe conversion config. This is displayed with the original error as a tuple.
The problem is that this is meant mainly for things like float truncation or overflow, but it will also show if the user has an invalid schema with incompatible types. The extra information is confusing in that case and the real error is buried.
This could be improved by only printing the extra info on how to disable safe checking if the config is actually set, and by using exception chaining to better show the original error. Also, any safe-check failure would be a ValueError, of which ArrowInvalid is a subclass, so the catch could be made narrower.
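A minimal sketch of the proposed handling, assuming a local `ArrowInvalid` stand-in for pyarrow's `ArrowInvalid` (which subclasses ValueError) so the example is self-contained; `convert_with_hint` and `safe_check_enabled` are hypothetical names, and the config key shown in the message is illustrative:

```python
class ArrowInvalid(ValueError):
    """Stand-in for pyarrow's ArrowInvalid, a ValueError subclass."""


def convert_with_hint(convert, safe_check_enabled):
    try:
        return convert()
    except ArrowInvalid as e:
        if safe_check_enabled:
            # Only mention the config when safe conversion is actually
            # enabled, and chain with "from e" so the original error
            # stays visible in the traceback instead of being buried.
            raise RuntimeError(
                "Conversion failed with safe check enabled; disabling "
                "spark.sql.execution.pandas.convertToArrowArraySafely "
                "may allow the conversion, at the risk of truncation."
            ) from e
        # Config not set: let the original error propagate untouched.
        raise
```

Catching only the ValueError subclass (rather than all ArrowExceptions) means schema-mismatch errors propagate with their original message, while `raise ... from e` keeps the truncation/overflow cause attached when the hint is added.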