I'm not sure if I'm misinterpreting something here, but this looks like a bug in how UDFs work, or in how they interface with the optimizer.
Here's a basic reproduction. I'm using length_udf() just for illustration; it could be any UDF that accesses fields that have been aliased.
When I run this I get a long stack trace, but the relevant portions seem to be:
Here are the relevant execution plans:
From the second execution plan, it looks like BatchEvalPython somehow gets the unaliased column names, whereas the Project right above it gets the aliased names.
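To make the suspected mismatch concrete, here is a toy model in plain Python. It is not Spark code and does not reflect how Catalyst is actually implemented; the `Node` class, the `check_plan` helper, and the column names `name` and `a_name` are all hypothetical, chosen only to illustrate what "one operator sees unaliased names while its neighbor sees aliased ones" would mean for name resolution.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Node:
    """A toy plan operator (hypothetical, not a Spark class)."""
    name: str
    references: List[str]  # columns this operator reads from its child
    output: List[str]      # columns this operator exposes to its parent


def check_plan(plan: List[Node]) -> List[Tuple[str, List[str]]]:
    """Walk a plan listed top-down (parent first) and report every
    operator that references a column its child does not expose."""
    problems = []
    for parent, child in zip(plan, plan[1:]):
        missing = [c for c in parent.references if c not in child.output]
        if missing:
            problems.append((parent.name, missing))
    return problems


# Hypothetical plan: the scan exposes only the aliased column `a_name`,
# the Project correctly asks for `a_name`, but the UDF operator still
# asks for the unaliased `name`.
plan = [
    Node("Project", references=["a_name", "udf_result"], output=["a_name", "udf_result"]),
    Node("BatchEvalPython", references=["name"], output=["a_name", "udf_result"]),
    Node("AliasedScan", references=[], output=["a_name"]),
]

problems = check_plan(plan)
print(problems)
```

In this toy version only the UDF operator is flagged, because it is the only node asking for a name that no longer exists below it. Whether that is what really happens inside the optimizer is exactly what I'm unsure about.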