[SPARK-38981] Unexpected commutative property of udf/pandas_udf and filters - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 3.2.1
Fix Version/s: None
Component/s: Optimizer, PySpark
Labels: None

Language:
- Python

Description

Hello all,
When running the attached minimal working example, the optimizer swaps the order of the filter and the UDF, so the UDF can be evaluated on rows the filter was supposed to exclude. This can lead to errors that are difficult to debug. I have found no reference to this behavior in the documentation.
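The attached notebook is not reproduced here. As a rough illustration of the kind of pipeline affected, the sketch below assumes a UDF that fails on non-positive input and a filter meant to remove such rows first. The names `safe_log`, `unsafe_log`, and `build_plans` are hypothetical and not taken from the attachment. Whether the fragile variant actually fails depends on the plan Spark produces, which can be inspected with `.explain()`.

```python
import math


def unsafe_log(value):
    """Hypothetical UDF body: raises ValueError for value <= 0."""
    return math.log(value)


def safe_log(value):
    """Hypothetical UDF body with the filter condition repeated inside.

    Returns None for inputs the filter was meant to exclude, so the
    result does not depend on whether the filter runs first.
    """
    if value is None or value <= 0:
        return None
    return math.log(value)


def build_plans(spark):
    """Build both variants on a tiny DataFrame (requires pyspark)."""
    # Imported lazily so the helpers above can be used without Spark.
    from pyspark.sql import functions as F
    from pyspark.sql.types import DoubleType

    df = spark.createDataFrame([(1.0,), (0.0,), (4.0,)], ["x"])

    # Fragile: assumes x > 0 is applied before the UDF is evaluated.
    # If the optimizer merges the two filters, the UDF may see x = 0.0.
    fragile = (
        df.filter(F.col("x") > 0)
        .withColumn("y", F.udf(unsafe_log, DoubleType())("x"))
        .filter(F.col("y") > 1)
    )

    # Robust: the guard lives inside the UDF, so order does not matter.
    robust = (
        df.filter(F.col("x") > 0)
        .withColumn("y", F.udf(safe_log, DoubleType())("x"))
        .filter(F.col("y") > 1)
    )
    return fragile, robust
```

Calling `fragile.explain()` and `robust.explain()` on the returned DataFrames shows where the UDF evaluation sits relative to the filter in the physical plan.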

Is this a bug, or is it intended behavior that is poorly documented?

With kind regards,
Max

Attachments

- optimization_udf_filter.html (615 kB, 21/Apr/22 10:01, Maximilian Sackel)
- screenshot-1.png (95 kB, 21/Apr/22 09:49, Maximilian Sackel)
- screenshot-2.png (92 kB, 21/Apr/22 09:50, Maximilian Sackel)

People

Assignee: Unassigned
Reporter: Maximilian Sackel
Votes: 0
Watchers: 2

Dates

Created: 21/Apr/22 09:49
Updated: 12/Dec/22 18:11