Details
Type: Improvement
Status: Open
Priority: Minor
Resolution: Unresolved
None
Description
Currently, when a Python UDF is called N times, it is constructed N times. This can hurt performance when the UDF loads large resources in its open() method, which is common in machine learning use cases.
I propose we optimize this at the PyFlink framework level so that, no matter how many times a UDF is called in the execution environment, it is initialized only once.
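
To make the idea concrete, here is a minimal sketch in plain Python of what "initialized only once" could look like. It is not PyFlink code and does not reflect how PyFlink manages UDFs internally: `ModelUdf`, `get_or_create_udf`, `_udf_cache`, and the `"model.bin"` path are all hypothetical names chosen for illustration. The sketch assumes that two calls refer to the same UDF when the class and constructor arguments match.

```python
# Illustrative sketch only: plain Python, not PyFlink internals.
# All names here (ModelUdf, get_or_create_udf, _udf_cache) are hypothetical.


class ModelUdf:
    """Stand-in for a Python UDF whose open() loads a large resource."""

    open_calls = 0  # counts how often the expensive open() actually runs

    def __init__(self, model_path):
        self.model_path = model_path
        self.model = None

    def open(self):
        # Placeholder for an expensive load, e.g. reading a large model file.
        ModelUdf.open_calls += 1
        self.model = {"scale": 0.5}

    def eval(self, x):
        return x * self.model["scale"]


# Cache of opened UDF instances, keyed by class and constructor arguments.
_udf_cache = {}


def get_or_create_udf(udf_class, *args):
    """Return a shared, already-opened instance for (udf_class, args)."""
    key = (udf_class, args)
    udf = _udf_cache.get(key)
    if udf is None:
        udf = udf_class(*args)
        udf.open()  # runs once per distinct key
        _udf_cache[key] = udf
    return udf


# Three "calls" to the same UDF share one instance and one open().
results = [get_or_create_udf(ModelUdf, "model.bin").eval(v) for v in (2, 4, 6)]
```

After the three calls, `ModelUdf.open_calls` is 1 rather than 3. A real implementation would also need to decide what identifies "the same UDF" and when cached instances are closed, which this sketch leaves out.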