Description
After SPARK-19161, callable objects can no longer be used as UDFs in Python, as shown below:
>>> from pyspark.sql import functions
>>> class F(object):
...     def __call__(self, x):
...         return x
...
>>> foo = F()
>>> foo(1)
1
>>> udf = functions.udf(foo)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../spark/python/pyspark/sql/functions.py", line 2142, in udf
    return _udf(f=f, returnType=returnType)
  File ".../spark/python/pyspark/sql/functions.py", line 2133, in _udf
    return udf_obj._wrapped()
  File ".../spark/python/pyspark/sql/functions.py", line 2090, in _wrapped
    @functools.wraps(self.func)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/functools.py", line 33, in update_wrapper
    setattr(wrapper, attr, getattr(wrapped, attr))
AttributeError: F instance has no attribute '__name__'
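The root cause is that functools.wraps tries to copy attributes such as __name__ from the wrapped object, and an instance of a callable class has no __name__ of its own (the name lives on the class, not the instance), so Python 2's update_wrapper raises AttributeError. A minimal sketch of one possible defensive fix (not necessarily the patch actually merged) is to only copy the attributes the callable actually has; the helper name `wrapped` here is hypothetical:

import functools

class F(object):
    """A callable object; its instances have no __name__ attribute."""
    def __call__(self, x):
        return x

def wrapped(f):
    # Hypothetical sketch: restrict functools.wraps to the attributes that
    # exist on f, so callable instances (which lack __name__) work too.
    assignments = tuple(
        a for a in functools.WRAPPER_ASSIGNMENTS if hasattr(f, a))

    @functools.wraps(f, assigned=assignments)
    def wrapper(*args, **kwargs):
        return f(*args, **kwargs)

    return wrapper

foo = F()
print(hasattr(foo, "__name__"))  # False: the instance has no __name__
print(wrapped(foo)(1))           # the wrapper now works for callables

This keeps the normal behavior for plain functions (their __name__, __doc__, etc. are still copied) while no longer assuming every callable carries those attributes.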
Note that this works in Spark 2.1 as below:
>>> from pyspark.sql import functions
>>> class F(object):
...     def __call__(self, x):
...         return x
...
>>> foo = F()
>>> foo(1)
1
>>> udf = functions.udf(foo)
>>> spark.range(1).select(udf("id")).show()
+-----+
|F(id)|
+-----+
|    0|
+-----+