PySpark is able to load numpy functions, but not scipy.special functions. For example, take this snippet:
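(The original snippet did not survive; the following is a minimal reconstruction consistent with the description, assuming an existing SparkContext `sc` and using `np.log` as a stand-in for the numpy function actually used.)

```python
import numpy as np
from scipy.special import gammaln

a = sc.parallelize([1.0, 2.0, 3.0])  # assumes an existing SparkContext sc
c = a.map(np.log)      # numpy function: pickles and ships to workers fine
d = a.map(gammaln)     # scipy.special function: fails at collect()
```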
Calling `c.collect()` returns the expected result. However, calling `d.collect()` fails with an error raised from `_getobject` in the `cloudpickle.py` module.
The reason is that `_getobject` executes `__import__(modname)`, which loads only the top-level package `X` when `modname` has the form `X.Y`. The lookup then fails because `gammaln` is not an attribute of the top-level `scipy` package. The fix (for which I will shortly submit a PR) is to pass `fromlist=[attribute]` to the `__import__` call, which makes it return the innermost module instead.
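A quick interpreter session illustrates the `__import__` behavior at fault:

```python
>>> mod = __import__('scipy.special')   # no fromlist: returns the top-level package
>>> mod.__name__
'scipy'
>>> hasattr(mod, 'gammaln')             # gammaln lives in scipy.special, not scipy
False
>>> mod = __import__('scipy.special', fromlist=['gammaln'])
>>> mod.__name__                        # with a non-empty fromlist, the submodule is returned
'scipy.special'
>>> hasattr(mod, 'gammaln')
True
```

With that, the change to `_getobject` is a one-liner. The sketch below paraphrases the helper rather than quoting the exact Spark-vendored source:

```python
def _getobject(modname, attribute):
    # fromlist=[attribute] makes __import__ return the innermost module
    # (e.g. scipy.special) rather than the top-level package (scipy),
    # so the attribute lookup below also succeeds for nested modules.
    mod = __import__(modname, fromlist=[attribute])
    return mod.__dict__[attribute]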