Description
Currently, the cache manager does not use the cached plan when a UDF is created again, even if the underlying function is the same.
>>> func = lambda x: x
>>> df = spark.range(1)
>>> df.select(udf(func)("id")).cache()
>>> df.select(udf(func)("id")).explain()
== Physical Plan ==
*(2) Project [pythonUDF0#14 AS <lambda>(id)#12]
+- BatchEvalPython [<lambda>(id#0L)], [pythonUDF0#14]
   +- *(1) Range (0, 1, step=1, splits=12)
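The second `explain()` shows a fresh `BatchEvalPython` plan rather than an `InMemoryTableScan`, i.e. the cached plan was not matched. A minimal analogy of the underlying equality problem, using hypothetical wrapper classes (not Spark's actual implementation): each call to `udf(func)` produces a new wrapper object, and if wrappers compare by object identity, two plans built from separately created wrappers never look equal, even though they wrap the same function. Comparing by the wrapped function instead lets the lookup match.

```python
class IdentityUDF:
    """Hypothetical wrapper that keeps Python's default identity-based
    equality, like a freshly constructed UDF object."""
    def __init__(self, func):
        self.func = func


class ValueUDF:
    """Hypothetical wrapper that compares by the function it wraps, so two
    wrappers around the same function are considered equal and a cached
    plan built from one can be matched against the other."""
    def __init__(self, func):
        self.func = func

    def __eq__(self, other):
        return isinstance(other, ValueUDF) and self.func == other.func

    def __hash__(self):
        return hash(self.func)


func = lambda x: x

# Two separately created identity-based wrappers never compare equal,
# so a plan lookup keyed on them misses the cache.
print(IdentityUDF(func) == IdentityUDF(func))  # False

# Value-based wrappers around the same function do compare equal.
print(ValueUDF(func) == ValueUDF(func))  # True
```

This is only a sketch of why re-creating the UDF defeats cache lookup; Spark's real matching goes through plan canonicalization rather than direct wrapper comparison.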