SPARK-22939

Support Spark UDF in registerFunction

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 2.3.0
    • Component/s: PySpark, SQL
    • Labels:
      None

      Description

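      import random
      from pyspark.sql.functions import udf
      from pyspark.sql.types import IntegerType, StringType

      # `spark` is assumed to be an existing SparkSession (for example, the one from the pyspark shell).
      # Wrap a Python lambda as a nondeterministic UDF that returns an integer.
      random_udf = udf(lambda: int(random.random() * 100), IntegerType()).asNondeterministic()
      # Pass the UDF object itself, not a plain Python function, to registerFunction.
      spark.catalog.registerFunction("random_udf", random_udf, StringType())
      spark.sql("SELECT random_udf()").collect()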
      

      Passing an existing Spark UDF to registerFunction, as in the snippet above, fails with the following error:

      Py4JError: An error occurred while calling o29.__getnewargs__. Trace:
      py4j.Py4JException: Method __getnewargs__([]) does not exist
      	at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:318)
      	at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:326)
      	at py4j.Gateway.invoke(Gateway.java:274)
      	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
      	at py4j.commands.CallCommand.execute(CallCommand.java:79)
      	at py4j.GatewayConnection.run(GatewayConnection.java:214)
      	at java.lang.Thread.run(Thread.java:745)
      

            People

            • Assignee: Xiao Li (smilegator)
            • Reporter: Xiao Li (smilegator)