SPARK-22939

Support Spark UDF in registerFunction

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 2.3.0
    • Component/s: PySpark, SQL
    • Labels: None

    Description

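      import random
      from pyspark.sql import SparkSession
      from pyspark.sql.functions import udf
      from pyspark.sql.types import IntegerType, StringType

      # In the pyspark shell `spark` already exists; getOrCreate() reuses it.
      spark = SparkSession.builder.getOrCreate()
      # A nondeterministic UDF object, not a plain Python function.
      random_udf = udf(lambda: int(random.random() * 100), IntegerType()).asNondeterministic()
      # Passing the UDF object (instead of a plain function) is the case this issue covers.
      spark.catalog.registerFunction("random_udf", random_udf, StringType())
      spark.sql("SELECT random_udf()").collect()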
      

      Passing a UDF object (rather than a plain Python function) to registerFunction, as above, produces the following error.

      Py4JError: An error occurred while calling o29.__getnewargs__. Trace:
      py4j.Py4JException: Method __getnewargs__([]) does not exist
      	at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:318)
      	at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:326)
      	at py4j.Gateway.invoke(Gateway.java:274)
      	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
      	at py4j.commands.CallCommand.execute(CallCommand.java:79)
      	at py4j.GatewayConnection.run(GatewayConnection.java:214)
      	at java.lang.Thread.run(Thread.java:745)
      

          People

            Assignee: Xiao Li (smilegator)
            Reporter: Xiao Li (smilegator)