  1. Spark
  2. SPARK-40307 Introduce Arrow Python UDFs
  3. SPARK-48667

Arrow Python UDFs do not support UDT as the output type

Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.5.1, 3.4.3
    • Fix Version/s: None
    • Component/s: PySpark
    • Labels: None

    Description

      df.select(udf(lambda x: x, returnType=ExamplePointUDT(), useArrow=useArrow)("point"))

      With useArrow=True, this fails with:

       java.lang.AssertionError: assertion failed: Invalid schema from pandas_udf: expected org.apache.spark.sql.test.ExamplePointUDT@49ccc723, StructType(StructField(st,StructType(StructField(tt,TimestampType,true)),true)), got ArrayType(DoubleType,false)
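
      To make the mismatch in the assertion concrete, here is a minimal sketch in plain Python with no Spark dependency. PointUDT and ArrayOfDouble are hypothetical stand-ins for ExamplePointUDT and ArrayType(DoubleType,false), and both check functions are illustrative assumptions. This is not Spark's implementation and not a proposed patch. It only shows why comparing a declared UDT directly against the storage type it serializes to reports a mismatch, while a comparison that first unwraps the UDT would not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayOfDouble:
    """Hypothetical stand-in for ArrayType(DoubleType, containsNull)."""
    contains_null: bool = False


@dataclass(frozen=True)
class PointUDT:
    """Hypothetical stand-in for a UDT such as ExamplePointUDT."""

    def sql_type(self):
        # Storage type that values of this UDT are serialized to.
        return ArrayOfDouble(contains_null=False)

    def serialize(self, point):
        # A point (x, y) is stored as a two-element list of doubles.
        return [point[0], point[1]]


def storage_type(data_type):
    """Unwrap a UDT to its storage type; leave other types unchanged."""
    if isinstance(data_type, PointUDT):
        return data_type.sql_type()
    return data_type


def strict_check(expected, actual):
    # Models a check that compares the declared type to the produced type as-is.
    return expected == actual


def udt_aware_check(expected, actual):
    # Models a check that compares storage types on both sides.
    return storage_type(expected) == storage_type(actual)


declared = PointUDT()                           # returnType given to the UDF
produced = ArrayOfDouble(contains_null=False)   # type reported by the Arrow path

print(strict_check(declared, produced))     # False: UDT vs. its storage type
print(udt_aware_check(declared, produced))  # True: storage types match
```

      In this model, the strict comparison corresponds to the "expected ...ExamplePointUDT..., got ArrayType(DoubleType,false)" failure above. Whether the actual fix belongs in the schema check or in the Python-side conversion is left open here.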
       

          People

            Assignee: Unassigned
            Reporter: angerszhu