  1. Spark
  2. SPARK-30127

UDF should work for case class like Dataset operations

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.0.0
    • Component/s: SQL
    • Labels: None

    Description

      Currently, a Spark UDF can only operate on data types such as java.lang.String, org.apache.spark.sql.Row, and Seq[_]. This is inconvenient when you want to apply an operation to a single column of struct type: you must read the data from a Row object rather than from your domain object, as you can in Dataset operations. It would be great if UDFs could work on the types that Dataset supports, e.g. case classes.
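
To make the limitation concrete, here is a minimal sketch that contrasts a Dataset operation with a Row-based UDF on a struct column. The `Person` case class, the `RowUdfSketch` object, and the local SparkSession settings are hypothetical and chosen only for illustration; they do not come from the issue.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.{col, struct, udf}

// Hypothetical domain type, used only for this illustration.
case class Person(name: String, age: Int)

object RowUdfSketch {
  // Returns (flags computed by a Row-based UDF, flags computed by Dataset.map).
  def adultFlags(): (Seq[Boolean], Seq[Boolean]) = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("row-udf-sketch")
      .getOrCreate()
    import spark.implicits._
    try {
      val people = Seq(Person("Alice", 30), Person("Bob", 17)).toDS()

      // Dataset operation: the lambda receives the domain object directly.
      val viaDataset = people.map(p => p.age >= 18).collect().toSeq

      // UDF on a struct column: the value arrives as a Row,
      // so the field has to be looked up by name and cast by hand.
      val isAdult = udf((r: Row) => r.getAs[Int]("age") >= 18)
      val viaUdf = people
        .select(struct(col("name"), col("age")).as("person"))
        .select(isAdult(col("person")).as("adult"))
        .as[Boolean]
        .collect()
        .toSeq

      (viaUdf, viaDataset)
    } finally {
      spark.stop()
    }
  }
}
```

Both paths compute the same flags, but only the Dataset path is checked against the fields of `Person` at compile time. A misspelled field name in the `getAs` call would fail only at runtime.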

      Note that there are multiple ways to register a UDF, and this feature can only be supported when the UDF is registered through a Scala API that provides a type tag, e.g. `def udf[RT: TypeTag, A1: TypeTag](f: Function1[A1, RT])`.
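
As a rough sketch of what the requested behavior could look like through the typed Scala API, the example below passes a case class parameter to `functions.udf`. It assumes a Spark version that includes the fix for this issue; the `Point` case class and the `CaseClassUdfSketch` object are hypothetical names introduced for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, struct, udf}

// Hypothetical domain type, not part of the issue.
case class Point(x: Int, y: Int)

object CaseClassUdfSketch {
  def sums(): Seq[Int] = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("case-class-udf-sketch")
      .getOrCreate()
    import spark.implicits._
    try {
      // Struct column whose field names match the fields of Point.
      val df = Seq((1, 2), (3, 4)).toDF("x", "y")
        .select(struct(col("x"), col("y")).as("p"))

      // Typed Scala API: the TypeTag for Point is captured at this call site,
      // which is the information Spark needs to map the struct to the case class.
      // Assumption: the running Spark version contains the fix for this issue.
      val sum = udf((p: Point) => p.x + p.y)

      df.select(sum(col("p")).as("s")).as[Int].collect().toSeq
    } finally {
      spark.stop()
    }
  }
}
```

Registration paths that do not carry a type tag for the argument give Spark no static description of the parameter type, which is why the description limits the feature to the typed Scala API.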

            People

              Assignee: wuyi (Ngone51)
              Reporter: Wenchen Fan (cloud_fan)
              Votes: 6
              Watchers: 4
