[SPARK-30127] UDF should work for case class like Dataset operations - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 3.0.0
Fix Version/s: 3.1.0
Component/s: SQL
Labels: None

Description

Currently, a Spark UDF can only work on data types such as java.lang.String, o.a.s.sql.Row, Seq[_], etc. This is inconvenient when you want to apply an operation to a single column of struct type: you must read the data from a Row object instead of working with your domain object, as you can in Dataset operations. It would be great if UDFs could work on the types that Dataset supports, e.g. case classes.
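To make the difference concrete, here is a minimal sketch that applies the same logic to a struct column twice: once through a Row, once through a case class. It assumes a Spark release that already includes this change and a local session. `Person`, `CaseClassUdfSketch`, and the column names are hypothetical and do not come from the issue.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.{col, struct, udf}

// Hypothetical domain type; its field names match the struct fields built below.
case class Person(first: String, last: String)

object CaseClassUdfSketch {
  // Returns (name built from a Row, name built from a Person) for each input row.
  def fullNames(): Seq[(String, String)] = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("case-class-udf-sketch")
      .getOrCreate()
    import spark.implicits._
    try {
      // One struct column named "person" with fields "first" and "last".
      val df = Seq(("Ada", "Lovelace"), ("Alan", "Turing"))
        .toDF("first", "last")
        .select(struct(col("first"), col("last")).as("person"))

      // Row-based UDF: fields are looked up by name on a generic Row.
      val viaRow = udf((r: Row) =>
        s"${r.getAs[String]("first")} ${r.getAs[String]("last")}")

      // Case-class UDF: the struct is handed to the function as a Person.
      // This is the behaviour requested in this issue.
      val viaCaseClass = udf((p: Person) => s"${p.first} ${p.last}")

      df.select(
          viaRow(col("person")).as("via_row"),
          viaCaseClass(col("person")).as("via_case_class"))
        .as[(String, String)]
        .collect()
        .toSeq
    } finally {
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit = fullNames().foreach(println)
}
```

Both UDFs should produce the same strings; the case-class version simply avoids the string-keyed `getAs` lookups and gives the function body compile-time checked field access.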

Note that there are multiple ways to register a UDF, and this feature can only be supported when the UDF is registered through a Scala API that provides a type tag, e.g. `def udf[RT: TypeTag, A1: TypeTag](f: Function1[A1, RT])`.
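The role of the type tag can be illustrated without Spark. The sketch below uses a function with the same shape as the signature above and reads the input type's case-class fields from the `TypeTag` that the compiler supplies at the call site. It assumes Scala 2 with `scala-reflect` on the classpath. `Point`, `TypeTagSketch`, and `inputFields` are hypothetical names, and this is not Spark's actual implementation.

```scala
import scala.reflect.runtime.universe.{MethodSymbol, TypeTag, typeOf}

// Hypothetical domain type used only for this illustration.
case class Point(x: Int, y: Int)

object TypeTagSketch {
  // Same type-parameter shape as the typed `udf` signature. The context bound
  // A1: TypeTag makes the compiler pass full type information for A1, so the
  // case-class accessors can be listed in declaration order.
  def inputFields[RT: TypeTag, A1: TypeTag](f: Function1[A1, RT]): List[String] =
    typeOf[A1].decls.collect {
      case m: MethodSymbol if m.isCaseAccessor => m.name.toString
    }.toList

  def main(args: Array[String]): Unit =
    println(inputFields((p: Point) => p.x + p.y))
}
```

A registration path that receives only an erased function object (with no `TypeTag`) has no equivalent way to learn that the argument is a `Point`, which is why the feature is limited to the typed Scala API.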

Issue Links

links to: GitHub Pull Request #27937

People

Assignee: wuyi

Reporter: Wenchen Fan

Votes: 6

Watchers: 4

Dates

Created: 04/Dec/19 15:32

Updated: 15/Jul/20 01:49

Resolved: 24/Mar/20 15:30