Description
Currently, type coercion for Python UDFs is not cleanly defined. See also https://github.com/apache/spark/pull/22610
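This JIRA aims to document the type conversion logic internally. For instance, the table below shows the result for each combination of declared SQL return type (rows) and Python value (columns):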
+---------------+----+----+----+----+----+------------------------------+------------------------------+----+---------------+---------+----------------------------+------------+----+------------+---------+---------+ # noqa
|   Type \ Value|None|True|   1|   a|   a|                    1970-01-01|           1970-01-01 00:00:00| 1.0|array('i', [1])|      [1]|                        (1,)|         ABC|   1|    {'a': 1}| Row(a=1)| Row(a=1)| # noqa
+---------------+----+----+----+----+----+------------------------------+------------------------------+----+---------------+---------+----------------------------+------------+----+------------+---------+---------+ # noqa
|           null|None|None|None|None|None|                          None|                          None|None|           None|     None|                        None|        None|None|        None|        X|        X| # noqa
|        boolean|None|True|None|None|None|                          None|                          None|None|           None|     None|                        None|        None|None|        None|        X|        X| # noqa
|        tinyint|None|None|   1|None|None|                          None|                          None|None|           None|     None|                        None|        None|None|        None|        X|        X| # noqa
|       smallint|None|None|   1|None|None|                          None|                          None|None|           None|     None|                        None|        None|None|        None|        X|        X| # noqa
|            int|None|None|   1|None|None|                          None|                          None|None|           None|     None|                        None|        None|None|        None|        X|        X| # noqa
|         bigint|None|None|   1|None|None|                          None|                          None|None|           None|     None|                        None|        None|None|        None|        X|        X| # noqa
|         string|None|true|   1|   a|   a|java.util.GregorianCalendar...|java.util.GregorianCalendar...| 1.0|    [I@2d03fe27|      [1]|[Ljava.lang.Object;@5ae74a34| [B@6e96d01e|   1|       {a=1}|        X|        X| # noqa
|           date|None|   X|   X|   X|   X|                    1970-01-01|                    1970-01-01|   X|              X|        X|                           X|           X|   X|           X|        X|        X| # noqa
|      timestamp|None|   X|   X|   X|   X|                             X|           1970-01-01 00:00:00|   X|              X|        X|                           X|           X|   X|           X|        X|        X| # noqa
|          float|None|None|None|None|None|                          None|                          None| 1.0|           None|     None|                        None|        None|None|        None|        X|        X| # noqa
|         double|None|None|None|None|None|                          None|                          None| 1.0|           None|     None|                        None|        None|None|        None|        X|        X| # noqa
|     array<int>|None|None|None|None|None|                          None|                          None|None|            [1]|      [1]|                         [1]|[65, 66, 67]|None|        None|        X|        X| # noqa
|         binary|None|None|None|   a|   a|                          None|                          None|None|           None|     None|                        None|         ABC|None|        None|        X|        X| # noqa
|  decimal(10,0)|None|None|None|None|None|                          None|                          None|None|           None|     None|                        None|        None|   1|        None|        X|        X| # noqa
|map<string,int>|None|None|None|None|None|                          None|                          None|None|           None|     None|                        None|        None|None|   {u'a': 1}|        X|        X| # noqa
| struct<_1:int>|None|   X|   X|   X|   X|                             X|                             X|   X|              X|Row(_1=1)|                   Row(_1=1)|           X|   X|Row(_1=None)|Row(_1=1)|Row(_1=1)| # noqa
+---------------+----+----+----+----+----+------------------------------+------------------------------+----+---------------+---------+----------------------------+------------+----+------------+---------+---------+ # noqa
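A grid like the one above could be produced by looping over every (SQL type, Python value) pair and recording either the converted result or a marker when the conversion fails. The following is only a minimal sketch of that idea, not the script used for this JIRA. It assumes that X in the table marks a combination that raised an exception. The names build_grid and stub_convert are hypothetical; stub_convert is a pure-Python stand-in that mimics three rows of the table, where a real run would instead invoke a UDF declared with the given return type.

```python
import datetime


def build_grid(sql_types, values, convert):
    """Return one row per SQL type: [type name, cell, cell, ...]."""
    rows = []
    for sql_type in sql_types:
        cells = []
        for value in values:
            try:
                cells.append(str(convert(sql_type, value)))
            except Exception:
                # Assumption: "X" in the table means the conversion raised.
                cells.append("X")
        rows.append([sql_type] + cells)
    return rows


def stub_convert(sql_type, value):
    """Hypothetical stand-in for calling a UDF with the given return type.

    It only mimics the boolean, int and date rows of the table above.
    """
    if value is None:
        return None
    if sql_type == "boolean":
        return value if isinstance(value, bool) else None
    if sql_type == "int":
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_plain_int = isinstance(value, int) and not isinstance(value, bool)
        return value if is_plain_int else None
    if sql_type == "date":
        if isinstance(value, datetime.date):
            return value
        raise TypeError("not a date: %r" % (value,))
    raise ValueError("type not covered by this sketch: " + sql_type)


values = [None, True, 1, "a", datetime.date(1970, 1, 1)]
for row in build_grid(["boolean", "int", "date"], values, stub_convert):
    # Right-align each cell, similar to the layout of the table above.
    print("|" + "|".join(cell.rjust(10) for cell in row) + "|")
```

Swapping stub_convert for a function that actually runs a UDF would leave build_grid unchanged, which is why the conversion is passed in as a parameter here.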
Issue Links
- is related to
  - SPARK-25798 Internally document type conversion between Pandas data and SQL types in Pandas UDFs (Resolved)
  - SPARK-28131 Update document type conversion between Python data and SQL types in normal UDFs (Python 3.7) (Resolved)
- links to