Details
New Feature
Status: Resolved
Major
Resolution: Incomplete
1.5.1
None
Description
Currently, the expressions that can be passed to .join() on DataFrames are limited to column names plus the operators exposed in org.apache.spark.sql.Column.
It would be nice to be able to .join() on a condition computed by a UDF, for example euclideanDistance(col1, col2) < 0.1.
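As a rough illustration of the kind of join being asked for, here is a minimal sketch. It assumes a Spark version that provides SparkSession (2.0 or later, not the 1.5.1 this ticket was filed against), where calling a UDF on columns yields a Column that can be compared with < and passed as the join condition. The object name UdfJoinSketch, the helper distance, the column names id1/id2, and the sample data are all hypothetical; only euclideanDistance, col1, col2, and the 0.1 threshold come from the description.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

object UdfJoinSketch {
  // Hypothetical helper: Euclidean distance between two equal-length vectors.
  def distance(a: Seq[Double], b: Seq[Double]): Double =
    math.sqrt(a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum)

  def main(args: Array[String]): Unit = {
    // Assumption: local SparkSession (Spark 2.0+), used here only for the demo.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("udf-join-sketch")
      .getOrCreate()
    import spark.implicits._

    // Wrap the plain function as a UDF; applying it to columns returns a Column.
    val euclideanDistance = udf(distance _)

    // Hypothetical sample data: each row carries an id and a 2-D point.
    val left  = Seq(("a", Seq(0.0, 0.0)), ("b", Seq(1.0, 1.0))).toDF("id1", "col1")
    val right = Seq(("x", Seq(0.05, 0.0)), ("y", Seq(5.0, 5.0))).toDF("id2", "col2")

    // Join on the UDF-based condition from the description.
    val joined = left.join(right, euclideanDistance($"col1", $"col2") < 0.1)

    // With this sample data, only the pair (a, x) is within 0.1 of each other.
    joined.show()

    spark.stop()
  }
}
```

A condition like this is not an equality on keys, so Spark is likely to evaluate it by comparing rows pairwise (a nested-loop or Cartesian-style plan) rather than with a hash or sort-merge join. That can be expensive on large inputs.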