Spark / SPARK-10972

UDFs in SQL joins

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 1.5.1
    • Fix Version/s: None
    • Component/s: SQL

    Description

      Currently, the expressions that can be passed to .join() on DataFrames are limited to column names plus the operators exposed in org.apache.spark.sql.Column.

      It would be useful to be able to .join() on a condition computed by a UDF, for example euclideanDistance(col1, col2) < 0.1.
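
To make the requested behaviour concrete, here is a minimal sketch of what a join on an arbitrary predicate means. It is plain Python over lists of tuples, not the Spark API, and the names euclidean_distance and join_on_predicate are hypothetical helpers introduced only for this illustration.

```python
import math
from itertools import product


def euclidean_distance(p, q):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def join_on_predicate(left, right, predicate):
    """Nested-loop join: keep every (l, r) pair for which predicate(l, r) holds."""
    return [(l, r) for l, r in product(left, right) if predicate(l, r)]


# Hypothetical sample rows: (id, point)
left = [("a", (0.0, 0.0)), ("b", (1.0, 1.0))]
right = [("x", (0.05, 0.0)), ("y", (1.0, 1.05)), ("z", (5.0, 5.0))]

# Join condition in the spirit of euclideanDistance(col1, col2) < 0.1
matches = join_on_predicate(
    left,
    right,
    lambda l, r: euclidean_distance(l[1], r[1]) < 0.1,
)

print([(l[0], r[0]) for l, r in matches])  # [('a', 'x'), ('b', 'y')]
```

Because the predicate is opaque, this sketch has to test every pair of rows, so its cost grows with the product of the two input sizes. That cost is worth keeping in mind for any join whose condition is an arbitrary function rather than an equality on columns.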

          People

            Assignee: Unassigned
            Reporter: Michael Malak (michaelmalak)
            Votes: 1
            Watchers: 6