SPARK-27240: Use pandas DataFrame for struct type argument in Scalar Pandas UDF

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 3.0.0
    • Component/s: PySpark
    • Labels:
      None

      Description

      Scalar Pandas UDFs now support returning a pandas DataFrame for a struct type.
      If we chain another Pandas UDF after a Scalar Pandas UDF that returns a pandas DataFrame, the chained UDF receives a pandas DataFrame as its argument. A Scalar Pandas UDF used on its own, however, does not currently support a pandas DataFrame as an argument, so the chained UDF and the single UDF behave inconsistently.
      We should support taking a pandas DataFrame for a struct type argument in Scalar Pandas UDFs to be consistent.
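The chaining described above can be sketched with plain pandas, without a Spark session. This is only an illustration under stated assumptions: `make_struct` and `sum_fields` are hypothetical function bodies, and the struct field names `id` and `doubled` are invented for the example. In Spark each function would be wrapped with `pyspark.sql.functions.pandas_udf` and a matching return type; here they are called directly to show the data shapes involved.

```python
import pandas as pd


def make_struct(ids: pd.Series) -> pd.DataFrame:
    # Hypothetical body of a Scalar Pandas UDF whose return type is a struct
    # (assumed schema: id long, doubled long). Each struct field is one column.
    return pd.DataFrame({"id": ids, "doubled": ids * 2})


def sum_fields(s: pd.DataFrame) -> pd.Series:
    # Hypothetical body of a Scalar Pandas UDF that takes a struct column.
    # With the proposed change, the struct argument arrives as a pandas
    # DataFrame, so fields are accessed as columns.
    return s["id"] + s["doubled"]


# Chaining: the DataFrame returned by the first function is passed straight
# to the second, mirroring what the chained UDF sees.
ids = pd.Series([1, 2, 3])
result = sum_fields(make_struct(ids))
print(result.tolist())  # [3, 6, 9]
```

Once struct arguments are passed as pandas DataFrames, `sum_fields` should see the same input type whether it is chained after `make_struct` or applied directly to a struct column.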

              People

              • Assignee:
                ueshin Takuya Ueshin
                Reporter:
                ueshin Takuya Ueshin
              • Votes:
                0
                Watchers:
                3
