  1. Spark
  2. SPARK-43797 Python User-defined Table Functions
  3. SPARK-48180

Analyzer bug with multiple ORDER BY items for input table argument


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.5.0, 4.0.0, 3.5.1
    • Fix Version/s: 4.0.0
    • Component/s: PySpark

    Description

      Steps to reproduce:

       

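      from pyspark.sql.functions import udtf

      @udtf(returnType="a: int, b: int")
      class tvf:
          def eval(self, *args):
              yield 1, 2

      # Register the UDTF so SQL can call it (assumes an active SparkSession named `spark`).
      spark.udtf.register("tvf", tvf)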

       

      SELECT * FROM tvf(
        TABLE(
          SELECT 1 AS device_id, 2 AS data_ds
        )
        WITH SINGLE PARTITION
        ORDER BY device_id, data_ds
      )

      Analysis of this query fails with the following error:

      [UNSUPPORTED_SUBQUERY_EXPRESSION_CATEGORY.UNSUPPORTED_TABLE_ARGUMENT] Unsupported subquery expression: Table arguments are used in a function where they are not supported:
      'UnresolvedTableValuedFunction [tvf], table-argument#338 [], 'data_ds, false
         +- Project 1 AS device_id#336, 2 AS data_ds#337
            +- OneRowRelation
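
The steps above can be combined into one script that runs the query with one and then two ORDER BY items, which makes it easy to see whether a given Spark build is affected. This is a minimal sketch, not part of the original report: `build_query` and `run_repro` are hypothetical helper names, the local SparkSession settings are assumptions, and the expectation that the single-item variant succeeds is inferred from the issue title rather than stated in the description.

```python
def build_query(order_by_cols):
    """Return the table-argument query with the given ORDER BY columns."""
    order_by = ", ".join(order_by_cols)
    return (
        "SELECT * FROM tvf(\n"
        "  TABLE(SELECT 1 AS device_id, 2 AS data_ds)\n"
        "  WITH SINGLE PARTITION\n"
        f"  ORDER BY {order_by}\n"
        ")"
    )


def run_repro():
    """Run the query with one and with two ORDER BY items and print each outcome."""
    # PySpark is imported lazily so build_query can be used without it installed.
    from pyspark.errors import AnalysisException
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import udtf

    @udtf(returnType="a: int, b: int")
    class tvf:
        def eval(self, *args):
            yield 1, 2

    # Assumption: a throwaway local session is acceptable for the reproduction.
    spark = SparkSession.builder.master("local[1]").getOrCreate()
    spark.udtf.register("tvf", tvf)

    for cols in (["device_id"], ["device_id", "data_ds"]):
        try:
            rows = spark.sql(build_query(cols)).collect()
            print(cols, "->", rows)
        except AnalysisException as exc:
            # Print only the first line of the error to keep the output short.
            print(cols, "-> AnalysisException:", str(exc).splitlines()[0])


# Call run_repro() in an environment with PySpark 3.5 or later installed.
```

On an affected build the two-item variant should report the UNSUPPORTED_TABLE_ARGUMENT error quoted above.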

            People

              Assignee: dtenedor Daniel
              Reporter: dtenedor Daniel