SPARK-38130

array_sort does not allow non-orderable datatypes

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.1
    • Fix Version/s: 3.3.0
    • Component/s: SQL
    • Labels: None

    Description

      array_sort checks whether the elements it is asked to sort are orderable.

      I think this check should be removed, because even elements of a non-orderable type can be sorted when a lambda comparator is supplied. For example:

      Seq((Array[Map[String, Int]](Map("a" -> 1), Map()), "x")).toDF("a", "b").selectExpr("array_sort(a, (x,y) -> cardinality(x) - cardinality(y))")

      fails with:

      org.apache.spark.sql.AnalysisException: cannot resolve 'array_sort(`a`, lambdafunction((cardinality(namedlambdavariable()) - cardinality(namedlambdavariable())), namedlambdavariable(), namedlambdavariable()))' due to data type mismatch: array_sort does not support sorting array of type map<string,int> which is not orderable 

      Meanwhile, the case where this check is actually relevant (no comparator is supplied) already fails with a different error, which is triggered earlier in the code path:

      Seq((Array[Map[String, Int]](Map("a" -> 1), Map()), "x")).toDF("a", "b").selectExpr("array_sort(a)")

      fails with:

      org.apache.spark.sql.AnalysisException: cannot resolve '(namedlambdavariable() < namedlambdavariable())' due to data type mismatch: LessThan does not support ordering on type map<string,int>; line 1 pos 0;
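
      The two cases above can be run side by side with a small standalone program. This is only a sketch for reproducing the report: it assumes spark-sql is on the classpath and uses a local master, and the object name ArraySortRepro and the helper attempt are hypothetical names introduced here, not part of Spark. Whether the comparator case succeeds depends on the Spark version in use.

```scala
import scala.util.{Failure, Success, Try}
import org.apache.spark.sql.{Row, SparkSession}

// Hypothetical reproduction harness; names are illustrative only.
object ArraySortRepro {

  // Evaluates one SQL expression against a single-row DataFrame whose
  // column `a` is an array of maps (map<string,int> is not orderable).
  def attempt(spark: SparkSession, expr: String): Try[Array[Row]] = {
    import spark.implicits._
    val df = Seq((Array[Map[String, Int]](Map("a" -> 1), Map()), "x")).toDF("a", "b")
    // Analysis errors surface in selectExpr, so it must be inside the Try.
    Try(df.selectExpr(expr).collect())
  }

  // Prints whether the expression resolved and ran, without assuming an outcome.
  def report(label: String, result: Try[Array[Row]]): Unit = result match {
    case Success(_) => println(s"$label: ok")
    case Failure(e) => println(s"$label: failed with ${e.getClass.getSimpleName}")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]") // assumption: local mode is enough for a repro
      .appName("SPARK-38130-repro")
      .getOrCreate()
    try {
      // Case 1: explicit comparator, so element orderability should not matter.
      report("with comparator",
        attempt(spark, "array_sort(a, (x, y) -> cardinality(x) - cardinality(y))"))
      // Case 2: no comparator, so the elements themselves must be orderable.
      report("without comparator", attempt(spark, "array_sort(a)"))
    } finally {
      spark.stop()
    }
  }
}
```

      On an affected version (3.2.1 per this report) both cases are expected to fail with an AnalysisException, each with the message quoted above.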
      

          People

            Assignee: Steven Aerts (steven.aerts)
            Reporter: Steven Aerts (steven.aerts)
