Spark / SPARK-31517

SparkR::orderBy with multiple columns descending produces error

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.5
    • Fix Version/s: 3.1.0, 3.2.0
    • Component/s: SparkR
    • Labels: None
    • Environment: Databricks Runtime 6.5

    Description

      Passing two columns to `orderBy()` on a window specification, in an attempt to order by both columns in descending order, returns an error.

      library(magrittr) 
      library(SparkR) 
      cars <- cbind(model = rownames(mtcars), mtcars) 
      carsDF <- createDataFrame(cars) 
      
      carsDF %>% 
        mutate(rank = over(rank(), orderBy(windowPartitionBy(column("cyl")), desc(column("mpg")), desc(column("disp"))))) %>% 
        head() 

      This returns an error:

       Error in ns[[i]] : subscript out of bounds

      This seems to be part of a more general issue: the following code, which omits the `desc()` calls, also fails:

      carsDF %>% 
        mutate(rank = over(rank(), orderBy(windowPartitionBy(column("cyl")), column("mpg"), column("disp")))) %>% 
        head()
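
      For reference, the intended result can also be written as a Spark SQL window query, which states the two-column descending order directly and does not call the R `orderBy()` method. This is only a minimal sketch of the expected ranking, assuming an active Spark session; the view name `cars` and the output column name `rank_in_cyl` are chosen here for illustration, and the query has not been checked against the affected versions.

      ```r
      library(SparkR)

      # Assumption: reuse the existing Spark session, or start one if none is running.
      sparkR.session()

      cars <- cbind(model = rownames(mtcars), mtcars)
      carsDF <- createDataFrame(cars)

      # Register the DataFrame under a hypothetical view name so SQL can refer to it.
      createOrReplaceTempView(carsDF, "cars")

      # Rank rows within each cyl partition, ordered by mpg and then disp, both descending.
      ranked <- sql("
        SELECT *,
               rank() OVER (PARTITION BY cyl ORDER BY mpg DESC, disp DESC) AS rank_in_cyl
        FROM cars
      ")

      head(ranked)
      ```

      The `mutate()`/`over()` calls above would be expected to produce the same ranking once `orderBy()` accepts more than one column.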

       

            People

              Assignee: Michael Chirico (michaelchirico)
              Reporter: Ross Bowen (rossbowen)
              Votes: 0
              Watchers: 3
