SPARK-6814: Support sorting for any data type in SparkR

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Critical
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: SparkR
    • Labels: None

    Description

      I get various "return status == 0 is false" and "unimplemented type" errors when trying to get data out of any RDD with top() or collect(). The errors are not consistent. I think Spark is installed properly because some operations do work. I apologize if I'm missing something easy or not providing the right diagnostic info; I'm new to SparkR, and this seems to be the only resource for SparkR issues.
      Some logs:

      Browse[1]> top(estep.rdd, 1L)
      Error in order(unlist(part, recursive = FALSE), decreasing = !ascending) : 
        unimplemented type 'list' in 'orderVector1'
      Calls: do.call ... Reduce -> <Anonymous> -> func -> FUN -> FUN -> order
      Execution halted
      15/02/13 19:11:57 ERROR Executor: Exception in task 0.0 in stage 14.0 (TID 14)
      org.apache.spark.SparkException: R computation failed with
       Error in order(unlist(part, recursive = FALSE), decreasing = !ascending) : 
        unimplemented type 'list' in 'orderVector1'
      Calls: do.call ... Reduce -> <Anonymous> -> func -> FUN -> FUN -> order
      Execution halted
      	at edu.berkeley.cs.amplab.sparkr.BaseRRDD.compute(RRDD.scala:69)
      	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
      	at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
      	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
      	at org.apache.spark.scheduler.Task.run(Task.scala:54)
      	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
      15/02/13 19:11:57 WARN TaskSetManager: Lost task 0.0 in stage 14.0 (TID 14, localhost): org.apache.spark.SparkException: R computation failed with
       Error in order(unlist(part, recursive = FALSE), decreasing = !ascending) : 
        unimplemented type 'list' in 'orderVector1'
      Calls: do.call ... Reduce -> <Anonymous> -> func -> FUN -> FUN -> order
      Execution halted
              edu.berkeley.cs.amplab.sparkr.BaseRRDD.compute(RRDD.scala:69)
              org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
              org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
              org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
              org.apache.spark.scheduler.Task.run(Task.scala:54)
              org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
              java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
              java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
              java.lang.Thread.run(Thread.java:745)
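
      The failing expression in the log, order(unlist(part, recursive = FALSE), decreasing = !ascending), can be tried in plain R without Spark. The sketch below is only an illustration under assumptions: the contents of `part` are hypothetical (the report does not show what estep.rdd holds), and it presumes each element of the partition is itself a list, such as a key-value pair.

```r
# Hypothetical partition contents: every element is a list (e.g. a key-value
# pair). The actual elements of estep.rdd are not shown in the report.
part <- list(list(3, "c"), list(1, "a"), list(2, "b"))

# unlist(..., recursive = FALSE) strips only one level of nesting, so the
# result is still a list, not an atomic vector.
flat <- unlist(part, recursive = FALSE)
print(is.list(flat))

# order() cannot sort a list; catch the error and keep its message.
res <- tryCatch(
  order(flat, decreasing = TRUE),
  error = function(e) conditionMessage(e)
)
print(res)

# When the elements are atomic, the same expression works, because unlist()
# then returns a plain numeric vector.
atomic_part <- list(3, 1, 2)
ord <- order(unlist(atomic_part, recursive = FALSE), decreasing = TRUE)
print(ord)
```

      If the assumption holds, this suggests that sorting only works when each element can be flattened to an atomic value, which is the limitation named in the issue title.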
      

      People

        Assignee: Unassigned
        Reporter: Shivaram Venkataraman (shivaram)
        Votes: 0
        Watchers: 5