Spark / SPARK-19925

SparkR spark.getSparkFiles fails on executor


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.1.0
    • Fix Version/s: 2.1.1, 2.2.0
    • Component/s: SparkR
    • Labels: None

      Description

      SparkR function spark.getSparkFiles fails when it is called on executors. For example, the following R code fails (see the error log in the attachment):

      # Add a file to be downloaded with this Spark job on every node.
      spark.addFile("./README.md")
      seq <- seq(from = 1, to = 10, length.out = 5)
      train <- function(seq) {
        # Fails when evaluated on an executor.
        path <- spark.getSparkFiles("README.md")
        print(path)
      }
      spark.lapply(seq, train)
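
      For comparison, the driver-side call appears unaffected; the failure reported here is specific to executor-side invocation. A minimal driver-side check (a sketch, assuming a running SparkR session such as the sparkR shell):

      # Driver-side only: resolve the path of the previously added file.
      spark.addFile("./README.md")
      spark.getSparkFiles("README.md")  # returns the absolute local path on the driver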
      

      However, the equivalent code runs successfully with the Scala API:

      import org.apache.spark.SparkFiles

      sc.addFile("./README.md")
      // Resolving the file path inside a task works as expected.
      sc.parallelize(Seq(0)).map { _ => SparkFiles.get("README.md") }.first()
      

      and also with the Python API:

      from pyspark import SparkFiles

      sc.addFile("./README.md")
      # Resolving the file path inside a task works as expected.
      sc.parallelize(range(1)).map(lambda x: SparkFiles.get("README.md")).first()
      

        Attachments

        1. error-log (18 kB), uploaded by Yanbo Liang


            People

            • Assignee: Yanbo Liang (yanboliang)
            • Reporter: Yanbo Liang (yanboliang)
            • Votes: 0
            • Watchers: 4
