Spark / SPARK-19925

SparkR spark.getSparkFiles fails on executor

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.1.0
    • Fix Version/s: 2.1.1, 2.2.0
    • Component/s: SparkR
    • Labels: None

    Description

      The SparkR function spark.getSparkFiles fails when it is called on executors. For example, the following R code fails (see the error log in the attachment):

      spark.addFile("./README.md")
      seq <- seq(from = 1, to = 10, length.out = 5)
      train <- function(seq) {
        path <- spark.getSparkFiles("README.md")
        print(path)
      }
      spark.lapply(seq, train)
      

      However, the equivalent code runs successfully with the Scala API:

      import org.apache.spark.SparkFiles
      sc.addFile("./README.md")
      sc.parallelize(Seq(0)).map{ _ => SparkFiles.get("README.md")}.first()
      

      It also runs successfully with the Python API:

      from pyspark import SparkFiles
      sc.addFile("./README.md")
      sc.parallelize(range(1)).map(lambda x: SparkFiles.get("README.md")).first()
      

      Attachments

        1. error-log
          18 kB
          Yanbo Liang

          People

            Assignee: yanboliang Yanbo Liang
            Reporter: yanboliang Yanbo Liang
            Votes: 0
            Watchers: 4