SPARK-21859

SparkFiles.get failed on driver in yarn-cluster and yarn-client mode

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 1.6.2
    • Fix Version/s: None
    • Component/s: Spark Core

    Description

      When SparkFiles.get is used to fetch a file on the driver in yarn-client or yarn-cluster mode, it reports a file-not-found exception.
      The exception occurs only on the driver; SparkFiles.get works fine on executors.

      The bug can be reproduced as follows:
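      ```scala
      import java.io.File
      import scala.io.Source
      import org.apache.spark.SparkFiles

      // `logging` is the reporter's logger instance (its definition is not shown)
      def testOnDriver(fileName: String): Unit = {
        // Resolve the local path Spark assigns to the distributed file
        val file = new File(SparkFiles.get(fileName))
        if (!file.exists()) {
          logging.info(s"$file not exist")
        } else {
          // print file content on driver, closing the source afterwards
          val source = Source.fromFile(file)
          try logging.info(s"File content: ${source.getLines().mkString("\n")}")
          finally source.close()
        }
      }
      // the output will be file not exist
      ```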

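      ```python
      import os
      import subprocess

      from pyspark import SparkConf, SparkContext, SparkFiles

      conf = SparkConf().setAppName("spark files test")
      sc = SparkContext(conf=conf)

      def test_on_driver(filename):
          # Resolve the local path Spark assigns to the distributed file
          file = SparkFiles.get(filename)
          print("file path: {}".format(file))
          if os.path.exists(file):
              with open(file) as f:
                  lines = f.readlines()
              print(lines)
          else:
              print("file doesn't exist")
              # List the working directory (stands in for the undefined run_command helper)
              subprocess.call(["ls", "."])
      ```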
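
As a diagnostic aid for the reproduction above, here is a minimal sketch of a helper that reports whether a resolved path exists and what its parent directory actually contains. `describe_path` is a hypothetical name introduced for illustration and is not part of Spark; the sketch uses only the Python standard library and assumes the caller passes in whatever path string it wants to inspect.

```python
import os


def describe_path(path):
    """Summarise what exists at `path` and in its parent directory."""
    parent = os.path.dirname(os.path.abspath(path))
    parent_exists = os.path.isdir(parent)
    return {
        "path": path,
        "exists": os.path.exists(path),
        "parent": parent,
        "parent_exists": parent_exists,
        # Sorted listing so the output is stable across runs;
        # empty when the parent directory itself is missing.
        "siblings": sorted(os.listdir(parent)) if parent_exists else [],
    }
```

On the driver this could be called as `describe_path(SparkFiles.get(filename))`. In PySpark the path returned by `SparkFiles.get` lies under `SparkFiles.getRootDirectory()`, so the `siblings` entry would show which files, if any, are present in that directory when the lookup fails.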

          People

            Assignee: Unassigned
            Reporter: Cyanny (lgrcyanny)
            Votes: 0
            Watchers: 2
