Spark / SPARK-28992

Support updating dependencies from HDFS when tasks run on executor pods


Details

    • Type: Improvement
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.1.0
    • Fix Version/s: None
    • Component/s: Kubernetes, Spark Core
    • Labels: None

    Description

      Here is a case: 

      bin/spark-submit \
        --class com.github.ehiggs.spark.terasort.TeraSort \
        hdfs://hz-cluster10/user/kyuubi/udf/spark-terasort-1.1-SNAPSHOT-jar-with-dependencies.jar \
        hdfs://hz-cluster10/user/kyuubi/terasort/1000g \
        hdfs://hz-cluster10/user/kyuubi/terasort/1000g-out1
      

      Spark supports loading both the application JAR and additional JARs from HDFS; see http://spark.apache.org/docs/latest/submitting-applications.html#launching-applications-with-spark-submit

      Take Spark on YARN as an example: it creates a __spark_hadoop_conf__.xml file and uploads it to the Hadoop distributed cache, so the executor processes can use it to locate their dependencies.

      On Kubernetes, however, I tried this and the executors failed to update their dependencies:

      19/09/04 08:08:52 INFO scheduler.DAGScheduler: ShuffleMapStage 0 (newAPIHadoopFile at TeraSort.scala:60) failed in 1.058 s due to Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 9, 100.66.0.75, executor 2): java.lang.IllegalArgumentException: java.net.UnknownHostException: hz-cluster10
        at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
        at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:310)
        at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:678)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
        at org.apache.spark.util.Utils$.getHadoopFileSystem(Utils.scala:1881)
        at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:737)
        at org.apache.spark.util.Utils$.fetchFile(Utils.scala:522)
        at org.apache.spark.executor.Executor.$anonfun$updateDependencies$7(Executor.scala:869)
        at org.apache.spark.executor.Executor.$anonfun$updateDependencies$7$adapted(Executor.scala:860)
        at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:792)
        at scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149)
        at scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237)
        at scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230)
        at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:44)
        at scala.collection.mutable.HashMap.foreach(HashMap.scala:149)
        at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:791)
        at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:860)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:409)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
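
The createNonHAProxy frame in the trace suggests that the executor's HDFS client treated hz-cluster10 as a plain host name. One plausible reading, which this report does not confirm, is that hz-cluster10 is a logical HA nameservice defined in the submitting machine's hdfs-site.xml, and that the executor pod never received that configuration. The Python sketch below is not Spark or Hadoop code. It only illustrates how the standard keys dfs.nameservices, dfs.ha.namenodes.<ns> and dfs.namenode.rpc-address.<ns>.<nn> decide whether a URI authority can be resolved. The NameNode IDs and host names in HA_CONF are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# What an executor pod sees if no Hadoop configuration was shipped to it.
EMPTY_CONF = "<configuration></configuration>"

# Hypothetical HA client settings; the NameNode ids and hosts are made up.
HA_CONF = """
<configuration>
  <property><name>dfs.nameservices</name><value>hz-cluster10</value></property>
  <property><name>dfs.ha.namenodes.hz-cluster10</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.hz-cluster10.nn1</name><value>nn1.example.com:8020</value></property>
  <property><name>dfs.namenode.rpc-address.hz-cluster10.nn2</name><value>nn2.example.com:8020</value></property>
</configuration>
"""


def load_hadoop_conf(xml_text):
    """Parse Hadoop-style <configuration> XML into a dict of name -> value."""
    root = ET.fromstring(xml_text)
    conf = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name:
            conf[name.strip()] = (prop.findtext("value") or "").strip()
    return conf


def resolve_authority(uri, conf):
    """Return the NameNode RPC addresses configured for the URI's authority.

    An empty list means the authority is not a configured nameservice, so a
    client would have to fall back to treating it as a DNS host name.
    """
    authority = urlparse(uri).netloc
    nameservices = [ns.strip() for ns in conf.get("dfs.nameservices", "").split(",")]
    if authority not in nameservices:
        return []
    addresses = []
    for nn in conf.get(f"dfs.ha.namenodes.{authority}", "").split(","):
        addr = conf.get(f"dfs.namenode.rpc-address.{authority}.{nn.strip()}")
        if addr:
            addresses.append(addr)
    return addresses


if __name__ == "__main__":
    jar = "hdfs://hz-cluster10/user/kyuubi/udf/spark-terasort-1.1-SNAPSHOT-jar-with-dependencies.jar"
    print(resolve_authority(jar, load_hadoop_conf(EMPTY_CONF)))
    print(resolve_authority(jar, load_hadoop_conf(HA_CONF)))
```

With the empty configuration the lookup returns no addresses, which corresponds to the fallback path that ends in UnknownHostException when hz-cluster10 is not resolvable through DNS. With the HA settings present, the same URI maps to concrete NameNode addresses.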
      

       

            People

              Assignee: Unassigned
              Reporter: Kent Yao 2 (Qin Yao)