SPARK-29974: Submitting with application jar on HA HDFS



      Description

      When submitting a job with the application jar on an HA HDFS, and with the HDFS configuration available to both the driver and the executors at $HADOOP_CONF_DIR, the executors cannot fetch the application jar.

      For example, with Kubernetes:

      1. Create a Spark image with the HA HDFS configuration files available at $HADOOP_CONF_DIR (a sketch of such a configuration follows the command below).
      2. Push the application jar to the HA HDFS.
      3. Use spark-submit to create the Spark job in the cluster:
        spark-submit \
        	--master k8s://https://kubernetes.example:6443 \
        	--deploy-mode cluster \
        	--name spark_hdfs_test \
        	--class $CLASS \
        	--conf spark.executor.instances=3 \
        	--conf spark.kubernetes.container.image=$SPARK_IMAGE \
        	--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
        	hdfs:///jars/application.jar
        
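      Step 1 assumes a client-side HA configuration baked into the image. A minimal sketch of what that could look like, written into $HADOOP_CONF_DIR during the image build (the nameservice name hdfs-k8s matches the executor log below; the logical namenode names nn0/nn1 and the hostnames are illustrative assumptions):

        # Sketch only: write a minimal HA client configuration into $HADOOP_CONF_DIR.
        # "hdfs-k8s" is the nameservice from the executor log; nn0/nn1 and the
        # namenode hostnames are assumptions for illustration.
        cat > "$HADOOP_CONF_DIR/core-site.xml" <<'EOF'
        <configuration>
          <property><name>fs.defaultFS</name><value>hdfs://hdfs-k8s</value></property>
        </configuration>
        EOF
        cat > "$HADOOP_CONF_DIR/hdfs-site.xml" <<'EOF'
        <configuration>
          <property><name>dfs.nameservices</name><value>hdfs-k8s</value></property>
          <property><name>dfs.ha.namenodes.hdfs-k8s</name><value>nn0,nn1</value></property>
          <property><name>dfs.namenode.rpc-address.hdfs-k8s.nn0</name>
            <value>hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local:8020</value></property>
          <property><name>dfs.namenode.rpc-address.hdfs-k8s.nn1</name>
            <value>hdfs-namenode-1.hdfs-namenode.default.svc.cluster.local:8020</value></property>
          <property><name>dfs.client.failover.proxy.provider.hdfs-k8s</name>
            <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value></property>
        </configuration>
        EOF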

         On the driver everything works, but the following error appears in the log of every executor:

        ...
        19/11/20 12:45:43 INFO Executor: Fetching hdfs://hdfs-k8s/jars/application.jar with timestamp 1574253925510
        19/11/20 12:45:43 ERROR Executor: Exception in task 0.1 in stage 0.0 (TID 1)
        java.lang.IllegalArgumentException: java.net.UnknownHostException: hdfs-k8s
        	at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
        	at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:310)
        	at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
        	at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:678)
        	at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
        	at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
        	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
        	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
        	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
        	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
        	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
        	at org.apache.spark.util.Utils$.getHadoopFileSystem(Utils.scala:1866)
        	at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:721)
        	at org.apache.spark.util.Utils$.fetchFile(Utils.scala:496)
        	at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:811)
        	at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:803)
        	at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
        	at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
        	at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
        	at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
        	at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
        	at scala.collection.mutable.HashMap.foreach(HashMap.scala:130)
        	at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
        	at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:803)
        	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:375)
        	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        	at java.lang.Thread.run(Thread.java:748)
        Caused by: java.net.UnknownHostException: hdfs-k8s
        	... 28 more
        

        The stack trace suggests that when an executor fetches the application jar, it does not resolve the path as an HA HDFS nameservice, which it should, since the HDFS HA configuration is available at $HADOOP_CONF_DIR.

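      To confirm that the configuration is actually visible inside an executor container, a quick check along these lines should work (the pod name is a placeholder):

        # Diagnostic sketch: confirm the executor pod sees the HA client
        # configuration. <executor-pod> is a placeholder for a real pod name.
        kubectl exec <executor-pod> -- sh -c \
          'echo "$HADOOP_CONF_DIR"; grep -A1 dfs.nameservices "$HADOOP_CONF_DIR/hdfs-site.xml"'
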
      However, when the application jar path is given as the address of the active namenode, everything works, including code inside the jar that itself uses HA HDFS paths (hdfs:///some-file.txt). One way to find the active namenode is sketched after the command.

      spark-submit \
      	--master k8s://https://kubernetes.example:6443 \
      	--deploy-mode cluster \
      	--name spark_hdfs_test \
      	--class $CLASS \
      	--conf spark.executor.instances=3 \
      	--conf spark.kubernetes.container.image=$SPARK_IMAGE \
      	--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
      	hdfs://hdfs-namenode-1.hdfs-namenode.default.svc.cluster.local:8020/jars/application.jar
      
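      One way to determine which namenode is currently active (a sketch; nn0 and nn1 are the logical namenode names assumed in the configuration sketch above, and the command needs the same HA client configuration on its classpath):

        # Sketch: query the HA state of each namenode.
        hdfs haadmin -getServiceState nn0   # prints "active" or "standby"
        hdfs haadmin -getServiceState nn1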



              People

              • Assignee: Unassigned
              • Reporter: c-meier (Christopher Meier)
              • Votes: 1
              • Watchers: 5
