Spark / SPARK-29465

Unable to configure SPARK UI (spark.ui.port) in spark yarn cluster mode.


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.1.0
    • Component/s: Spark Core, Spark Submit, YARN
    • Labels: None

    Description

   I'm trying to restrict the ports used by a Spark app launched in YARN cluster mode. All ports (viz. driver, executor, block manager) can be specified using their respective properties except the UI port. The Spark app is launched using Java code, and setting the property spark.ui.port in the SparkConf doesn't seem to help. Even setting a JVM option -Dspark.ui.port="some_port" does not spawn the UI on the required port.
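A minimal sketch of the driver-side setup being described (app name and port values are illustrative, and Spark must be on the classpath; per this report, every setting below is honored in yarn-cluster mode except spark.ui.port):

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

// Sketch with illustrative values: pin every port the app opens.
SparkConf conf = new SparkConf()
    .setAppName("port-restricted-app")      // hypothetical app name
    .set("spark.driver.port", "9902")
    .set("spark.blockManager.port", "9900")
    .set("spark.port.maxRetries", "20")
    .set("spark.ui.port", "12345");         // ignored in yarn-cluster mode (this issue)
JavaSparkContext sc = new JavaSparkContext(conf);
```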

      From the logs of the Spark app, the property spark.ui.port is overridden and the JVM property '-Dspark.ui.port=0' is set, even though it was never set to 0 by the user.

      (Run in Spark 1.6.2) From the logs ->

      command:LD_LIBRARY_PATH="/usr/hdp/2.6.4.0-91/hadoop/lib/native:$LD_LIBRARY_PATH" JAVA_HOME/bin/java -server -XX:OnOutOfMemoryError='kill %p' -Xms4096m -Xmx4096m -Djava.io.tmpdir=PWD/tmp '-Dspark.blockManager.port=9900' '-Dspark.driver.port=9902' '-Dspark.fileserver.port=9903' '-Dspark.broadcast.port=9904' '-Dspark.port.maxRetries=20' '-Dspark.ui.port=0' '-Dspark.executor.port=9905'

      19/10/14 16:39:59 INFO Utils: Successfully started service 'SparkUI' on port 35167.
      19/10/14 16:39:59 INFO SparkUI: Started SparkUI at http://10.65.170.98:35167

      Even a spark-submit command with --conf spark.ui.port does not spawn the UI on the required port.

      (Run in Spark 2.4.4)
      ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --driver-memory 4g --executor-memory 2g --executor-cores 1 --conf spark.ui.port=12345 --conf spark.driver.port=12340 --queue default examples/jars/spark-examples_2.11-2.4.4.jar 10

      From the logs:
      19/10/15 00:04:05 INFO ui.SparkUI: Stopped Spark web UI at http://invrh74ace005.informatica.com:46622

      command:JAVA_HOME/bin/java -server -Xmx2048m -Djava.io.tmpdir=PWD/tmp '-Dspark.ui.port=0' '-Dspark.driver.port=12340' -Dspark.yarn.app.container.log.dir=<LOG_DIR> -XX:OnOutOfMemoryError='kill %p' org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@invrh74ace005.informatica.com:12340 --executor-id <executorId> --hostname <hostname> --cores 1 --app-id application_1570992022035_0089 --user-class-path file:$PWD/__app__.jar 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr

       

      It looks like the application master overrides this and sets a JVM property before launch, resulting in a random UI port even though spark.ui.port is set by the user.

      In these links

      1. https://github.com/apache/spark/blob/master/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala (line 214)
      2. https://github.com/cloudera/spark/blob/master/yarn/alpha/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala (line 75)

      I can see that the run() method in the files above sets a system property UI_PORT and spark.ui.port, respectively.
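The behavior described above is consistent with the application master unconditionally forcing an ephemeral UI port before launching the user class. A guarded override, shown here as a standalone sketch (not the actual Spark source), would keep a user-supplied value and only fall back to an ephemeral port otherwise:

```java
// Sketch (not the actual Spark source) of guarding the application
// master's override so a user-configured spark.ui.port survives.
public class UiPortGuard {
    static void configureUiPort() {
        // Historically the AM did the unconditional equivalent of:
        //   System.setProperty("spark.ui.port", "0");
        // which clobbers any user-supplied value. A guarded override
        // only forces port 0 (ephemeral) when nothing was configured.
        if (System.getProperty("spark.ui.port") == null) {
            System.setProperty("spark.ui.port", "0");
        }
    }

    public static void main(String[] args) {
        System.setProperty("spark.ui.port", "12345"); // user-configured port
        configureUiPort();
        // The user's port is preserved instead of being reset to 0.
        System.out.println(System.getProperty("spark.ui.port"));
    }
}
```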


            People

              Assignee: Vishwas Nalka (vishwasn)
              Reporter: Vishwas Nalka (vishwasn)
