Spark / SPARK-23988

[Mesos] Improve handling of appResource in mesos dispatcher when using Docker



    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Incomplete
    • Affects Version/s: 2.2.1, 2.3.0
    • Fix Version/s: None
    • Component/s: Mesos


      Our organization makes heavy use of Docker containers when running Spark on Mesos. The images we use for our containers include Spark along with all of the application dependencies. We find this to be a great way to manage our artifacts.

      When specifying the primary application jar (i.e. appResource), the Mesos dispatcher insists on adding it to the list of URIs for Mesos to fetch as part of launching the driver's container. This leads to confusing behavior where paths that refer to a jar inside the container image, whether given with the "local:/" scheme, the "file:/" scheme, or as a bare absolute path, wind up being fetched from the host where the driver is running. This doesn't work, since such paths reference the location of the jar on the container image itself, not on the host.

      Here is an example that I used for testing:

      spark-submit \
        --class org.apache.spark.examples.SparkPi \
        --master mesos://spark-dispatcher \
        --deploy-mode cluster \
        --conf spark.cores.max=4 \
        --conf spark.mesos.executor.docker.image=spark:2.2.1 \
        local:/usr/local/spark/examples/jars/spark-examples_2.11-2.2.1.jar 10

      The "spark:2.2.1" image contains an installation of Spark under "/usr/local/spark". Notice how we reference the appResource using the "local:/" scheme.

      If you try the above with the current version of the Mesos dispatcher, it will try to fetch the path "/usr/local/spark/examples/jars/spark-examples_2.11-2.2.1.jar" from the filesystem of the host where the driver's container is running. On our systems, this fails since we don't have Spark installed on the hosts.

      For the PR, all I did was modify the Mesos dispatcher to not add the appResource to the list of URIs for Mesos to fetch if it uses the "local:/" scheme.

      For now, I didn't change the behavior of absolute paths or the "file:/" scheme because I wanted to leave some form of the old behavior in place for backwards compatibility. Does anyone have an opinion on whether these schemes should change as well?

      The PR also includes support for using "spark-internal" with Mesos in cluster mode, which is something we need for another use-case. I can separate the two changes if that makes more sense.
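
To make the proposed behavior easier to discuss, here is a minimal sketch of the decision in Python. It is not the dispatcher's actual code (which is Scala); the function names `should_fetch` and `fetch_uris` are made up for illustration, and treating "spark-internal" as "nothing to fetch" is my assumption about how that part would behave.

```python
from typing import List
from urllib.parse import urlsplit

# "spark-internal" is the name Spark uses for "no primary resource".
# Skipping the fetch for it is an assumption, not a copy of the PR.
NO_RESOURCE = "spark-internal"


def should_fetch(app_resource: str) -> bool:
    """Return True if appResource should be added to the Mesos fetch URIs."""
    if app_resource == NO_RESOURCE:
        return False
    # "local:" means the file already exists inside the container image,
    # so there is nothing for Mesos to fetch from the host.
    # Absolute paths and "file:" keep the old behavior and are still fetched.
    return urlsplit(app_resource).scheme != "local"


def fetch_uris(app_resource: str, extra_uris: List[str]) -> List[str]:
    """Build the list of URIs handed to Mesos for the driver's container."""
    uris = list(extra_uris)
    if should_fetch(app_resource):
        uris.append(app_resource)
    return uris
```

With this sketch, the "local:/usr/local/spark/examples/jars/spark-examples_2.11-2.2.1.jar" resource from the example above would be left out of the fetch list, while an absolute path or a "file:/" path would still be fetched as before.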





            Assignee: Unassigned
            Reporter: paul mackles (adobe_pmackles)