
SPARK-24547: Spark on K8s docker-image-tool.sh improvements



    Description

      Context

      PySpark support for Spark on K8s was merged a few days ago with https://github.com/apache/spark/pull/21092/files

      There is a helper script that can be used to build the Docker images for running Java and now also Python jobs. It works like this:

      /path/to/docker-image-tool.sh -r node001:5000/brightcomputing -t v2.4.0 build
      /path/to/docker-image-tool.sh -r node001:5000/brightcomputing -t v2.4.0 push

      Problem

      I ran into two issues. The first time I generated images for 2.4.0, Docker was using its cache, so when running jobs, old jars were still in the Docker image. This produces errors like the following in the executors:

      2018-06-13 10:27:52 INFO NettyBlockTransferService:54 - Server created on 172.29.3.4:44877
      2018-06-13 10:27:52 INFO BlockManager:54 - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
      2018-06-13 10:27:52 INFO BlockManagerMaster:54 - Registering BlockManager BlockManagerId(1, 172.29.3.4, 44877, None)
      2018-06-13 10:27:52 ERROR CoarseGrainedExecutorBackend:91 - Executor self-exiting due to : Unable to create executor due to Exception thrown in awaitResult:
      org.apache.spark.SparkException: Exception thrown in awaitResult:
          at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
          at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
          at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:92)
          at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:76)
          at org.apache.spark.storage.BlockManagerMaster.registerBlockManager(BlockManagerMaster.scala:64)
          at org.apache.spark.storage.BlockManager.initialize(BlockManager.scala:241)
          at org.apache.spark.executor.Executor.<init>(Executor.scala:116)
          at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$receive$1.applyOrElse(CoarseGrainedExecutorBackend.scala:83)
          at org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:117)
          at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:205)
          at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:101)
          at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:221)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at java.lang.Thread.run(Thread.java:748)
      Caused by: java.lang.RuntimeException: java.io.InvalidClassException: org.apache.spark.storage.BlockManagerId; local class incompatible: stream classdesc serialVersionUID = 6155820641931972169, local class serialVersionUID = -3720498261147521051
          at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:687)
          at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1880)
          at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1746)
      

      To avoid this, Docker has to build without its cache; the problem only shows up if you have built images for an older version in the past.
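
      For reference, bypassing the cache at the plain docker level looks roughly like this; the image name and Dockerfile path are illustrative, the real values are assembled by docker-image-tool.sh:

      docker build --no-cache \
        -t node001:5000/brightcomputing/spark:v2.4.0 \
        -f kubernetes/dockerfiles/spark/Dockerfile .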

      The second problem was that the spark image is pushed, but the spark-py image isn't. This was simply overlooked in the initial PR.
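
      For reference, a complete push has to cover both images, along these lines (repository and tag reused from the Context section, image names assumed to follow the tool's <repo>/spark and <repo>/spark-py convention):

      docker push node001:5000/brightcomputing/spark:v2.4.0
      docker push node001:5000/brightcomputing/spark-py:v2.4.0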

      (A third problem I also ran into, because I had an older Docker version, is addressed by https://github.com/apache/spark/pull/21551, so I have not included a fix for it in this ticket.)

      Other than that it works great!

      Solution

      I've added an extra flag so it's possible to call build with `-n` for `--no-cache`.

      I've also added the extra push for the spark-py image.
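
      A minimal sketch of how such a flag can be wired into a getopts-based shell script like docker-image-tool.sh; the variable names and the default Dockerfile path below are illustrative, not necessarily what the actual patch uses:

      # Illustrative option parsing; the real script handles more options.
      DOCKERFILE=kubernetes/dockerfiles/spark/Dockerfile
      NOCACHEARG=""
      while getopts f:p:mr:t:n option; do
        case "${option}" in
          f) DOCKERFILE="${OPTARG}";;
          r) REPO="${OPTARG}";;
          t) TAG="${OPTARG}";;
          n) NOCACHEARG="--no-cache";;
        esac
      done

      # The collected flag is then passed straight through to docker build,
      # left unquoted on purpose so it expands to nothing when -n is not given:
      docker build ${NOCACHEARG} -t "${REPO}/spark:${TAG}" -f "${DOCKERFILE}" .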

      Example

      ./bin/docker-image-tool.sh -r docker.io/myrepo -t v2.3.0 -n build

      Snippet from the help output:

      Options:
      -f file Dockerfile to build for JVM based Jobs. By default builds the Dockerfile shipped with Spark.
      -p file Dockerfile with Python baked in. By default builds the Dockerfile shipped with Spark.
      -r repo Repository address.
      -t tag Tag to apply to the built image, or to identify the image to be pushed.
      -m Use minikube's Docker daemon.
      -n Build docker image with --no-cache
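
      For completeness, a possible end-to-end run against the private registry from the Context section, forcing a clean rebuild and then pushing both images:

      ./bin/docker-image-tool.sh -r node001:5000/brightcomputing -t v2.4.0 -n build
      ./bin/docker-image-tool.sh -r node001:5000/brightcomputing -t v2.4.0 push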

          People

            Assignee: Unassigned
            Reporter: Ray Burgemeestre