Spark / SPARK-26290

[K8s] Driver pods have no mounted volumes on submissions from older Spark versions


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Bug
    • Affects Version/s: 2.4.0
    • Fix Version/s: None
    • Component/s: Kubernetes, Spark Core
    • Labels: None

    Description

      I want to use the volume feature to mount an existing PVC as a read-only volume into both the driver and the executor pods.

      The executor gets the PVC mounted, but the driver is missing the mount:

      /opt/spark/bin/spark-submit \
      --deploy-mode cluster \
      --class org.apache.spark.examples.SparkPi \
      --conf spark.app.name=spark-pi \
      --conf spark.executor.instances=4 \
      --conf spark.kubernetes.namespace=spark-demo \
      --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
      --conf spark.kubernetes.container.image.pullPolicy=Always \
      --conf spark.kubernetes.container.image=kube-spark:2.4.0 \
      --conf spark.master=k8s://https://<master-ip> \
      --conf spark.kubernetes.driver.volumes.persistentVolumeClaim.ddata.mount.path=/srv \
      --conf spark.kubernetes.driver.volumes.persistentVolumeClaim.ddata.mount.readOnly=true \
      --conf spark.kubernetes.driver.volumes.persistentVolumeClaim.ddata.options.claimName=nfs-pvc \
      --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.mount.path=/srv \
      --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.mount.readOnly=true \
      --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.claimName=nfs-pvc \
      /srv/spark-examples_2.11-2.4.0.jar
      

      When I use the jar included in the container

      local:///opt/spark/examples/jars/spark-examples_2.11-2.4.0.jar
      

      the call works, and I can inspect the pod descriptions to review the behavior.

      Driver description:

      Name:         spark-pi-1544018157391-driver
      [...]
      Containers:
        spark-kubernetes-driver:
          Container ID:   docker://3a31d867c140183247cb296e13a8b35d03835f7657dd7e625c59083024e51e28
          Image:          kube-spark:2.4.0
          Image ID:       [...]
          Port:           <none>
          Host Port:      <none>
          State:          Terminated
            Reason:       Completed
            Exit Code:    0
            Started:      Wed, 05 Dec 2018 14:55:59 +0100
            Finished:     Wed, 05 Dec 2018 14:56:08 +0100
          Ready:          False
          Restart Count:  0
          Limits:
            memory:  1408Mi
          Requests:
            cpu:     1
            memory:  1Gi
          Environment:
            SPARK_DRIVER_MEMORY:        1g
            SPARK_DRIVER_CLASS:         org.apache.spark.examples.SparkPi
            SPARK_DRIVER_ARGS:
            SPARK_DRIVER_BIND_ADDRESS:   (v1:status.podIP)
            SPARK_MOUNTED_CLASSPATH:    /opt/spark/examples/jars/spark-examples_2.11-2.4.0.jar
            SPARK_JAVA_OPT_1:           -Dspark.kubernetes.executor.volumes.persistentVolumeClaim.data.mount.path=/srv
            SPARK_JAVA_OPT_3:           -Dspark.app.name=spark-pi
            SPARK_JAVA_OPT_4:           -Dspark.kubernetes.driver.volumes.persistentVolumeClaim.ddata.mount.path=/srv
            SPARK_JAVA_OPT_5:           -Dspark.submit.deployMode=cluster
            SPARK_JAVA_OPT_6:           -Dspark.driver.blockManager.port=7079
            SPARK_JAVA_OPT_7:           -Dspark.kubernetes.driver.volumes.persistentVolumeClaim.ddata.mount.readOnly=true
            SPARK_JAVA_OPT_8:           -Dspark.kubernetes.authenticate.driver.serviceAccountName=spark
            SPARK_JAVA_OPT_9:           -Dspark.driver.host=spark-pi-1544018157391-driver-svc.spark-demo.svc.cluster.local
            SPARK_JAVA_OPT_10:          -Dspark.kubernetes.driver.pod.name=spark-pi-1544018157391-driver
            SPARK_JAVA_OPT_11:          -Dspark.kubernetes.driver.volumes.persistentVolumeClaim.ddata.options.claimName=nfs-pvc
            SPARK_JAVA_OPT_12:          -Dspark.kubernetes.executor.volumes.persistentVolumeClaim.data.mount.readOnly=true
            SPARK_JAVA_OPT_13:          -Dspark.driver.port=7078
            SPARK_JAVA_OPT_14:          -Dspark.jars=/opt/spark/examples/jars/spark-examples_2.11-2.4.0.jar
            SPARK_JAVA_OPT_15:          -Dspark.kubernetes.executor.podNamePrefix=spark-pi-1544018157391
            SPARK_JAVA_OPT_16:          -Dspark.local.dir=/tmp/spark-local
            SPARK_JAVA_OPT_17:          -Dspark.master=k8s://https://<master-ip>
            SPARK_JAVA_OPT_18:          -Dspark.app.id=spark-89420bd5fa8948c3aa9d14a4eb6ecfca
            SPARK_JAVA_OPT_19:          -Dspark.kubernetes.namespace=spark-demo
            SPARK_JAVA_OPT_21:          -Dspark.executor.instances=4
            SPARK_JAVA_OPT_22:          -Dspark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.claimName=nfs-pvc
            SPARK_JAVA_OPT_23:          -Dspark.kubernetes.container.image=kube-spark:2.4.0
            SPARK_JAVA_OPT_24:          -Dspark.kubernetes.container.image.pullPolicy=Always
          Mounts:
            /tmp/spark-local from spark-local-dir-0-spark-local (rw)
            /var/run/secrets/kubernetes.io/serviceaccount from spark-token-nhcdd (ro)
      Conditions:
        Type           Status
        Initialized    True 
        Ready          False 
        PodScheduled   True 
      Volumes:
        spark-local-dir-0-spark-local:
          Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
          Medium:  
        spark-token-nhcdd:
          Type:        Secret (a volume populated by a Secret)
          SecretName:  spark-token-nhcdd
          Optional:    false
      QoS Class:       Burstable
      Node-Selectors:  <none>
      Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                       node.kubernetes.io/unreachable:NoExecute for 300s
      Events:          <none>
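
      The driver description above can be reproduced with kubectl, and the missing mount can also be checked directly (a sketch; the pod name and namespace are taken from the report above):

      ```shell
      # Describe the driver pod (namespace and pod name from the report above).
      kubectl -n spark-demo describe pod spark-pi-1544018157391-driver

      # Or inspect just the mount paths of the driver container:
      kubectl -n spark-demo get pod spark-pi-1544018157391-driver \
        -o jsonpath='{.spec.containers[0].volumeMounts[*].mountPath}'
      # With the bug present, /srv is absent from the output.
      ```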
      

      Executor description:

      Name:                      spark-pi-1544018157391-exec-2
      [..]
      Controlled By:             Pod/spark-pi-1544018157391-driver
      Containers:
        executor:
          Container ID:  docker://053256f023805a0a2fa580815f78203d2a32b0bc4e8e17741f45d84dd20a5e44
          Image:         kube-spark:2.4.0
          Image ID:       [...]
          Port:          7079/TCP
          Host Port:     0/TCP
          Args:
            executor
          State:          Running
            Started:      Wed, 05 Dec 2018 14:56:04 +0100
          Ready:          True
          Restart Count:  0
          Limits:
            memory:  1408Mi
          Requests:
            cpu:     1
            memory:  1408Mi
          Environment:
            SPARK_DRIVER_URL:       spark://CoarseGrainedScheduler@spark-pi-1544018157391-driver-svc.spark-demo.svc.cluster.local:7078
            SPARK_EXECUTOR_CORES:   1
            SPARK_EXECUTOR_MEMORY:  1g
            SPARK_APPLICATION_ID:   spark-application-1544018162183
            SPARK_CONF_DIR:         /opt/spark/conf
            SPARK_EXECUTOR_ID:      2
            SPARK_EXECUTOR_POD_IP:   (v1:status.podIP)
            SPARK_LOCAL_DIRS:       /tmp/spark-local
          Mounts:
            /srv from data (ro)
            /tmp/spark-local from spark-local-dir-1 (rw)
            /var/run/secrets/kubernetes.io/serviceaccount from default-token-5srsx (ro)
      Conditions:
        Type           Status
        Initialized    True
        Ready          True
        PodScheduled   True
      Volumes:
        spark-local-dir-1:
          Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
          Medium:
        data:
          Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
          ClaimName:  nfs-pvc
          ReadOnly:   true
        default-token-5srsx:
          Type:        Secret (a volume populated by a Secret)
          SecretName:  default-token-5srsx
          Optional:    false
      

      I also tried hostPath, but it showed the same behavior.
      I also reviewed the code that performs these steps and tried to find all available parameters, but there are none beyond the subPath options. The volume-mounting code for the executor and the driver looks identical to me.
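
      For reference, the Kubernetes volume confs introduced in Spark 2.4 follow this pattern (a sketch of the documented key layout; [VolumeType] is hostPath, emptyDir, or persistentVolumeClaim, [VolumeName] is arbitrary, and the same keys exist with `executor` in place of `driver`):

      ```shell
      --conf spark.kubernetes.driver.volumes.[VolumeType].[VolumeName].mount.path=<mount path>
      --conf spark.kubernetes.driver.volumes.[VolumeType].[VolumeName].mount.readOnly=<true|false>
      --conf spark.kubernetes.driver.volumes.[VolumeType].[VolumeName].mount.subPath=<mount subPath>
      --conf spark.kubernetes.driver.volumes.[VolumeType].[VolumeName].options.[OptionName]=<option value>
      ```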

       

      Update

      This behavior occurs when the spark-submit client is older than 2.4, and there is no output warning you that the driver volume configuration will be ignored!
      I rebuilt our Livy installation on Spark 2.4 containers, and submission from Spark 2.4 works fine!
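
      Because the older client fails silently, one workaround is to compare the client version against the image tag before submitting. A minimal sketch (the version values are illustrative, not from the report; in practice `client_version` would be parsed from `spark-submit --version`):

      ```shell
      # Illustrative values -- not from the report.
      client_version="2.3.2"   # version of the local spark-submit client
      image_version="2.4.0"    # tag of spark.kubernetes.container.image

      # Compare major.minor; a pre-2.4 client silently ignores the driver volume confs.
      if [ "${client_version%.*}" != "${image_version%.*}" ]; then
        echo "WARNING: spark-submit ${client_version} != image ${image_version}; driver volume confs may be dropped"
      fi
      ```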


          People

            Assignee: Unassigned
            Reporter: mabunixda (Martin Buchleitner)
            Votes: 0
            Watchers: 1
