Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Not A Problem
Affects Version/s: 3.2.0
Description
I have a Spark 3.2.0 driver running in a Kubernetes pod in client mode, with the following configuration passed to spark-submit:
--deploy-mode client \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.glustervol.mount.path=/mnt/distributedDisk \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.glustervol.readOnly=false \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.glustervol.options.claimName=lolastreamingapp \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.glustervol.mount.path=/mnt/distributedDisk \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.glustervol.readOnly=false \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.glustervol.options.claimName=lolastreamingapp
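All six volume settings share one key pattern: spark.kubernetes.<role>.volumes.persistentVolumeClaim.<volumeName>.<suffix>. As a minimal sketch, the flags could be generated instead of typed by hand, which makes a missing separator between two settings harder to introduce. The class PvcVolumeConf and the method confArgs are hypothetical names used only for illustration; they are not part of Spark.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a Spark API): builds the "--conf key=value"
// argument pairs for one persistentVolumeClaim volume and one role.
final class PvcVolumeConf {

    static List<String> confArgs(String role, String volumeName, String mountPath,
                                 boolean readOnly, String claimName) {
        // Key prefix shared by all settings of this volume for the given role.
        String prefix = "spark.kubernetes." + role
                + ".volumes.persistentVolumeClaim." + volumeName + ".";
        List<String> args = new ArrayList<>();
        args.add("--conf");
        args.add(prefix + "mount.path=" + mountPath);
        args.add("--conf");
        args.add(prefix + "readOnly=" + readOnly);
        args.add("--conf");
        args.add(prefix + "options.claimName=" + claimName);
        return args;
    }

    public static void main(String[] unused) {
        List<String> all = new ArrayList<>();
        // Same volume, mount path and claim for the driver and the executors.
        for (String role : new String[] {"driver", "executor"}) {
            all.addAll(confArgs(role, "glustervol", "/mnt/distributedDisk",
                    false, "lolastreamingapp"));
        }
        // Prints the flags on one line, ready to append to spark-submit.
        System.out.println(String.join(" ", all));
    }
}
```

Each value is added as a separate argument after its own --conf, so two settings cannot run together into one token.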
When the driver pod starts, it cannot access the filesystem that should be mounted from the GlusterFS PVC. Describing the pod shows that the driver pod has not mounted the PVC, and describing the PVC also shows that it is not mounted.
This worked with Spark 2.4.x but does not work with Spark 3.2.0.
The only notable difference between our 2.4.x and 3.2.0 setups is the deploy mode: with 2.4.x we used deploy-mode cluster, and with 3.2.0 we use deploy-mode client.
Because the filesystem used for checkpointing is not mounted, the application fails with the following error:
java.io.FileNotFoundException: File /mnt/distributedDisk/SE/LolaStreamingApp/1.0.0/1468589949 does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:779) ~[hadoop-client-api-3.3.1.jar:?]
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:1100) ~[hadoop-client-api-3.3.1.jar:?]
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:769) ~[hadoop-client-api-3.3.1.jar:?]
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:462) ~[hadoop-client-api-3.3.1.jar:?]
    at org.apache.spark.streaming.StreamingContext.checkpoint(StreamingContext.scala:240) ~[spark-streaming_2.12-3.2.0.jar:3.2.0]
    at org.apache.spark.streaming.api.java.JavaStreamingContext.checkpoint(JavaStreamingContext.scala:509) ~[spark-streaming_2.12-3.2.0.jar:3.2.0]
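The trace shows the failure surfacing only when checkpoint() is called on a local path. One possible way to get a clearer error earlier is a pre-flight check on the mount point before the streaming context is configured. The sketch below uses only java.nio.file; the class CheckpointDirCheck and the method problem are hypothetical names. It assumes that a missing or non-writable directory is the symptom to detect, and it cannot prove that the directory is actually backed by the PVC.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical pre-flight check (not a Spark API): reports why a directory
// is unusable as a checkpoint location, or an empty Optional if it looks fine.
final class CheckpointDirCheck {

    static Optional<String> problem(Path dir) {
        if (!Files.isDirectory(dir)) {
            return Optional.of(dir + " is missing or not a directory; is the volume mounted?");
        }
        if (!Files.isWritable(dir)) {
            return Optional.of(dir + " exists but is not writable by this process");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Mount path taken from the configuration in this report; override via args[0].
        Path mountPoint = Paths.get(args.length > 0 ? args[0] : "/mnt/distributedDisk");
        Optional<String> issue = problem(mountPoint);
        if (issue.isPresent()) {
            System.err.println("Checkpoint pre-flight failed: " + issue.get());
        } else {
            System.out.println("Checkpoint mount point looks usable: " + mountPoint);
        }
    }
}
```

Calling problem(...) in the driver before JavaStreamingContext.checkpoint(...) would turn the FileNotFoundException above into a message that points at the mount rather than at a checkpoint subdirectory.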