Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version: 3.3.0
- Fix Version: None
- Component: None
Description
Spark-submit in cluster mode on Kubernetes fails with io.fabric8.kubernetes.client.KubernetesClientException on my 3-node k8s cluster.
Steps followed:
- On IBM Cloud, created 3 instances
- The 1st instance acts as the master node; the other two act as worker nodes
root@vsi-spark-master:/opt# kubectl get nodes
NAME                 STATUS   ROLES                  AGE   VERSION
vsi-spark-master     Ready    control-plane,master   2d    v1.27.3+k3s1
vsi-spark-worker-1   Ready    <none>                 47h   v1.27.3+k3s1
vsi-spark-worker-2   Ready    <none>                 47h   v1.27.3+k3s1
- Copied spark-3.4.1-bin-hadoop3.tgz into the /opt/spark folder
- Ran spark-submit with the command below
root@vsi-spark-master:/opt# /opt/spark/bin/spark-submit \
  --master k8s://http://<master_node_IP>:6443 \
  --conf spark.kubernetes.authenticate.submission.oauthToken=$TOKEN \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=5 \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --conf spark.kubernetes.container.image=sushmakorati/testrepo:pyrandomGB \
  local:///opt/spark/examples/jars/spark-examples_2.12-3.4.1.jar
- And got the error message below.
23/07/27 12:56:26 WARN Utils: Kubernetes master URL uses HTTP instead of HTTPS.
23/07/27 12:56:26 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
23/07/27 12:56:26 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file
23/07/27 12:56:26 INFO KerberosConfDriverFeatureStep: You have not specified a krb5.conf file locally or via a ConfigMap. Make sure that you have the krb5.conf locally on the driver image.
23/07/27 12:56:27 ERROR Client: Please check "kubectl auth can-i create pod" first. It should be yes.
Exception in thread "main" io.fabric8.kubernetes.client.KubernetesClientException: An error has occurred.
	at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:129)
	at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:122)
	at io.fabric8.kubernetes.client.dsl.internal.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:44)
	at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.create(BaseOperation.java:1113)
	at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.create(BaseOperation.java:93)
	at org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:153)
	at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:250)
	at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:244)
	at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2786)
	at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:244)
	at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:216)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1020)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:192)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:215)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1111)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1120)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.IOException: Connection reset
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.waitForResult(OperationSupport.java:535)
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleResponse(OperationSupport.java:558)
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleCreate(OperationSupport.java:349)
	at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handleCreate(BaseOperation.java:711)
	at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handleCreate(BaseOperation.java:93)
	at io.fabric8.kubernetes.client.dsl.internal.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:42)
	... 15 more
Caused by: java.net.SocketException: Connection reset
	at java.base/java.net.SocketInputStream.read(SocketInputStream.java:186)
	at java.base/java.net.SocketInputStream.read(SocketInputStream.java:140)
	at okio.Okio$2.read(Okio.java:140)
	at okio.AsyncTimeout$2.read(AsyncTimeout.java:237)
	at okio.RealBufferedSource.read(RealBufferedSource.java:47)
	at okhttp3.internal.http1.Http1Codec$AbstractSource.read(Http1Codec.java:363)
	at okhttp3.internal.http1.Http1Codec$UnknownLengthSource.read(Http1Codec.java:507)
	at okio.RealBufferedSource.exhausted(RealBufferedSource.java:57)
	at io.fabric8.kubernetes.client.okhttp.OkHttpClientImpl$OkHttpAsyncBody.doConsume(OkHttpClientImpl.java:127)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)
23/07/27 12:56:27 INFO ShutdownHookManager: Shutdown hook called
23/07/27 12:56:27 INFO ShutdownHookManager: Deleting directory /tmp/spark-70ee50ef-d9e9-4220-91f4-15a282031095
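For reference, the ERROR line above suggests verifying RBAC before anything else. A minimal sketch of that check, plus the service-account setup the Spark-on-Kubernetes docs describe for the `spark` account referenced in the submit command, might look like the following (the `default` namespace is an assumption, not taken from the report):

```shell
# The check the error message itself asks for; expected answer is "yes".
kubectl auth can-i create pod

# Create the service account named in
# spark.kubernetes.authenticate.driver.serviceAccountName=spark,
# and grant it the "edit" cluster role so the driver can manage executor pods.
# Namespace "default" is an assumption here.
kubectl create serviceaccount spark --namespace=default
kubectl create clusterrolebinding spark-role \
  --clusterrole=edit \
  --serviceaccount=default:spark
```

Note these commands only cover the driver's in-cluster permissions; the submission itself still authenticates with the oauth token passed on the command line.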