SPARK-38662

Spark loses k8s auth after some time

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.2.1
    • Fix Version/s: None
    • Component/s: Kubernetes
    • Labels: None

    Description

      After running for some time, Spark starts to fail with the error below:

      [2022-03-25 17:11:12,706] INFO  (Logging.scala:57) - Adding decommission script to lifecycle
      [2022-03-25 17:11:12,712] WARN  (Logging.scala:90) - Exception when notifying snapshot subscriber.
      io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://cluster_endpoint/api/v1/namespaces/spark/pods. Message: Unauthorized! Token may have expired! Please log-in again. Unauthorized.
              at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:639)
              at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:576)
              at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:543)
              at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:504)
              at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:292) 
              at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:893)
              at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:372)
              at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:86)
              at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$requestNewExecutors$1(ExecutorPodsAllocator.scala:400)
              at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158)
              at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.requestNewExecutors(ExecutorPodsAllocator.scala:382)
              at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$onNewSnapshots$36(ExecutorPodsAllocator.scala:346)
              at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$onNewSnapshots$36$adapted(ExecutorPodsAllocator.scala:339)
              at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
              at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
              at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
              at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.onNewSnapshots(ExecutorPodsAllocator.scala:339)
              at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$start$3(ExecutorPodsAllocator.scala:117)
              at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$start$3$adapted(ExecutorPodsAllocator.scala:117)
              at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsSnapshotsStoreImpl$SnapshotsSubscriber.org$apache$spark$scheduler$cluster$k8s$ExecutorPodsSnapshotsStoreImpl$SnapshotsSubscriber$$processSnapshotsInternal(ExecutorPodsSnapshotsStoreImpl.scala:138)
              at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsSnapshotsStoreImpl$SnapshotsSubscriber.processSnapshots(ExecutorPodsSnapshotsStoreImpl.scala:126)
              at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsSnapshotsStoreImpl.$anonfun$addSubscriber$1(ExecutorPodsSnapshotsStoreImpl.scala:81)
              at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
              at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
              at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
              at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
              at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
              at java.base/java.lang.Thread.run(Thread.java:834)

      The failure does not reproduce on 3.1.1 with the same configuration, environment, and workload.

          People

            Assignee: Unassigned
            Reporter: dmn42 Alex