Spark / SPARK-28921

Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1.13.10, 1.12.10, 1.11.10)


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0, 2.3.1, 2.3.3, 2.4.0, 2.4.1, 2.4.2, 2.4.3, 2.4.4
    • Fix Version/s: 2.4.5, 3.0.0
    • Component/s: Kubernetes, Spark Core
    • Labels: None

    Description

      Spark jobs fail on the latest versions of Kubernetes when they attempt to provision executor pods (jobs like Spark-Pi that do not launch executors run without a problem).

       

      Here's an example error message:

       

      19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors from Kubernetes.
      19/08/30 01:29:09 WARN WatchConnectionManager: Exec Failure: HTTP 403, Status: 403 - 
      java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden' 
          at okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:216) 
          at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:183) 
          at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141) 
          at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) 
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
          at java.lang.Thread.run(Thread.java:748)
      

       

      The issue appears to be caused by fixes for a recent CVE:

      CVE: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-14809

      Fix: https://github.com/fabric8io/kubernetes-client/pull/1669
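      The sketch below illustrates one way a watch request could end up with a header that a stricter server rejects. It assumes the linked kubernetes-client fix concerns an Origin header built with a port of -1 when the master URL has no explicit port; that reading of the pull request is an assumption, and the class and method names are hypothetical, not the client's actual code. The only library fact relied on is that java.net.URI.getPort() returns -1 when no port is specified.

```java
import java.net.URI;

// Hypothetical illustration only; not taken from kubernetes-client.
class OriginHeaderSketch {

    // Naive construction: always appends the port, even when the URI has none.
    // URI.getPort() returns -1 in that case, producing an invalid origin.
    static String naiveOrigin(URI uri) {
        return uri.getScheme() + "://" + uri.getHost() + ":" + uri.getPort();
    }

    // Safer construction: appends the port only when one was given explicitly.
    static String safeOrigin(URI uri) {
        String origin = uri.getScheme() + "://" + uri.getHost();
        return uri.getPort() == -1 ? origin : origin + ":" + uri.getPort();
    }

    public static void main(String[] args) {
        // Example master URL without an explicit port (assumed value).
        URI master = URI.create("https://kubernetes.default.svc/api/v1/pods");
        System.out.println(naiveOrigin(master)); // https://kubernetes.default.svc:-1
        System.out.println(safeOrigin(master));  // https://kubernetes.default.svc
    }
}
```

      If this assumption holds, an API server built with the stricter URL parsing would likely refuse the WebSocket upgrade for the first form, which would be consistent with the 403 in place of the expected HTTP 101 shown in the log above.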

       

      It looks like upgrading kubernetes-client to 4.4.2 would solve this issue.
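      To check whether a given deployment already bundles a client at or above the suggested version, a plain dotted-version comparison is enough. The helper below is a minimal sketch: the class name, method names, and the sample version strings are hypothetical, and it assumes purely numeric version components (no qualifiers such as -SNAPSHOT).

```java
// Hypothetical helper; compares numeric dotted versions such as "4.4.2".
class ClientVersionCheck {

    // Returns a negative, zero, or positive value if a is lower than,
    // equal to, or higher than b. Missing components count as 0.
    static int compareVersions(String a, String b) {
        String[] pa = a.split("\\.");
        String[] pb = b.split("\\.");
        int n = Math.max(pa.length, pb.length);
        for (int i = 0; i < n; i++) {
            int x = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int y = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    // True when the version is at least 4.4.2, the version suggested in this issue.
    static boolean atLeastSuggestedVersion(String version) {
        return compareVersions(version, "4.4.2") >= 0;
    }

    public static void main(String[] args) {
        // Sample inputs are illustrative, not a statement about any Spark release.
        System.out.println(atLeastSuggestedVersion("4.1.2")); // false
        System.out.println(atLeastSuggestedVersion("4.4.2")); // true
    }
}
```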


          People

            Assignee: andygrove Andy Grove
            Reporter: psschwei Paul Schweigert
            Votes: 2
            Watchers: 14
