SPARK-38934: Provider TemporaryAWSCredentialsProvider has no credentials


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.2.1
    • Fix Version/s: None
    • Component/s: Kubernetes, Spark Core
    • Labels: None

    Description

       

      We are using JupyterHub on K8s as a notebook-based development environment, with Spark on K8s (Spark 3.2.1, Hadoop 3.3.1) as its backend cluster.

      When we run code like the following in JupyterHub:

       

      val perm = ... // obtain temporary AWS credentials from an assumed role via AWS STS (a sketch of this step is shown below)
      
      // set the temporary AWS credentials on the driver's Hadoop configuration
      spark.sparkContext.hadoopConfiguration.set("fs.s3a.aws.credentials.provider", "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider")
      spark.sparkContext.hadoopConfiguration.set("fs.s3a.access.key", perm.credential.accessKeyID)
      spark.sparkContext.hadoopConfiguration.set("fs.s3a.secret.key", perm.credential.secretAccessKey)
      spark.sparkContext.hadoopConfiguration.set("fs.s3a.session.token", perm.credential.sessionToken)
      
      // execute a simple Spark action
      spark.read.format("parquet").load("s3a://<path>/*").show(1)
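
      For context, the credential-fetching step is elided above. A purely hypothetical sketch of that step, using the STS classes from the AWS SDK for Java v1 that ships with hadoop-aws (the role ARN and session name below are placeholders, not our actual values), might look like:

      import com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClientBuilder
      import com.amazonaws.services.securitytoken.model.AssumeRoleRequest
      
      // Hypothetical illustration only; role ARN and session name are placeholders.
      val sts = AWSSecurityTokenServiceClientBuilder.defaultClient()
      val assumeRoleResult = sts.assumeRole(
        new AssumeRoleRequest()
          .withRoleArn("arn:aws:iam::123456789012:role/example-role")
          .withRoleSessionName("jupyter-spark-session"))
      // The temporary access key id, secret access key, and session token are returned here.
      val credentials = assumeRoleResult.getCredentials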

       

       

      On the first execution, the first few executors log a warning like the one below, but we still get the correct result thanks to Spark's task retry mechanism.

      22/04/18 09:13:50 WARN TaskSetManager: Lost task 2.0 in stage 0.0 (TID 2) (10.197.5.15 executor 1): java.nio.file.AccessDeniedException: s3a://<path>/<file>.parquet: org.apache.hadoop.fs.s3a.CredentialInitializationException: Provider TemporaryAWSCredentialsProvider has no credentials
      	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:206)
      	at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:117)
      	at org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:2810)
      	at org.apache.spark.util.HadoopFSUtils$.listLeafFiles(HadoopFSUtils.scala:225)
      	at org.apache.spark.util.HadoopFSUtils$.$anonfun$parallelListLeafFilesInternal$6(HadoopFSUtils.scala:136)
      	at scala.collection.immutable.Stream.map(Stream.scala:418)
      	at org.apache.spark.util.HadoopFSUtils$.$anonfun$parallelListLeafFilesInternal$4(HadoopFSUtils.scala:126)
      	at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:863)
      	at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:863)
      	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
      	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
      	at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
      	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
      	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
      	at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
      	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
      	at org.apache.spark.scheduler.Task.run(Task.scala:131)
      	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
      	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
      	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
      	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
      	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
      	at java.base/java.lang.Thread.run(Thread.java:829)
      Caused by: org.apache.hadoop.fs.s3a.CredentialInitializationException: Provider TemporaryAWSCredentialsProvider has no credentials
      	at org.apache.hadoop.fs.s3a.auth.AbstractSessionCredentialsProvider.getCredentials(AbstractSessionCredentialsProvider.java:130)
      	at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:177)
      	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1266)
      	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.runBeforeRequestHandlers(AmazonHttpClient.java:842)
      	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:792)
      	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:779)
      	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:753)
      	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:713)
      	at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:695)
      	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:559)
      	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:539)
      	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5445)
      	at com.amazonaws.services.s3.AmazonS3Client.getBucketRegionViaHeadRequest(AmazonS3Client.java:6420)
      	at com.amazonaws.services.s3.AmazonS3Client.fetchRegionFromCache(AmazonS3Client.java:6393)
      	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5430)
      	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5392)
      	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5386)
      	at com.amazonaws.services.s3.AmazonS3Client.listObjectsV2(AmazonS3Client.java:971)
      	at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listObjects$7(S3AFileSystem.java:2116)
      	at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:489)
      	at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:412)
      	at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:375)
      	at org.apache.hadoop.fs.s3a.S3AFileSystem.listObjects(S3AFileSystem.java:2107)
      	at org.apache.hadoop.fs.s3a.S3AFileSystem$ListingOperationCallbacksImpl.lambda$listObjectsAsync$0(S3AFileSystem.java:1750)
      	at org.apache.hadoop.fs.s3a.impl.CallableSupplier.get(CallableSupplier.java:62)
      	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
      	... 3 more 
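
      For reference, the same temporary credentials (the perm values from the snippet above) could instead be supplied as spark.hadoop.*-prefixed properties when the session is built, since Spark copies such properties into the Hadoop configuration it hands to the s3a connector. The following is only a sketch, assuming the notebook is free to build its own SparkSession; whether it avoids the warning in our environment is an open question:

      import org.apache.spark.sql.SparkSession
      
      // Sketch only: spark.hadoop.* properties are copied into the Hadoop configuration used by s3a.
      val spark = SparkSession.builder()
        .appName("s3a-temporary-credentials")
        .config("spark.hadoop.fs.s3a.aws.credentials.provider", "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider")
        .config("spark.hadoop.fs.s3a.access.key", perm.credential.accessKeyID)
        .config("spark.hadoop.fs.s3a.secret.key", perm.credential.secretAccessKey)
        .config("spark.hadoop.fs.s3a.session.token", perm.credential.sessionToken)
        .getOrCreate()
      
      spark.read.format("parquet").load("s3a://<path>/*").show(1)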

      Could you explain why we are seeing this warning, and how we can prevent it from happening again?

      Thank you in advance.

       

            People

              Assignee: Unassigned
              Reporter: Lily (lilyk)