CarbonData / CARBONDATA-2080

Hadoop Conf not propagated from driver to executor in S3


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version: 1.4.0
    • Fix Version: 1.4.0
    • Component: spark-integration
    • Labels: None
    • Environment: Spark 2.1, Hadoop 2.7.2 with 3 node cluster using Mesos

    Description

      When loading data in a distributed environment with S3 as the table location, the load fails because the Hadoop configuration (which carries the S3 credentials) is not propagated from the driver to the executors.

      Log:
      18/01/24 07:38:20 WARN TaskSetManager: Lost task 0.0 in stage 5.0 (TID 7, hadoop-slave-1, executor 1): com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain
      at com.amazonaws.auth.AWSCredentialsProviderChain.getCredentials(AWSCredentialsProviderChain.java:117)
      at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3521)
      at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1031)
      at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:994)
      at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:297)
      at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
      at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
      at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
      at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
      at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
      at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
      at org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.<init>(AbstractDFSCarbonFile.java:67)
      at org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.<init>(AbstractDFSCarbonFile.java:59)
      at org.apache.carbondata.core.datastore.filesystem.HDFSCarbonFile.<init>(HDFSCarbonFile.java:42)
      at org.apache.carbondata.core.datastore.impl.DefaultFileTypeProvider.getCarbonFile(DefaultFileTypeProvider.java:47)
      at org.apache.carbondata.core.datastore.impl.FileFactory.getCarbonFile(FileFactory.java:86)
      at org.apache.carbondata.core.indexstore.blockletindex.SegmentIndexFileStore.getCarbonIndexFiles(SegmentIndexFileStore.java:204)
      at org.apache.carbondata.core.writer.CarbonIndexFileMergeWriter.mergeCarbonIndexFilesOfSegment(CarbonIndexFileMergeWriter.java:52)
      at org.apache.carbondata.core.writer.CarbonIndexFileMergeWriter.mergeCarbonIndexFilesOfSegment(CarbonIndexFileMergeWriter.java:119)
      at org.apache.carbondata.spark.rdd.CarbonMergeFilesRDD$$anon$1.<init>(CarbonMergeFilesRDD.scala:58)
      at org.apache.carbondata.spark.rdd.CarbonMergeFilesRDD.internalCompute(CarbonMergeFilesRDD.scala:53)
      at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:60)
      at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
      at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
      at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
      at org.apache.spark.scheduler.Task.run(Task.scala:99)
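      The stack trace is the classic symptom of S3 credentials living only in the driver's Hadoop `Configuration`: `Configuration` is `Writable` but not Java `Serializable`, so it does not travel inside a task closure, and the executor-side `Path.getFileSystem` falls back to a default configuration without `fs.s3a.*` keys. A minimal sketch of the usual workaround, broadcasting a hand-serialized wrapper around the driver's configuration (the wrapper and `mergeIndexFiles` here are illustrative names, not CarbonData's actual fix), looks like this:

      ```scala
      import java.io.{ObjectInputStream, ObjectOutputStream}
      import org.apache.hadoop.conf.Configuration
      import org.apache.hadoop.fs.Path
      import org.apache.spark.SparkContext

      // Hypothetical wrapper: Configuration implements Writable but not
      // Serializable, so serialize it by hand via write()/readFields().
      class SerializableHadoopConf(@transient var value: Configuration)
          extends Serializable {
        private def writeObject(out: ObjectOutputStream): Unit = {
          out.defaultWriteObject()
          value.write(out)
        }
        private def readObject(in: ObjectInputStream): Unit = {
          in.defaultReadObject()
          value = new Configuration(false)
          value.readFields(in)
        }
      }

      // Driver side: capture the conf (holding fs.s3a.access.key etc.)
      // and broadcast it so every executor sees the same credentials.
      def mergeIndexFiles(sc: SparkContext, segmentPaths: Seq[String]): Unit = {
        val confBc = sc.broadcast(new SerializableHadoopConf(sc.hadoopConfiguration))
        sc.parallelize(segmentPaths).foreach { p =>
          // Executor side: build the FileSystem from the shipped conf
          // instead of `new Configuration()`, so S3A finds the credentials.
          val conf = confBc.value.value
          val fs = new Path(p).getFileSystem(conf)
          // ... merge the carbonindex files under `p` using `fs` ...
        }
      }
      ```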


            People

              Assignee: Jatin Demla
              Reporter: Jatin Demla
              Votes: 0
              Watchers: 1

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 5h