Hadoop Common / HADOOP-12496

Update AWS SDK version (1.7.4)

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.7.1
    • Fix Version/s: None
    • Component/s: fs/s3
    • Labels: None

    Description

      The hadoop-aws jar still depends on the very old 1.7.4 version of aws-java-sdk.
      Newer versions of the SDK contain incompatible API changes, which lead to the following error when the S3A class is used while a newer version of the SDK is present on the classpath.
      This is because S3A calls the method with "int" as the parameter type, while the new SDK expects "long". This makes it impossible to use Kinesis and S3A in the same process.
      It would be very helpful to upgrade the aws-sdk version that hadoop-aws depends on.

      java.lang.NoSuchMethodError: com.amazonaws.services.s3.transfer.TransferManagerConfiguration.setMultipartUploadThreshold(I)V
      at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:285)
      at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2653)
      at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
      at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
      at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
      at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
      at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
      at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:130)
      at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:114)
      at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:104)
      at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:29)
      at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:34)
      at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:36)
      at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:38)
      at $iwC$$iwC$$iwC$$iwC.<init>(<console>:40)
      at $iwC$$iwC$$iwC.<init>(<console>:42)
      at $iwC$$iwC.<init>(<console>:44)
      at $iwC.<init>(<console>:46)
      at <init>(<console>:48)
      at .<init>(<console>:52)
      at .<clinit>(<console>)
      at .<init>(<console>:7)
      at .<clinit>(<console>)
      at $print(<console>)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:606)
      at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
      at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1340)
      at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
      at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
      at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
      at org.apache.zeppelin.spark.SparkInterpreter.interpretInput(SparkInterpreter.java:655)
      at org.apache.zeppelin.spark.SparkInterpreter.interpret(SparkInterpreter.java:620)
      at org.apache.zeppelin.spark.SparkInterpreter.interpret(SparkInterpreter.java:613)
      at org.apache.zeppelin.interpreter.ClassloaderInterpreter.interpret(ClassloaderInterpreter.java:57)
      at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
      at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
      at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
      at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:745)
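
The `(I)V` at the end of the error message above is a JVM method descriptor: a method that takes one `int` and returns `void`. A caller compiled against a setter that takes `int` will not link against a class that only offers the `long` variant, whose descriptor is `(J)V`, even though source code passing an `int` would compile against either. The following is a minimal sketch of that mismatch. `OldTransferConfig` and `NewTransferConfig` are hypothetical stand-ins written for illustration, not the real `com.amazonaws` classes, and the reflective lookup only imitates the exact-signature matching the JVM performs at link time.

```java
import java.lang.invoke.MethodType;

// Hypothetical stand-in for the SDK class as S3A was compiled against it (int parameter).
class OldTransferConfig {
    private long threshold;
    public void setMultipartUploadThreshold(int threshold) { this.threshold = threshold; }
}

// Hypothetical stand-in for the same class in a newer SDK (long parameter).
class NewTransferConfig {
    private long threshold;
    public void setMultipartUploadThreshold(long threshold) { this.threshold = threshold; }
}

class DescriptorDemo {
    // JVM descriptor of a void method that takes a single parameter of the given type.
    static String descriptor(Class<?> paramType) {
        return MethodType.methodType(void.class, paramType).toMethodDescriptorString();
    }

    // True if cls has a public setMultipartUploadThreshold with exactly this parameter type.
    // Class.getMethod matches parameter types exactly; it does not widen int to long.
    static boolean hasSetter(Class<?> cls, Class<?> paramType) {
        try {
            cls.getMethod("setMultipartUploadThreshold", paramType);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("int setter descriptor:  " + descriptor(int.class));
        System.out.println("long setter descriptor: " + descriptor(long.class));
        System.out.println("old class has int setter: " + hasSetter(OldTransferConfig.class, int.class));
        System.out.println("new class has int setter: " + hasSetter(NewTransferConfig.class, int.class));
        System.out.println("new class has long setter: " + hasSetter(NewTransferConfig.class, long.class));
    }
}
```

Under these assumptions, the lookup for the `int` variant fails on the newer class in the same way the already-compiled S3A call site fails against the newer SDK, which is why recompiling hadoop-aws against the newer SDK (rather than only swapping the jar) is needed.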

            People

              Assignee: Unassigned
              Reporter: Yongjia Wang (yongjiaw)
              Votes: 0
              Watchers: 4
