Hadoop Common / HADOOP-12979

IOE in S3a: ${hadoop.tmp.dir}/s3a not configured

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.8.0
    • Fix Version/s: 2.8.0
    • Component/s: None
    • Labels: None

      Description

      Running some Spark s3a tests triggers an NPE in Hadoop <= 2.7, and an IOE in 2.8 saying

      ${hadoop.tmp.dir}/s3a not configured.
      

      That's correct: there is no configuration option on the conf called

      ${hadoop.tmp.dir}/s3a
      

      There may be one called hadoop.tmp.dir, however.

      Essentially, s3a passes the wrong config option down if it can't find fs.s3a.buffer.dir; the sketch below shows the mix-up.
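
      For context: LocalDirAllocator treats its constructor argument as the name of a configuration key whose value lists the buffer directories, not as a path. A minimal, self-contained sketch of the difference (the demo class name and the /tmp/s3a value are made up for illustration):

      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.fs.LocalDirAllocator;
      import org.apache.hadoop.fs.Path;

      // Hypothetical demo class, not part of Hadoop.
      public class BufferDirDemo {
        public static void main(String[] args) throws Exception {
          Configuration conf = new Configuration();
          conf.set("fs.s3a.buffer.dir", "/tmp/s3a");  // illustrative local path

          // Correct: pass the key name; the allocator reads the directories from conf.
          LocalDirAllocator ok = new LocalDirAllocator("fs.s3a.buffer.dir");
          Path p = ok.getLocalPathForWrite("output-", conf);
          System.out.println("allocated under " + p);

          // Broken: a path string is treated as a key, which isn't set in the conf,
          // so this throws "java.io.IOException: ${hadoop.tmp.dir}/s3a not configured".
          LocalDirAllocator broken = new LocalDirAllocator("${hadoop.tmp.dir}/s3a");
          broken.getLocalPathForWrite("output-", conf);
        }
      }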

        Issue Links

          Activity

          stevel@apache.org Steve Loughran added a comment -

          The problem is the branch taken if there's no s3a buffer dir defined. The else clause is broken: it's looking for a config option which isn't there.

              if (conf.get(BUFFER_DIR, null) != null) {
                lDirAlloc = new LocalDirAllocator(BUFFER_DIR);
              } else {
                lDirAlloc = new LocalDirAllocator("${hadoop.tmp.dir}/s3a");   // HERE
              }
          

          The fix should be to set BUFFER_DIR to the full path desired when it is unset, then create the LocalDirAllocator(BUFFER_DIR) from the (possibly enhanced) config.
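
          One possible shape of that fix, sketched here (illustrative only, not the committed patch; the fallback value and its placement in S3AFileSystem initialization are assumptions):

              // During S3AFileSystem setup: make sure BUFFER_DIR always resolves
              // to something, then always allocate against that key.
              if (conf.get(BUFFER_DIR, null) == null) {
                // Configuration expands ${hadoop.tmp.dir} when the value is read back.
                conf.set(BUFFER_DIR, "${hadoop.tmp.dir}/s3a");
              }
              lDirAlloc = new LocalDirAllocator(BUFFER_DIR);

          This keeps a single code path through LocalDirAllocator(BUFFER_DIR) whether or not the user set fs.s3a.buffer.dir explicitly.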

          stevel@apache.org Steve Loughran added a comment -

          Full stack trace:

          Driver stacktrace:
            at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1457)
            at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1445)
            at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1444)
            at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
            at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
            at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1444)
            at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:809)
            at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:809)
            at scala.Option.foreach(Option.scala:257)
            at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:809)
            ...
            Cause: java.io.IOException: ${hadoop.tmp.dir}/s3a not configured
            at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:269)
            at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:349)
            at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createTmpFileForWrite(LocalDirAllocator.java:421)
            at org.apache.hadoop.fs.LocalDirAllocator.createTmpFileForWrite(LocalDirAllocator.java:198)
            at org.apache.hadoop.fs.s3a.S3AOutputStream.<init>(S3AOutputStream.java:91)
            at org.apache.hadoop.fs.s3a.S3AFileSystem.create(S3AFileSystem.java:488)
            at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:921)
            at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:814)
            at org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:123)
            at org.apache.spark.SparkHadoopWriter.open(SparkHadoopWriter.scala:90)
          

            People

            • Assignee: stevel@apache.org Steve Loughran
            • Reporter: stevel@apache.org Steve Loughran
            • Votes: 1
            • Watchers: 5
