HADOOP-10533: S3 input stream NPEs in MapReduce job


Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.0.0, 1.0.3, 2.4.0, 3.0.0-alpha1
    • Fix Version/s: 2.5.0
    • Component/s: fs/s3
    • Environment: Hadoop with default configurations
    • Labels: mapreduce, s3, mr, hadoop

    Description

      I'm running a WordCount MapReduce job as follows:

      hadoop jar WordCount.jar wordcount.WordCountDriver s3n://bucket/wordcount/input s3n://bucket/wordcount/output

      s3n://bucket/wordcount/input is an S3 directory that contains the input files.

      However, I get the following NPE:

      12/10/02 18:56:23 INFO mapred.JobClient: map 0% reduce 0%
      12/10/02 18:56:54 INFO mapred.JobClient: map 50% reduce 0%
      12/10/02 18:56:56 INFO mapred.JobClient: Task Id : attempt_201210021853_0001_m_000001_0, Status : FAILED
      java.lang.NullPointerException
      at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.close(NativeS3FileSystem.java:106)
      at java.io.BufferedInputStream.close(BufferedInputStream.java:451)
      at java.io.FilterInputStream.close(FilterInputStream.java:155)
      at org.apache.hadoop.util.LineReader.close(LineReader.java:83)
      at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.close(LineRecordReader.java:144)
      at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:497)
      at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
      at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:396)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
      at org.apache.hadoop.mapred.Child.main(Child.java:249)
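
      The trace shows the NPE coming from NativeS3FsInputStream.close() (NativeS3FileSystem.java:106), i.e. the wrapper dereferences its inner stream while closing. As an illustration only, a simplified sketch rather than the actual Hadoop source, a null-guarded and idempotent close along these lines would not throw even when the inner stream was never opened or has already been dropped:

      import java.io.IOException;
      import java.io.InputStream;

      // Illustrative sketch only; the class and field names here are made up.
      class GuardedS3InputStream extends InputStream {
        private InputStream in;            // may legitimately be null

        GuardedS3InputStream(InputStream in) {
          this.in = in;
        }

        @Override
        public int read() throws IOException {
          if (in == null) {
            throw new IOException("Stream closed");
          }
          return in.read();
        }

        @Override
        public synchronized void close() throws IOException {
          if (in == null) {
            return;                        // already closed or never opened: no-op
          }
          try {
            in.close();
          } finally {
            in = null;                     // makes a second close() harmless
          }
        }
      }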

      The MR job runs fine if I specify a more specific input path, such as s3n://bucket/wordcount/input/file.txt.

      It fails if I pass an S3 folder as the input parameter.

      In summary:
      This works:
      hadoop jar ./hadoop-examples-1.0.3.jar wordcount /user/hadoop/wordcount/input/ s3n://bucket/wordcount/output/

      This doesn't work:
      hadoop jar ./hadoop-examples-1.0.3.jar wordcount s3n://bucket/wordcount/input/ s3n://bucket/wordcount/output/

      (both input paths are directories)
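
      For what it's worth, a stand-alone check along these lines (a hypothetical sketch: the class name is made up, the bucket name is a placeholder, the usual fs.s3n credentials are assumed to be configured, and it is written against the Hadoop 2.x FileSystem API) lists the same s3n input directory the way FileInputFormat would and then opens, drains, and closes each object. That exercises the failing close() path without running a full MapReduce job, and printing each entry's length makes any zero-length "folder marker" object easy to spot:

      import java.net.URI;
      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.fs.FSDataInputStream;
      import org.apache.hadoop.fs.FileStatus;
      import org.apache.hadoop.fs.FileSystem;
      import org.apache.hadoop.fs.Path;

      // Hypothetical diagnostic; not part of the job above or of Hadoop itself.
      public class S3nInputCheck {
        public static void main(String[] args) throws Exception {
          Configuration conf = new Configuration();        // picks up the configured s3n credentials
          FileSystem fs = FileSystem.get(URI.create("s3n://bucket/"), conf);

          // List the job's input directory, roughly what FileInputFormat does when splitting.
          for (FileStatus status : fs.listStatus(new Path("s3n://bucket/wordcount/input/"))) {
            System.out.println(status.getPath() + " len=" + status.getLen());
            if (status.isDirectory()) {
              continue;                                    // only open plain files
            }
            FSDataInputStream in = fs.open(status.getPath());
            byte[] buf = new byte[4096];
            while (in.read(buf) != -1) {
              // drain the stream, roughly what LineRecordReader does for a split
            }
            in.close();                                    // goes through the same NativeS3FsInputStream.close()
          }
        }
      }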

    People

    • Assignee: Steve Loughran (stevel@apache.org)
    • Reporter: Benjamin Kim (benkimkimben)
    • Votes: 0
    • Watchers: 4
