Hadoop Common / HADOOP-5805

Problem using top level s3 buckets as input/output directories

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.18.3
    • Fix Version/s: 0.21.0
    • Component/s: fs/s3
    • Labels: None
    • Environment: EC2, Cloudera AMI, 20 nodes
    • Hadoop Flags: Reviewed

      Description

      When I specify top-level S3 buckets as input or output directories, I get the following exception.

      hadoop jar subject-map-reduce.jar s3n://infocloud-input s3n://infocloud-output

      java.lang.IllegalArgumentException: Path must be absolute: s3n://infocloud-output
      at org.apache.hadoop.fs.s3native.NativeS3FileSystem.pathToKey(NativeS3FileSystem.java:246)
      at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:319)
      at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:667)
      at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:109)
      at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:738)
      at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1026)
      at com.evri.infocloud.prototype.subjectmapreduce.SubjectMRDriver.run(SubjectMRDriver.java:63)
      at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
      at com.evri.infocloud.prototype.subjectmapreduce.SubjectMRDriver.main(SubjectMRDriver.java:25)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
      at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
      at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
      at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
      at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)

      The workaround is to specify input/output buckets with sub-directories:

      hadoop jar subject-map-reduce.jar s3n://infocloud-input/input-subdir s3n://infocloud-output/output-subdir
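
      The exception comes from the S3 native filesystem's path-to-key conversion, which rejects a path that has no component after the bucket name. The snippet below is a minimal, standalone sketch (using java.net.URI for illustration; the class name S3KeySketch and its pathToKey helper are illustrative, not the patch attached to this issue) of how the bucket root could be mapped to the empty S3 key instead of being rejected:

      import java.net.URI;

      /**
       * Illustrative sketch only, not the HADOOP-5805 patch: shows how a
       * pathToKey-style conversion can accept a top-level bucket instead of
       * throwing IllegalArgumentException("Path must be absolute").
       */
      public class S3KeySketch {

        /** Maps an s3n:// URI to the object key used against S3. */
        static String pathToKey(URI uri) {
          String path = uri.getPath();   // "" for s3n://bucket, "/dir" for s3n://bucket/dir
          if (path.isEmpty() || "/".equals(path)) {
            // Bucket root: treat it as the empty key rather than rejecting it.
            return "";
          }
          if (!path.startsWith("/")) {
            throw new IllegalArgumentException("Path must be absolute: " + uri);
          }
          return path.substring(1);      // drop the leading '/' to form the key
        }

        public static void main(String[] args) {
          // The failing case from this issue: a top-level bucket as output.
          System.out.println("[" + pathToKey(URI.create("s3n://infocloud-output")) + "]");
          // The workaround path, which already worked.
          System.out.println("[" + pathToKey(URI.create("s3n://infocloud-output/output-subdir")) + "]");
        }
      }

      With a change of this shape, s3n://bucket and s3n://bucket/ would both resolve to the bucket root rather than failing the output check in FileOutputFormat.checkOutputSpecs.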

      Attachments

      1. HADOOP-5805-0.patch, 1 kB, Ian Nowland
      2. HADOOP-5805-1.patch, 1 kB, Ian Nowland
      3. HADOOP-5805-2.patch, 1 kB, Tom White


          People

          • Assignee: Ian Nowland
          • Reporter: Arun Jacob
          • Votes: 0
          • Watchers: 4
