Hadoop Map/Reduce
  MAPREDUCE-968

NPE in distcp encountered when placing _logs directory on S3FileSystem

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.1
    • Fix Version/s: 0.21.0
    • Component/s: distcp
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      If distcp is pointed at an empty S3 bucket as the destination of an s3:// filesystem transfer, it fails with the following exception:

      Copy failed: java.lang.NullPointerException
      at org.apache.hadoop.fs.s3.S3FileSystem.makeAbsolute(S3FileSystem.java:121)
      at org.apache.hadoop.fs.s3.S3FileSystem.getFileStatus(S3FileSystem.java:332)
      at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:633)
      at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1005)
      at org.apache.hadoop.tools.DistCp.copy(DistCp.java:650)
      at org.apache.hadoop.tools.DistCp.run(DistCp.java:857)
      at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
      at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
      at org.apache.hadoop.tools.DistCp.main(DistCp.java:884)

      1. MAPREDUCE-968.patch
        0.8 kB
        Aaron Kimball

        Activity

        Aaron Kimball added a comment -

        This patch fixes the issue. If the destination directory is '/' and doesn't exist, it will fall into a case where Path.getParent() is used to compute the _logs target directory name. This returns null if the Path is '/'. In this special case, the '/' directory needs to be created by distcp too.

        No unit test because this requires creating S3 buckets. I manually tested this by creating an empty S3 bucket and running:

        bin/hadoop distcp some-hdfs-dir s3://<access-key>:<secret-key>@my-new-bucket/
        

        This failed with the NPE. After the patch, it succeeded. I confirmed that the files were uploaded via:

        bin/hadoop fs -ls s3://<access-key>:<secret-key>@my-new-bucket/
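
The null-parent case described in this comment can be sketched as follows. This is only an illustration under stated assumptions, not the contents of MAPREDUCE-968.patch: it uses java.nio.file.Path as a stand-in for org.apache.hadoop.fs.Path (both return null from getParent() for the root on a POSIX system), and the class name LogsDirSketch, the helper logsDir, and its fallback behaviour are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class LogsDirSketch {

    // Hypothetical helper: pick the directory that would hold "_logs".
    // When the destination does not exist yet, its parent is used,
    // but the root path has no parent, so getParent() returns null.
    static Path logsDir(Path dest, boolean destExists) {
        Path base = destExists ? dest : dest.getParent();
        if (base == null) {
            // Assumed guard: treat the root destination as its own base
            // instead of passing null on to later filesystem calls.
            base = dest;
        }
        return base.resolve("_logs");
    }

    public static void main(String[] args) {
        Path root = Paths.get("/");
        // The root has no parent; this is the value that triggers the NPE
        // when it is handed to a call such as exists().
        System.out.println("parent of root: " + root.getParent());
        System.out.println("logs dir for missing root: " + logsDir(root, false));
        System.out.println("logs dir for missing /a/b: " + logsDir(Paths.get("/a/b"), false));
    }
}
```

Without the null check, the unguarded parent would be passed straight into an existence test, which matches the FileSystem.exists frame in the stack trace above.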
        
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12419220/MAPREDUCE-968.patch
        against trunk revision 813585.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/25/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/25/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/25/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/25/console

        This message is automatically generated.

        Aaron Kimball added a comment -

        Test failure is unrelated.

        Tom White added a comment -

        +1

        I've just committed this. Thanks Aaron!

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #35 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/35/)
        MAPREDUCE-968. NPE in distcp encountered when placing _logs directory on S3FileSystem. Contributed by Aaron Kimball.

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #83 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/83/)
        MAPREDUCE-968. NPE in distcp encountered when placing _logs directory on S3FileSystem. Contributed by Aaron Kimball.

        François Le Lay added a comment -

        Hi, I am encountering this issue in 1.0.4:

        [hadoop@ip-172-21-1-200 ~]$ hadoop distcp -i hdfs://172.21.1.200:9000/user s3://my-empty-bucket/
        Warning: $HADOOP_HOME is deprecated.
        
        13/10/26 14:08:11 INFO tools.DistCp: srcPaths=[hdfs://172.21.1.200:9000/user]
        13/10/26 14:08:11 INFO tools.DistCp: destPath=s3://my-empty-bucket/
        With failures, global counters are inaccurate; consider running with -i
        Copy failed: java.lang.NullPointerException
        	at org.apache.hadoop.fs.s3.S3FileSystem.makeAbsolute(S3FileSystem.java:121)
        	at org.apache.hadoop.fs.s3.S3FileSystem.getFileStatus(S3FileSystem.java:332)
        	at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:768)
        	at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1048)
        	at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
        	at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
        	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        	at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
        

          People

          • Assignee: Aaron Kimball
          • Reporter: Aaron Kimball
          • Votes: 0
          • Watchers: 3
