  1. Hadoop Common
  2. HADOOP-16932

distcp copy calls getFileStatus() needlessly and can fail against S3


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.0.0, 3.1.2, 3.2.1
    • Fix Version/s: 3.3.0
    • Component/s: fs/s3, tools/distcp
    • Labels:
      None
    • Environment:

      Hadoop CDH 6.3

      Description

      DistCp to AWS S3 worked fine on CDH 5.16 with distcp 2.6.0, but after upgrading to CDH 6.3, which ships the distcp-3.0 JAR, it throws the error below.

      The same error occurs with Hadoop-distcp-3.2.1.jar as well. I tried the -direct option in 3.2.1 and still got the same error.

       

      Error: java.io.FileNotFoundException: No such file or directory: s3a://XXXXXXXXXXXXX/part-00012-baa6a706-3816-4dfa-ba07-0fb56fd38178-c000.snappy.parquet
      at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2255)
      at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2149)
      at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2088)
      at org.apache.hadoop.tools.util.DistCpUtils.preserve(DistCpUtils.java:203)
      at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:220)
      at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:48)
      at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
      at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
      at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:422)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
      at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
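
      The trace shows the failure coming from a getFileStatus() call made while preserving attributes on the target. As a rough illustration only, and not the patch that resolved this issue, the sketch below shows one way such a lookup could be skipped when there is nothing to preserve. Store, Attr, and this preserve() are hypothetical stand-ins written for this example; they are not Hadoop classes or the real DistCpUtils API.

```java
import java.io.FileNotFoundException;
import java.util.EnumSet;

public class PreserveSketch {
    /** Hypothetical stand-in for the attribute flags a copy may be asked to preserve. */
    enum Attr { REPLICATION, PERMISSION, USER }

    /** Hypothetical stand-in for a target store whose status lookup can fail. */
    interface Store {
        String getFileStatus(String path) throws FileNotFoundException;
    }

    /**
     * Preserves attributes on the target path.
     * Returns true if the store was queried, false if the lookup was skipped.
     */
    static boolean preserve(Store store, String path, EnumSet<Attr> attrs)
            throws FileNotFoundException {
        if (attrs.isEmpty()) {
            // Nothing to preserve, so the status lookup would be needless.
            return false;
        }
        String status = store.getFileStatus(path);
        // A real implementation would compare 'status' with the source and
        // update the target here; omitted in this sketch.
        return status != null;
    }

    public static void main(String[] args) {
        // A store that always reports the object as missing, mimicking the
        // FileNotFoundException in the trace above.
        Store missing = p -> {
            throw new FileNotFoundException("No such file or directory: " + p);
        };
        try {
            boolean queried = preserve(missing, "s3a://bucket/part-0",
                    EnumSet.noneOf(Attr.class));
            System.out.println("store queried: " + queried);
            preserve(missing, "s3a://bucket/part-0", EnumSet.of(Attr.PERMISSION));
        } catch (FileNotFoundException e) {
            System.out.println("lookup failed: " + e.getMessage());
        }
    }
}
```

      With an empty attribute set the store is never touched, so a missing or not-yet-visible object cannot fail the copy at this step. When attributes are requested the lookup still happens and can still fail.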


              People

              • Assignee:
                stevel@apache.org Steve Loughran
                Reporter:
                thangamani.murugasamy@epsilon.com Thangamani Murugasamy