Hadoop Common / HADOOP-12689

S3 filesystem operations stopped working correctly


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.0
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: tools

    Description

      HADOOP-10542 was resolved by replacing "return null;" with throwing an IOException. This causes several S3 filesystem operations to fail, because their callers rely on the null return value. Other code may depend on it as well; these are just the calls I noticed:

      S3FileSystem.getFileStatus() (which no longer raises FileNotFoundException but instead IOException)
      FileSystem.exists() (which no longer returns false but instead raises IOException)
      S3FileSystem.create() (which no longer succeeds but instead raises IOException)

      Run command:

      hadoop distcp hdfs://localhost:9000/test s3://xxx:yyy@com.bar.foo/

      Resulting stack trace:

      2015-12-11 10:04:34,030 FATAL [IPC Server handler 6 on 44861] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1449826461866_0005_m_000006_0 - exited : java.io.IOException: /test doesn't exist
      at org.apache.hadoop.fs.s3.Jets3tFileSystemStore.get(Jets3tFileSystemStore.java:170)
      at org.apache.hadoop.fs.s3.Jets3tFileSystemStore.retrieveINode(Jets3tFileSystemStore.java:221)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:606)
      at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
      at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
      at com.sun.proxy.$Proxy17.retrieveINode(Unknown Source)
      at org.apache.hadoop.fs.s3.S3FileSystem.getFileStatus(S3FileSystem.java:340)
      at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:230)
      at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:50)
      at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
      at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
      at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:415)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
      at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

      Changing the "raise IOE..." back to "return null" fixes all of the call sites listed above and allows distcp to succeed.
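
      The failure mode described in this report can be sketched in isolation. The example below is a minimal illustration, not the Hadoop code: NullContractSketch, Store, and the simplified getFileStatus()/exists() helpers are hypothetical stand-ins, and the assumption is that callers treat a null result from the store as "path is missing" and translate it into FileNotFoundException.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are NOT the real Hadoop S3 classes.
public class NullContractSketch {

    /** Toy key/value store whose behaviour for a missing key is switchable. */
    static class Store {
        private final Map<String, String> data = new HashMap<>();
        private final boolean throwOnMissing;

        Store(boolean throwOnMissing) {
            this.throwOnMissing = throwOnMissing;
        }

        void put(String key, String value) {
            data.put(key, value);
        }

        /** For a missing key: returns null, or throws IOException when throwOnMissing is set. */
        String get(String key) throws IOException {
            String value = data.get(key);
            if (value == null && throwOnMissing) {
                throw new IOException(key + " doesn't exist");
            }
            return value;
        }
    }

    /** Caller written against the "null means missing" contract. */
    static String getFileStatus(Store store, String path) throws IOException {
        String inode = store.get(path);
        if (inode == null) {
            throw new FileNotFoundException(path + ": No such file or directory.");
        }
        return inode;
    }

    /** Maps FileNotFoundException to false; any other IOException propagates. */
    static boolean exists(Store store, String path) throws IOException {
        try {
            getFileStatus(store, path);
            return true;
        } catch (FileNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Store that returns null for a missing key: exists() reports false.
        Store nullReturning = new Store(false);
        System.out.println("null-returning store: exists = " + exists(nullReturning, "/test"));

        // Store that throws for a missing key: the plain IOException escapes exists().
        Store throwing = new Store(true);
        try {
            exists(throwing, "/test");
        } catch (IOException e) {
            System.out.println("throwing store: " + e.getClass().getSimpleName() + ": " + e.getMessage());
        }
    }
}
```

      With the null-returning store, exists() reports false for a missing path. With the throwing store, the plain IOException bypasses the FileNotFoundException handler and reaches the caller, which matches the distcp failure in the stack trace above.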

      Attachments

        1. HADOOP-12689.01.patch (1 kB, Matthew Paduano)

          People

            Assignee: Matthew Paduano (mattpaduano)
            Reporter: Matthew Paduano (mattpaduano)
            Votes: 0
            Watchers: 5
