Hadoop Common / HADOOP-5836

Bug in S3N handling of directory markers using an object with a trailing "/" causes jobs to fail

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.18.3
    • Fix Version/s: 1.1.0, 0.21.0
    • Component/s: fs/s3
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Some tools which upload to S3 use an object whose key ends with a trailing "/" as a directory marker, for instance "s3n://mybucket/mydir/". If asked to iterate over that "directory" via listStatus(), the current code returns an empty file "", which the InputFormat happily assigns to a split; that split later causes a task to fail, and probably the job with it.
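      For illustration only, here is a minimal sketch of the kind of filtering listStatus() needs; the class and method names below are hypothetical, not the patch's actual code:

      import java.util.ArrayList;
      import java.util.List;

      // Hypothetical sketch: drop the directory-marker key from a listing so
      // no zero-length phantom file ever reaches the InputFormat.
      public class MarkerFilterSketch {
        static List<String> filterMarker(String prefix, List<String> keys) {
          List<String> result = new ArrayList<String>();
          for (String key : keys) {
            if (key.equals(prefix)) {
              continue; // e.g. "mydir/": a zero-byte marker object, not a file
            }
            result.add(key);
          }
          return result;
        }

        public static void main(String[] args) {
          List<String> keys = new ArrayList<String>();
          keys.add("mydir/");           // the trailing-"/" marker itself
          keys.add("mydir/part-00000"); // a real object under the "directory"
          // Prints [mydir/part-00000]; without the filter, "mydir/" would
          // surface as an empty file and be assigned to a split.
          System.out.println(filterMarker("mydir/", keys));
        }
      }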

      Attachments

      1. HADOOP-5836-0-for_branch_1_0.patch (27 kB, Jagane Sundar)
      2. HADOOP-5836-2.patch (28 kB, Ian Nowland)
      3. HADOOP-5836-0.patch (27 kB, Ian Nowland)

        Activity

        Ian Nowland added a comment -

        The main fix here is to check for and simply not return this empty file in listStatus(). Along with this, however, I broadened the handling in all S3N methods of the different ways of designating directories in S3, as follows:

         * A note about directories. S3 of course has no "native" support for them.
         * The idiom we choose then is: for any directory created by this class,
         * we use an empty object "#{dirpath}_$folder$" as a marker.
         * Further, to interoperate with other S3 tools, we also accept the following:
         *  - an object "#{dirpath}/" denoting a directory marker
         *  - if there exist any objects with the prefix "#{dirpath}/", then the
         *    directory is said to exist
         *  - if both a file with the name of a directory and a marker for that
         *    directory exist, then the file masks the directory, and the directory
         *    is never returned.

        In particular this meant fixing delete() and rename() to handle all three possible meanings of directory without failing.
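        As a hedged illustration of what "three possible meanings" implies, a directory-existence check under the conventions above might look like the following; the interface and names are stand-ins, not the patch's actual store abstraction:

        // Illustrative only: the three ways a "directory" can be said to exist
        // in S3 under the conventions described above.
        interface KeyStore {
          boolean keyExists(String key);
          boolean anyKeyHasPrefix(String prefix);
        }

        class DirectoryCheckSketch {
          static boolean isDirectory(KeyStore store, String dirpath) {
            return store.keyExists(dirpath + "_$folder$") // marker this class writes
                || store.keyExists(dirpath + "/")         // trailing-"/" marker from other tools
                || store.anyKeyHasPrefix(dirpath + "/");  // implied by any child object
          }
        }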

        This patch also includes the following:

        • Add logging any time a file in S3 is accessed for read or write, so that when you get a failure accessing a file, its name will be in the task log.
        • When opening a file for reading that doesn't exist, immediately throw a FileNotFoundException rather than returning a hard-to-debug NPE later when the file is closed.
        • Rewrite rename so that it deletes the source files only after every destination file has been written, so you never end up with half the files in each location (sketched below).
        • Set up a retryer so rename automatically retries on S3 errors.
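        For the rename ordering, a minimal sketch under assumed names (RenameStore and renameAll are illustrative, not the patch's API):

        import java.io.IOException;
        import java.util.List;

        // Stand-in for the store; not the patch's actual interface.
        interface RenameStore {
          void copy(String srcKey, String dstKey) throws IOException;
          void delete(String key) throws IOException;
        }

        class RenameSketch {
          static void renameAll(RenameStore store, List<String> srcKeys,
                                String srcPrefix, String dstPrefix) throws IOException {
            // Phase 1: write every destination object first.
            for (String src : srcKeys) {
              store.copy(src, dstPrefix + src.substring(srcPrefix.length()));
            }
            // Phase 2: only now remove the sources, so a failure part-way
            // through never strands half the files in each location.
            for (String src : srcKeys) {
              store.delete(src);
            }
          }
        }

        For the retries, Hadoop's org.apache.hadoop.io.retry.RetryProxy is the usual way to wrap a store of this kind so transient S3 errors are retried transparently; the exact policy the patch uses is not shown here.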
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12408154/HADOOP-5836-0.patch
        against trunk revision 777330.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 11 new or modified tests.

        -1 patch. The patch command could not apply the patch.

        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/376/console

        This message is automatically generated.

        Tom White added a comment -

        These changes look good. A few comments:

        • Have you run Jets3tNativeS3FileSystemContractTest? This isn't run by default since it needs an S3 account to test with. This serves as a good regression test.
        • There's a mixture of debug-level and info-level debugging here. How noisy is this in practice? Shouldn't it be mainly debug, so folks can enable it when they hit problems?
        • Some of the indentation looks wrong in the patch - e.g. in handleServiceException(S3ServiceException).
        • The patch doesn't apply cleanly anymore and needs regenerating.
        Ian Nowland added a comment -
        • Yes, I have run Jets3tNativeS3FileSystemContractTest. Multiple times, in fact, including for the newest patch.
        • I have reworked logging, making everything debug, except the following:
          + LOG.info("Opening key '" + key + "' for reading at position '" + pos + "'");
          + LOG.info("OutputStream for key '" + key + "' writing to tempfile '" + this.backupFile + "'");
          + LOG.info("OutputStream for key '" + key + "' closed. Now beginning upload");
          + LOG.info("OutputStream for key '" + key + "' upload complete");
          + LOG.info("Opening '" + f + "' for reading");

        The basic idea is that I want to always capture in a task's syslog which S3 files it is reading from, as this is very useful when a subset of tasks fail. I also wanted to specifically capture the time spent actually uploading the file to S3.

        • Good catch. That must have happened as part of the diff I did ignoring whitespace. I have now gone through with a fine-tooth comb and fixed all the indentation issues I could see.
        • Done
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12409198/HADOOP-5836-1.patch
        against trunk revision 780114.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 11 new or modified tests.

        -1 patch. The patch command could not apply the patch.

        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/430/console

        This message is automatically generated.

        Ian Nowland added a comment -

        Regenerating against trunk.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12409482/HADOOP-5836-2.patch
        against trunk revision 780465.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 10 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/443/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/443/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/443/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/443/console

        This message is automatically generated.

        Tom White added a comment -

        I've just committed this. Thanks Ian!

        Jagane Sundar added a comment -

        Re-spin of this patch for branch-1.0. Code is unchanged.

        Jagane Sundar added a comment -

        A note on running the unit test for this (and possibly other s3/s3n driver patches):
        The name of the unit test is Jets3tNativeS3FileSystemContractTest and it is run using the command:

        $ ant -Dtestcase=Jets3tNativeS3FileSystemContractTest test-core

        To run this unit test, you need to add your own S3 credentials to the file src/test/core-site.xml. Edit src/test/core-site.xml and add details of a test s3 and a test s3n filesystem, as shown below. In this example, you need to replace AAAAAAAAAAAAAAAAAAA:bbbbbbbbbbbbbbbbbbbbb with urlencoded versions of your S3 access key ID and secret access key. You can use the urlencode program on Linux to do the encoding.

        <property>
          <name>test.fs.s3n.name</name>
          <value>s3n://AAAAAAAAAAAAAAAAAAA:bbbbbbbbbbbbbbbbbbbbb@yourbucketname</value>
          <description>The name of the s3n file system for testing.</description>
        </property>

        <property>
          <name>test.fs.s3.name</name>
          <value>s3://AAAAAAAAAAAAAAAAAAA:bbbbbbbbbbbbbbbbbbbbb@yourbucketname</value>
          <description>The name of the s3 file system for testing.</description>
        </property>
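        If a urlencode utility isn't installed, the JDK can do the encoding as well; a minimal sketch (the class name is mine):

        import java.net.URLEncoder;

        public class UrlEncodeKey {
          public static void main(String[] args) throws Exception {
            // Percent-encodes "/" and "+" (common in AWS secret keys) so the
            // key is safe to embed in the s3/s3n URI.
            System.out.println(URLEncoder.encode(args[0], "UTF-8"));
          }
        }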


          People

          • Assignee: Ian Nowland
          • Reporter: Ian Nowland
          • Votes: 0
          • Watchers: 3
