Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.11.2, 0.12.0
    • Fix Version/s: 0.13.0
    • Component/s: fs
    • Labels: None

      Description

      I had a problem with the -ls command on the S3 file system. It returned an inconsistent number of "Found Items" when rerun at different times and, more importantly, it returned recursive results (depth 1) for some folders.

      I looked into the code; the problem is caused by the jets3t library. The inconsistency problem is solved if we use:
      S3Object[] objects = s3Service.listObjects(bucket, prefix, PATH_DELIMITER);

      instead of

      S3Object[] objects = s3Service.listObjects(bucket, prefix, PATH_DELIMITER , 0);

      in listSubPaths of the Jets3tFileSystemStore class (line 227). This change lets the GET REST request carry a "max-keys" parameter with its default value of 1000. The S3 GET request seems to be sensitive to this parameter.

      But the recursive problem is because the GET request doesn't apply the delimiter constraint correctly: the response contains all keys with the given prefix, but they don't stop at the path_delimiter. You can test this simply by creating a couple of folders on the Hadoop S3 filesystem and running -ls. I followed the generated GET request and it looks fine, but it is not executed correctly on the S3 server side. I still don't know why the response doesn't stop at the path_delimiter.

      Possible cause: the jets3t library already does URL encoding, so why do we need to do URL encoding in the Jets3tFileSystemStore class?

      Example:

      The original path /user/root/folder is encoded to %2Fuser%2Froot%2Ffolder in the Jets3tFileSystemStore class. Then jets3t re-encodes this key when building the REST request, so it goes out as %252Fuser%252Froot%252Ffolder, and the generated folder on S3, after decoding on the Amazon side, is %2Fuser%2Froot%2Ffolder. Wouldn't it be better to skip the encoding step in Hadoop? This strange structure might be the reason that S3 doesn't stop at the path_delimiter.
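
      For illustration, the double-encoding effect can be reproduced in isolation with plain java.net.URLEncoder (a minimal sketch of the effect only, not the actual Hadoop or jets3t code path):

      import java.net.URLEncoder;

      public class DoubleEncodeDemo {
        public static void main(String[] args) throws Exception {
          String path = "/user/root/folder";
          // First encoding, as done in Jets3tFileSystemStore:
          String once = URLEncoder.encode(path, "UTF-8");   // %2Fuser%2Froot%2Ffolder
          // jets3t then encodes the key again when building the REST request:
          String twice = URLEncoder.encode(once, "UTF-8");  // %252Fuser%252Froot%252Ffolder
          System.out.println(once + " -> " + twice);
        }
      }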

      Attachments

      1. HADOOP-1061-v4.patch
        21 kB
        Tom White
      2. hadoop-1061-v3.patch
        8 kB
        Tom White
      3. hadoop-1061-v2.patch
        6 kB
        Tom White
      4. 1061-hadoop.patch
        0.8 kB
        Mike Smith

        Activity

        Mike Smith added a comment -

        As a temporary solution I replaced the listSubPaths method of the Jets3tFileSystemStore class with this new method, which simply checks whether there is any tail after the delimiter and does not pass zero to s3Service.listObjects:

        public Set<Path> listSubPaths(Path path) throws IOException {
          try {
            String prefix = pathToKey(path);
            if (!prefix.endsWith(PATH_DELIMITER)) {
              prefix += PATH_DELIMITER;
            }
            S3Object[] objects = s3Service.listObjects(bucket, prefix, PATH_DELIMITER);
            Set<Path> prefixes = new TreeSet<Path>();
            for (int i = 0; i < objects.length; i++) {
              if (objects[i].getKey().replaceAll(prefix, "").indexOf(PATH_DELIMITER) < 0) {
                prefixes.add(keyToPath(objects[i].getKey()));
              }
            }
            prefixes.remove(path);
            return prefixes;
          } catch (S3ServiceException e) {
            if (e.getCause() instanceof IOException) {
              throw (IOException) e.getCause();
            }
            throw new S3Exception(e);
          }
        }

        Tom White added a comment -

        Mike,

        I agree we should use the listObjects call that doesn't take a maxListingLength. JetS3t retrieves them in batches of 1000 behind the scenes for us. (Note that listDeepSubPaths uses the correct form of the method.)

        Does dropping the URL encoding in Jets3tFileSystemStore cause any problems with root ("/")? I seem to remember it may have. However, it's wrong to double encode, so I would suggest removing the encoding from Jets3tFileSystemStore and seeing if the tests pass. If they don't, we should go from there (rather than using double encoding because it happens to work).

        Would you be able to create a patch for this? Thanks!

        Mike Smith added a comment -

        Tom,

        I made a patch that fixes listSubPaths. I also removed the encoding from Jets3tFileSystemStore; it worked fine and there was no problem with root '/'. But I didn't include that change in the patch because it is not backward compatible: if someone removes the encoding, previously written files are no longer readable. It would be nice to provide a tool that migrates the double-encoded files to plain '/' files when we make a patch to remove the encoding from Jets3tFileSystemStore.

        Tom White added a comment -

        S3 files need to be written with the version number of the client library that wrote them (as suggested in HADOOP-930). If we did this now, we could detect when there is a mismatch and fail fast (and informatively). While it would be possible (but tricky) to support both versions, I don't feel we should do that, since there is a workaround for data migration: copy your S3 data from the old file system to a local file system or HDFS (preferably on EC2, but this isn't necessary) using the old version of Hadoop, then copy it back to a new S3 file system using a new version of Hadoop. I'd be happy to write this.

        (I'm not saying that version-aware code will never be needed, just that it isn't needed yet, since not that many people are using this feature.)

        Thoughts?

        Doug Cutting added a comment -

        We could start adding the version number property with this patch, so that one might write an automatic migration tool later. Adding the version number ASAP is a good idea anyway.

        Tom White added a comment -

        Yes, the patch should include

        • Fixing the listObjects call.
        • Removing the URL encoding.
        • Adding version number metadata to files. This should use the S3 metadata feature so we can detect mismatches (and throw an exception) before reading the data.
        • Adding filesystem type metadata to files, again using S3 metadata. This is easy to do at the same time as the version and is a prerequisite for HADOOP-930.

        I'll have a go at this.
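
        For illustration, a minimal sketch of the metadata idea described above, using jets3t calls (the metadata key names and values here are assumptions for illustration, not necessarily those used in the actual patch):

        import java.io.IOException;
        import org.jets3t.service.model.S3Object;

        // Sketch only: tag objects with filesystem metadata when writing, and
        // fail fast on a mismatch when reading. Key names/values are illustrative.
        class S3FileSystemMetadataSketch {
          private static final String VERSION_KEY = "fs-version";
          private static final String EXPECTED_VERSION = "1";

          static void tag(S3Object object) {
            object.addMetadata("fs-name", "Hadoop");
            object.addMetadata("fs-type", "block");
            object.addMetadata(VERSION_KEY, EXPECTED_VERSION);
          }

          static void check(S3Object object) throws IOException {
            Object version = object.getMetadata(VERSION_KEY);
            if (version == null || !EXPECTED_VERSION.equals(version.toString())) {
              throw new IOException("Unexpected S3 filesystem version: " + version);
            }
          }
        }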

        Tom White added a comment -

        Patch for review.

        Doug Cutting added a comment -

        Should we check not just the FS version, but also the name and type?

        Also, to be clear, this is not back-compatible, right? Is that what we want? Perhaps instead, if something has none of the metadata properties set, we could interpret it as the prior version, for back-compatibility, so folks can still read their S3 data. We might later drop that, as long-term, we might make it so that objects with none of these properties set are treated as simple HTTP resources.

        Tom White added a comment -

        No this is not back-compatible. It's not actually that hard to make it back-compatible - so I'll produce another patch. I'll check for name and type too.

        (I was thinking a similar thing about the long-term support for HTTP.)

        Mike Smith added a comment -

        Tom,

        I checked the second patch and it works fine and detects the version mismatch. Also, it seems the "not stopping at the path_delimiter" problem is fixed after removing the double encoding.

        Tom White added a comment -

        > It's not actually that hard to make it back-compatible

        It's a bit harder than I thought when I wrote this, since we have changed the key format. So you would have to check for both forms of the key before saying the file or directory doesn't exist. I feel this would complicate the code somewhat.

        I would suggest going with this new patch (v3), which additionally checks name and type. People will have to migrate old data as I originally described above. If this proved problematic, or there was demand, we could write a migration script. (The benefit of this approach is that it keeps the core S3FileSystem code unencumbered by version-migration logic.)

        Mike Smith added a comment -

        Thanks Tom,

        I was wondering if you've had a chance to write a migration script. I have some data in the old scheme that I cannot afford to lose.

        Tom White added a comment -

        I haven't written a migration script yet, but I'll try to do so before 0.13.

        Tom White added a comment -

        This patch (HADOOP-1061-v4.patch) includes MigrationTool to migrate the data written using an older version of S3FileSystem to the latest version. I have successfully tested it on an old filesystem.
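
        A very rough sketch of the migration idea (this is not the attached MigrationTool; it assumes old keys are single-URL-encoded paths and simply rewrites each object under the decoded key, ignoring object metadata and error handling):

        import java.net.URLDecoder;
        import org.jets3t.service.S3Service;
        import org.jets3t.service.model.S3Bucket;
        import org.jets3t.service.model.S3Object;

        // Naive sketch only -- not the attached MigrationTool. Rewrites each
        // URL-encoded key under its decoded form and deletes the old object.
        class NaiveS3KeyMigrationSketch {
          static void migrate(S3Service s3Service, S3Bucket bucket) throws Exception {
            S3Object[] objects = s3Service.listObjects(bucket);
            for (int i = 0; i < objects.length; i++) {
              String oldKey = objects[i].getKey();
              String newKey = URLDecoder.decode(oldKey, "UTF-8");
              if (newKey.equals(oldKey)) {
                continue; // already in the new format
              }
              // Read the old object and write it back under the decoded key.
              S3Object oldObject = s3Service.getObject(bucket, oldKey);
              S3Object newObject = new S3Object(newKey);
              newObject.setDataInputStream(oldObject.getDataInputStream());
              newObject.setContentLength(oldObject.getContentLength());
              s3Service.putObject(bucket, newObject);
              s3Service.deleteObject(bucket, oldKey);
            }
          }
        }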

        Hadoop QA added a comment -

        +1 http://issues.apache.org/jira/secure/attachment/12356557/HADOOP-1061-v4.patch applied and successfully tested against trunk revision r533233.

        Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/97/testReport/
        Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/97/console

        Doug Cutting added a comment -

        +1 This looks good to me.

        Tom White added a comment -

        I've just committed this.

        Hadoop QA added a comment -

        Integrated in Hadoop-Nightly #77 (See http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/77/ )

          People

          • Assignee: Unassigned
          • Reporter: Mike Smith
          • Votes: 0
          • Watchers: 1
