  1. Hadoop Common
  2. HADOOP-11444

Jets3tFileSystemStore fails to remove initial slash from object keys, resulting in objects with double forward slashes being stored

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: fs/s3
    • Labels: None
    • Environment: java version "1.7.0_71"
      Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
      Java HotSpot(TM) 64-Bit Server VM (build 24.71-b01, mixed mode)

    Description

      While writing to S3 using Spark 1.2.0's ReceiverInputDStream#saveAsTextFiles with an S3 URL ("s3://fake-test/1234"), I noticed that files are written with double forward slashes (e.g. "s3://fake-test//1234/-1419334280000/").

      After debugging, it seems this is caused by Jets3tFileSystemStore#pathToKey(path), which returns "/1234/..." for the input "s3://fake-test/1234/...", when it should strip the leading forward slash.
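
      The leading slash can be reproduced without Hadoop or JetS3t on the classpath. The following is a minimal sketch using only java.net.URI, which is what Path#toUri() hands back. The class name LeadingSlashDemo and both helpers are hypothetical, and displayUrl only imitates how a bucket name and key are commonly rendered together as a URL; it is not how JetS3t builds its requests.

```java
import java.net.URI;

public class LeadingSlashDemo {

    // The URI path component. The bucket sits in the authority,
    // so the path starts at the first slash after the bucket name.
    static String pathComponent(String url) {
        return URI.create(url).getPath();
    }

    // Hypothetical rendering of "bucket + '/' + key" when the key is
    // the unmodified path component (assumption, for illustration only).
    static String displayUrl(String url) {
        URI uri = URI.create(url);
        String key = uri.getPath(); // leading slash still present
        return uri.getScheme() + "://" + uri.getAuthority() + "/" + key;
    }

    public static void main(String[] args) {
        String url = "s3://fake-test/1234";
        System.out.println(pathComponent(url)); // /1234
        System.out.println(displayUrl(url));    // s3://fake-test//1234
    }
}
```

      If the key is stored with its leading slash, the separator between bucket and key is followed by a second slash, which matches the doubled slash in the stored object names.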

      When I used an s3n URL, and hence Jets3tNativeFileSystemStore, the double slashes went away. Here is a comparison of the two pathToKey implementations:

      Jets3tNativeFileSystemStore's implementation of pathToKey is:

        private static String pathToKey(Path path) {
          if (path.toUri().getScheme() != null && path.toUri().getPath().isEmpty()) {
            // allow uris without trailing slash after bucket to refer to root,
            // like s3n://mybucket
            return "";
          }
          if (!path.isAbsolute()) {
            throw new IllegalArgumentException("Path must be absolute: " + path);
          }
          String ret = path.toUri().getPath().substring(1); // remove initial slash
          if (ret.endsWith("/") && (ret.indexOf("/") != ret.length() - 1)) {
            ret = ret.substring(0, ret.length() - 1);
          }
          return ret;
        }
      

      whereas Jets3tFileSystemStore uses:

        private String pathToKey(Path path) {
          if (!path.isAbsolute()) {
            throw new IllegalArgumentException("Path must be absolute: " + path);
          }
          return path.toUri().getPath();
        }
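
      To make the difference concrete, the two normalizations can be ported to plain strings and run side by side. This is only a sketch under stated assumptions: the class PathToKeyComparison and its method names are hypothetical, each method takes the already-extracted URI path instead of a Hadoop Path, a simple startsWith("/") check stands in for Path#isAbsolute(), and the scheme check in the s3n version is reduced to an empty-path check.

```java
public class PathToKeyComparison {

    // Stand-in for Path#isAbsolute() (assumption: absolute means leading slash).
    private static void requireAbsolute(String uriPath) {
        if (!uriPath.startsWith("/")) {
            throw new IllegalArgumentException("Path must be absolute: " + uriPath);
        }
    }

    // Mirrors Jets3tFileSystemStore#pathToKey: the path is returned unchanged.
    static String blockStoreKey(String uriPath) {
        requireAbsolute(uriPath);
        return uriPath;
    }

    // Mirrors Jets3tNativeFileSystemStore#pathToKey on a plain string.
    static String nativeStoreKey(String uriPath) {
        if (uriPath.isEmpty()) {
            return ""; // bucket root, like s3n://mybucket
        }
        requireAbsolute(uriPath);
        String ret = uriPath.substring(1); // remove initial slash
        // Drop a trailing slash unless it is the only slash in the key.
        if (ret.endsWith("/") && ret.indexOf("/") != ret.length() - 1) {
            ret = ret.substring(0, ret.length() - 1);
        }
        return ret;
    }

    public static void main(String[] args) {
        String p = "/1234/-1419334280000";
        System.out.println(blockStoreKey(p));  // /1234/-1419334280000
        System.out.println(nativeStoreKey(p)); // 1234/-1419334280000
    }
}
```

      Only the s3n-style version yields a key without the leading slash; the s3-style version passes it through unchanged.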
      

            People

              Assignee: Unassigned
              Reporter: Enno Shioji (eshioji)
              Votes: 1
              Watchers: 5
