Hadoop Common / HADOOP-16396

Allow authoritative mode on a subdirectory

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.0
    • Fix Version/s: 3.3.0
    • Component/s: fs/s3
    • Labels: None

    Description

      Let's allow authoritative mode to be applied to only a subset of a bucket. This comes primarily from a Hive warehousing use case, where Hive-managed tables can benefit during query planning but Hive can't vouch for the rest of the bucket. This should be limited in scope and is not a general attempt to allow configuration on a per-path basis, as configuration is currently done on a per-process or per-bucket basis.

      I propose a new property (we could overload fs.s3a.metadatastore.authoritative, but that seems likely to cause confusion somewhere). Its value would be a path string that is qualified in the context of the FileSystem and then checked to see whether it is a prefix of a given path. If it is, we act as though authoritative mode is enabled. If not, we fall back to the existing behavior of fs.s3a.metadatastore.authoritative (which in practice will probably be false, the default, if the new property is in use).
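
      The prefix check described above could look roughly like the following. This is a minimal sketch only, not the code from the attached patches: the class AuthoritativePathSketch and its methods are hypothetical, plain java.net.URI stands in for Hadoop's Path qualification, and paths are assumed to contain no characters that need URI escaping.

```java
import java.net.URI;

/** Hypothetical sketch of a per-prefix authoritative check; not the patch's code. */
final class AuthoritativePathSketch {
    private final URI fsRoot;                      // e.g. s3a://example-bucket/
    private final String qualifiedPrefix;          // null when no prefix is configured
    private final boolean bucketWideAuthoritative; // value of the existing bucket-wide flag

    AuthoritativePathSketch(URI fsUri, String configuredPrefix, boolean bucketWideAuthoritative) {
        this.fsRoot = URI.create(fsUri.getScheme() + "://" + fsUri.getAuthority() + "/");
        this.qualifiedPrefix = (configuredPrefix == null || configuredPrefix.isEmpty())
            ? null
            : qualify(configuredPrefix);
        this.bucketWideAuthoritative = bucketWideAuthoritative;
    }

    /** Qualify a path against the filesystem root and append a trailing slash. */
    String qualify(String path) {
        String q = fsRoot.resolve(path).normalize().toString();
        // The trailing slash keeps /warehouse/managed2 from matching /warehouse/managed.
        return q.endsWith("/") ? q : q + "/";
    }

    /** True if the path is under the configured prefix; otherwise the bucket-wide setting. */
    boolean isAuthoritative(String path) {
        if (qualifiedPrefix != null && qualify(path).startsWith(qualifiedPrefix)) {
            return true;
        }
        return bucketWideAuthoritative;
    }

    public static void main(String[] args) {
        AuthoritativePathSketch sketch = new AuthoritativePathSketch(
            URI.create("s3a://example-bucket"), "/warehouse/managed", false);
        System.out.println(sketch.isAuthoritative("/warehouse/managed/db/table1"));
        System.out.println(sketch.isAuthoritative("/warehouse"));
    }
}
```

      Comparing qualified strings with a trailing slash is one simple way to avoid false matches on sibling directories that merely share a name prefix; a real implementation would more likely compare path components.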

      Let's be clear about a few things:

      • Currently authoritative mode only short-cuts the process to avoid a round-trip to S3 if we know it is safe to do so. This means that even when authoritative mode is enabled for a bucket, if the metadata store does not have a complete (or "authoritative") current listing cached, authoritative mode still has no effect. This will still apply.
      • This will only apply to getFileStatus and listStatus, and internal calls to their internal counterparts. No other API is currently using authoritative mode to change behavior.
      • This will only apply to getFileStatus and listStatus calls INSIDE the configured prefix. If there is a recursive listing on the parent of the configured prefix, no change in behavior will be observed.
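
      The first point above, that authoritative mode only skips S3 when the cached listing is itself complete, can be illustrated with a small model. This is a sketch under stated assumptions rather than S3A's implementation: ListingShortCircuitSketch, CachedListing, and the s3Lister callback are all hypothetical stand-ins, and the merge in the fallback branch is a simplification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.function.Function;

/** Hypothetical model of the listStatus short-circuit; not S3A's actual code. */
final class ListingShortCircuitSketch {

    /** Stand-in for a directory listing cached in the metadata store. */
    static final class CachedListing {
        final List<String> children;
        final boolean complete; // true if the store holds a full ("authoritative") listing

        CachedListing(List<String> children, boolean complete) {
            this.children = children;
            this.complete = complete;
        }
    }

    static List<String> listStatus(String dir,
                                   boolean authoritativeForPath,
                                   CachedListing cached,
                                   Function<String, List<String>> s3Lister) {
        // Skip the S3 round-trip only when both conditions hold.
        if (authoritativeForPath && cached != null && cached.complete) {
            return new ArrayList<>(cached.children);
        }
        // Otherwise consult S3 and (simplified here) merge in whatever the store knows.
        TreeSet<String> merged = new TreeSet<>(s3Lister.apply(dir));
        if (cached != null) {
            merged.addAll(cached.children);
        }
        return new ArrayList<>(merged);
    }
}
```

      With this shape, the authoritativeForPath argument is the only input that the proposed prefix would change; the completeness check on the cached listing is untouched.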

      Attachments

        1. HADOOP-16396.003.patch
          28 kB
          Sean Mackrory
        2. HADOOP-16396.002.patch
          25 kB
          Sean Mackrory
        3. HADOOP-16396.001.patch
          25 kB
          Sean Mackrory

            People

              mackrorysd Sean Mackrory
              mackrorysd Sean Mackrory