HDFS-13934 (Hadoop HDFS; sub-task of HDFS-12090, Handling writes from HDFS to Provided storages)

Multipart uploaders to be created through API call to FileSystem/FileContext, not service loader


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.0
    • Fix Version/s: 3.3.1
    • Component/s: fs, fs/s3, hdfs
    • Labels: None

    Description

      The Multipart Uploaders are currently created via service loaders. This is troublesome for several reasons:

      1. HADOOP-12636, HADOOP-13323 and HADOOP-13625 highlight how the load process forces the transitive loading of dependencies. If a dependent class cannot be loaded (e.g. aws-sdk is not on the classpath), that service won't load. Without error handling around the load process, this stops any uploader from loading. Even with that error handling, the cost of the load, especially with reshaded dependencies, hurts performance (HADOOP-13138).
      2. It makes wrapping the load with any filter impossible, and stops transitive binding through viewFS, mocking, etc.
      3. It complicates security in a kerberized world. If you have an FS instance of user A, then you should be able to create an MPU instance with that user's permissions. Currently, if a service were to try to create one, you'd be looking at doAs() games around the service loading, and a more complex bind process.
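To make point 1 concrete, here is a minimal sketch of the kind of error handling a service-loader based lookup needs so that one broken provider does not stop the rest from loading. UploaderFactory and SafeFactoryLoader are hypothetical names invented for illustration; they are not the Hadoop classes. The sketch assumes the failure surfaces as a ServiceConfigurationError, which is what java.util.ServiceLoader documents for providers that cannot be located or instantiated; some linkage failures may surface differently.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.ServiceConfigurationError;
import java.util.ServiceLoader;

/** Hypothetical stand-in for a per-store uploader factory; not the Hadoop interface. */
interface UploaderFactory {
    String scheme();
}

final class SafeFactoryLoader {
    private SafeFactoryLoader() {}

    /**
     * Load every registered factory, skipping providers that fail to load,
     * for example because one of their dependencies is not on the classpath.
     */
    static List<UploaderFactory> loadAll() {
        List<UploaderFactory> loaded = new ArrayList<>();
        Iterator<UploaderFactory> it =
            ServiceLoader.load(UploaderFactory.class).iterator();
        while (true) {
            try {
                if (!it.hasNext()) {
                    break;
                }
                loaded.add(it.next());
            } catch (ServiceConfigurationError e) {
                // One broken provider must not stop the others from loading;
                // the iterator makes a best effort to continue with the next one.
                System.err.println("Skipping uploader factory: " + e);
            }
        }
        return loaded;
    }
}
```

Even with this guard in place, every provider on the classpath is still probed on lookup, which is the cost the first point describes.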

      Proposed

      1. Remove the service loader mechanism entirely.
      2. Add a createMultipartUploader(path) call to FileSystem (FS) and FileContext (FC), which will create an uploader bound to the current FS, with its permissions, DTs, etc.
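The shape of the proposal can be sketched as follows. This is an illustration under stated assumptions, not the Hadoop API: Uploader, SketchFileSystem, SketchObjectStore and SketchFilterFileSystem are hypothetical, heavily simplified types, and the choice to throw UnsupportedOperationException by default is an assumption of the sketch. Only the createMultipartUploader(path) name comes from the proposal.

```java
/** Hypothetical stand-in for an uploader; it only records what it is bound to. */
interface Uploader {
    String owner();
    String basePath();
}

/** Hypothetical, much-simplified stand-in for FileSystem; not the Hadoop class. */
abstract class SketchFileSystem {
    private final String user;

    protected SketchFileSystem(String user) {
        this.user = user;
    }

    String getUser() {
        return user;
    }

    /** Assumed default: a store without multipart support rejects the call. */
    Uploader createMultipartUploader(String path) {
        throw new UnsupportedOperationException(
            "No multipart upload support for " + path);
    }
}

/** A store that supports multipart uploads returns an uploader bound to itself. */
final class SketchObjectStore extends SketchFileSystem {
    SketchObjectStore(String user) {
        super(user);
    }

    @Override
    Uploader createMultipartUploader(String path) {
        final String owner = getUser();
        return new Uploader() {
            @Override
            public String owner() {
                return owner; // inherits the identity of the creating FS instance
            }

            @Override
            public String basePath() {
                return path;
            }
        };
    }
}

/** A wrapping filesystem (viewFS-like, or a test mock) can simply delegate. */
final class SketchFilterFileSystem extends SketchFileSystem {
    private final SketchFileSystem inner;

    SketchFilterFileSystem(SketchFileSystem inner) {
        super(inner.getUser());
        this.inner = inner;
    }

    @Override
    Uploader createMultipartUploader(String path) {
        return inner.createMultipartUploader(path);
    }
}
```

Because the uploader is obtained from an existing instance, a caller holding user A's filesystem gets an uploader with user A's identity without any doAs() wrapping, and filters or mocks can intercept the call like any other method.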

            People

              Assignee: Steve Loughran (stevel@apache.org)
              Reporter: Steve Loughran (stevel@apache.org)
              Votes: 0
              Watchers: 12
