  1. Hadoop Common
  2. HADOOP-14383

Implement FileSystem that reads from HTTP / HTTPS endpoints

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.0, 3.0.0-alpha4
    • Component/s: fs
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      We have a use case where YARN applications would like to localize resources from Artifactory. Putting the resources on HDFS is not ideal, because we would like Artifactory to manage the different versions of the resources.

      It would be nice to have something like an HttpFileSystem that implements the Hadoop FileSystem API and reads from an HTTP endpoint.

      Note that Samza has already implemented this idea on its own:

      https://github.com/apache/samza/blob/master/samza-yarn/src/main/scala/org/apache/samza/util/hadoop/HttpFileSystem.scala

      The downside of this approach is that the YARN cluster must put the Samza jar on the classpath of every NodeManager (NM).

      It would be much nicer for Hadoop to have this feature built-in.
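
      To make the request concrete, here is a minimal sketch of the read path such a file system would need. It uses only the JDK and deliberately does not extend Hadoop's FileSystem class; the class name HttpReadSketch and its methods are hypothetical and are not taken from the attached patches or from Samza's implementation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;

/** Hypothetical sketch of the read-only behaviour an HTTP-backed file system needs. */
final class HttpReadSketch {

    /** Returns true only for the two schemes this sketch is meant to serve. */
    public static boolean isSupported(URI uri) {
        String scheme = uri.getScheme();
        return "http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme);
    }

    /** Opens the resource with a GET request; the caller must close the stream. */
    public static InputStream open(URI uri) throws IOException {
        if (!isSupported(uri)) {
            throw new IllegalArgumentException("Unsupported scheme: " + uri.getScheme());
        }
        HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
        conn.setRequestMethod("GET");
        int status = conn.getResponseCode();
        if (status != HttpURLConnection.HTTP_OK) {
            conn.disconnect();
            throw new IOException("GET " + uri + " returned HTTP " + status);
        }
        return conn.getInputStream();
    }

    /** Write-side operations have no meaning against a plain HTTP endpoint. */
    public static void create(URI uri) {
        throw new UnsupportedOperationException("read-only: " + uri);
    }
}
```

      A real implementation would wrap the returned stream in the types the FileSystem API expects and would reject every mutating operation in the same way as create() does here.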

      Attachments

        1. HADOOP-14383.000.patch
          12 kB
          Haohui Mai
        2. HADOOP-14383.001.patch
          13 kB
          Haohui Mai
        3. HADOOP-14383.002.patch
          14 kB
          Haohui Mai

          People

            Assignee: Haohui Mai (wheat9)
            Reporter: Haohui Mai (wheat9)
            Votes: 0
            Watchers: 16
