  1. Hadoop HDFS
  2. HDFS-225

Expose HDFS as a WebDAV store



    • New Feature
    • Status: Open
    • Major
    • Resolution: Unresolved


      WebDAV stands for Web-based Distributed Authoring and Versioning. It is a set of extensions to the HTTP protocol that lets users collaboratively edit and manage files on a remote web server. It is often considered a replacement for NFS or Samba.
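To make "extensions to the HTTP protocol" concrete: a WebDAV directory listing is an ordinary HTTP request whose method is PROPFIND and whose body is a small XML document. The sketch below only builds such a request and sends nothing. The function name `build_propfind` and the path `/user/data` are hypothetical and chosen for illustration; they do not come from this issue or its patches.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"  # XML namespace used by WebDAV elements


def build_propfind(path, depth="1"):
    """Return (method, path, headers, body) for a WebDAV PROPFIND request.

    Depth "0" asks about the resource itself, "1" also includes its
    immediate children (roughly a directory listing).
    """
    ET.register_namespace("D", DAV_NS)
    root = ET.Element(f"{{{DAV_NS}}}propfind")
    prop = ET.SubElement(root, f"{{{DAV_NS}}}prop")
    # Ask only for a few standard live properties.
    for name in ("resourcetype", "getcontentlength", "getlastmodified"):
        ET.SubElement(prop, f"{{{DAV_NS}}}{name}")
    body = ET.tostring(root, encoding="utf-8", xml_declaration=True)
    headers = {
        "Depth": depth,
        "Content-Type": 'application/xml; charset="utf-8"',
    }
    return "PROPFIND", path, headers, body
```

The returned tuple can be handed to any HTTP client that accepts arbitrary method names, for example `http.client.HTTPConnection.request(method, path, body=body, headers=headers)`.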

      HDFS (Hadoop Distributed File System) needs a friendly file system interface. DFSShell commands are unfamiliar to most users; a mountable network drive would be more convenient. A friendly interface to HDFS would serve both casual browsing of data and bulk import/export.

      A FUSE provider for HDFS is already available ( http://issues.apache.org/jira/browse/HADOOP-17 ), but it has had scalability problems. WebDAV is a popular alternative.

      The typical licensing terms for WebDAV tools are also attractive:
      GPL for Linux client tools that Hadoop would not redistribute anyway.
      More importantly, Apache Project/Apache license for Java tools and for server components.
      This allows for a tighter integration with the HDFS code base.

      There are some interesting Apache projects that support WebDAV.
      But these are probably too heavyweight for the needs of Hadoop:
      Tomcat servlet: http://tomcat.apache.org/tomcat-4.1-doc/catalina/docs/api/org/apache/catalina/servlets/WebdavServlet.html
      Slide: http://jakarta.apache.org/slide/

      Because it is HTTP-based and "backwards-compatible" with web browser clients, the WebDAV server protocol could even be piggy-backed on the existing Web UI ports of the Hadoop name node / data nodes. WebDAV can be hosted as (Jetty) servlets. This minimizes server code bloat and avoids additional network traffic between HDFS and the WebDAV server.
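The piggy-backing idea can be sketched with one HTTP handler that serves a browser-facing page on GET and answers WebDAV methods on the same port. This is a stand-in written with Python's standard `http.server`, not the Jetty servlet the proposal has in mind; the class name `StatusAndDavHandler`, the helper `make_server`, and the hard-coded responses are all hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from xml.sax.saxutils import escape


class StatusAndDavHandler(BaseHTTPRequestHandler):
    """One handler for both the browser page and WebDAV methods."""

    def _send(self, status, body, content_type, extra=()):
        self.send_response(status)
        for key, value in extra:
            self.send_header(key, value)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # Placeholder for an existing status page that browsers see.
        self._send(200, b"<html><body>status page</body></html>", "text/html")

    def do_OPTIONS(self):
        # Advertise WebDAV class 1 support on the same port.
        self._send(200, b"", "text/plain",
                   extra=[("DAV", "1"), ("Allow", "OPTIONS, GET, PROPFIND")])

    def do_PROPFIND(self):
        # Drain the request body, then answer with one hard-coded collection.
        # A real server would consult the file system here.
        self.rfile.read(int(self.headers.get("Content-Length", 0)))
        body = (
            '<?xml version="1.0" encoding="utf-8"?>'
            '<D:multistatus xmlns:D="DAV:"><D:response>'
            f"<D:href>{escape(self.path)}</D:href>"
            "<D:propstat><D:prop><D:resourcetype><D:collection/>"
            "</D:resourcetype></D:prop>"
            "<D:status>HTTP/1.1 200 OK</D:status></D:propstat>"
            "</D:response></D:multistatus>"
        ).encode("utf-8")
        self._send(207, body, 'application/xml; charset="utf-8"')

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet


def make_server(port=0):
    """Bind to localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), StatusAndDavHandler)
```

Calling `make_server(8080).serve_forever()` would start it. Because both kinds of request share one listener, no separate WebDAV process or extra network hop is involved, which is the point made above.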

      General Clients (read-only):
      Any web browser

      Linux Clients:
      Mountable GPL davfs2 http://dav.sourceforge.net/
      FTP-like GPL Cadaver http://www.webdav.org/cadaver/

      Server Protocol compliance tests:
      A goal is for Hadoop HDFS to pass this test (minus support for Properties)

      Pure Java clients:
      DAV Explorer Apache lic. http://www.ics.uci.edu/~webdav/

      WebDAV also makes it convenient to add advanced features in an incremental fashion:
      file locking, access control lists, hard links, symbolic links.
      New WebDAV standards continue to be accepted, and WebDAV clients exist with varying levels of feature support.
      core http://www.webdav.org/specs/rfc2518.html
      ACLs http://www.webdav.org/specs/rfc3744.html
      redirects "soft links" http://greenbytes.de/tech/webdav/rfc4437.html
      BIND "hard links" http://www.webdav.org/bind/
      quota http://tools.ietf.org/html/rfc4331
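One way incremental features surface to clients is the `DAV` response header returned by OPTIONS, which lists compliance tokens. The sketch below splits such a header and labels the tokens it recognizes. The function name `parse_dav_header` and the `KNOWN_TOKENS` table are hypothetical. The table covers only class 1 and class 2 from the core specification and the `access-control` token from the ACL specification; anything else is reported as unknown rather than guessed at.

```python
# Compliance tokens this sketch recognizes, mapped to the feature they signal.
KNOWN_TOKENS = {
    "1": "core WebDAV (RFC 2518, class 1)",
    "2": "locking (RFC 2518, class 2)",
    "access-control": "access control lists (RFC 3744)",
}


def parse_dav_header(value):
    """Split a DAV header value into (known, unknown) compliance tokens.

    `known` maps each recognized token to a description; `unknown` keeps
    the remaining tokens in their original order.
    """
    tokens = [t.strip() for t in value.split(",") if t.strip()]
    known = {t: KNOWN_TOKENS[t] for t in tokens if t in KNOWN_TOKENS}
    unknown = [t for t in tokens if t not in KNOWN_TOKENS]
    return known, unknown
```

A server that starts with class 1 only could later add locking or ACLs and advertise them here, without changing how existing clients talk to it.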


        1. hadoop-496-3.patch
          43 kB
          Michael Bieniosek
        2. hadoop-496-4.patch
          38 kB
          Michael Bieniosek
        3. hadoop-496-5.tgz
          7 kB
          Michael Bieniosek
        4. hadoop-496-spool-cleanup.patch
          43 kB
          Michael Bieniosek
        5. hadoop-webdav.zip
          15 kB
          Albert Strasheim
        6. jetty-slide.xml
          2 kB
          Albert Strasheim
        7. lib.webdav.tar.gz
          2.14 MB
          Enis Soztutar
        8. screenshot-1.jpg
          54 kB
          Pete Wyckoff
        9. slideusers.properties
          0.0 kB
          Albert Strasheim
        10. webdav_wip1.patch
          41 kB
          Enis Soztutar
        11. webdav_wip2.patch
          43 kB
          Enis Soztutar

              enis Enis Soztutar
              michel_tourn Michel Tourn

