Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.2.0
    • Component/s: contrib
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      REST access to ZooKeeper using JAX-RS via Jersey.
    1. SPEC.txt
      10 kB
      Patrick Hunt
    2. SPEC.txt
      10 kB
      Patrick Hunt
    3. SPEC.txt
      10 kB
      Patrick Hunt
    4. SPEC.txt
      11 kB
      Patrick Hunt
    5. SPEC.txt
      11 kB
      Patrick Hunt
    6. rest.tar.gz
      1.70 MB
      Patrick Hunt
    7. rest.tar.gz
      1.70 MB
      Patrick Hunt
    8. rest.tar.gz
      1.70 MB
      Patrick Hunt
    9. rest.tar.gz
      1.70 MB
      Patrick Hunt

      Activity

      Patrick Hunt added a comment -

      Initial attempt at REST support, using JAX-RS via Jersey.

      Untar rest_2.tar.gz into the contrib directory. A number of files need to be added to lib, so this initial drop uses tar/gz rather than a patch.

      Currently read-only access (GET) via JSON is supported. See the README for details and a quickstart on trying it out.

      There's a Python script in src that tests the currently supported features; the README quickstart covers running this script.

      -----------
      jersey/grizzly are under CDDL

      I think I followed the rules outlined in Category B: Reciprocal Licenses
      http://www.apache.org/legal/3party.html

      specifically I only included binaries for CDDL licensed code (jersey/grizzly)
      I also added NOTICE.txt specific to these libs
      I also ran RAT and it seems happy.

      we should verify before committing to svn.

      Chris Darroch added a comment -

      Another option might be to expand on the mod_shmap and mod_socache_zookeeper httpd modules I wrote a while back. The latter maintains a ZooKeeper client connection for each httpd child process; these are shared across all HTTP requests handled by the process, so (as with the code attached to this issue, I think) ephemeral nodes aren't supported, nor are ACLs, watches, etc. The code is available under the Apache license at http://people.apache.org/~chrisd/projects/shared_map/.

      The shared-map module can harness a variety of "small object cache" providers to various parts of the URL namespace and then perform GET/PUT/DELETE against them. For the mod_socache_zookeeper provider these map to zoo_get(), zoo_set()/zoo_create(), and zoo_delete(). Nodes are created automatically when a PUT is made for a non-extant node.

      I need to refactor mod_socache_zookeeper and create a mod_zookeeper which deals with the business of starting/stopping ZooKeeper connections for each httpd child process, something like mod_dbd does for SQL DB connections. That will allow other modules to then acquire the ZK connection and make zoo_*() requests directly; mod_socache_zookeeper and mod_slotmem_zookeeper (yet to be written) then just devolve into the business of mapping URLs to specific ZK calls.

      For a REST-style interface that supported things like ACLs, sequences, stat data, etc. one could write a separate module (mod_zookeeper_rest or whatever) which supports a more complex mapping than is available through just the socache or slotmem APIs.

      However the REST interface is implemented, it would be nice, I think, to use HEAD -> zoo_exists(), GET -> zoo_get(), PUT -> zoo_set()/zoo_create(), and DELETE -> zoo_delete(). There's such a natural mapping of HTTP methods to ZK methods that it would seem to call out for use, as opposed to using bags of CGI arguments to POST requests or what have you.
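As a rough illustration of the mapping described above, here is a minimal Python sketch. It only returns the name of the ZooKeeper C client call that a given HTTP method would correspond to; the function name `zk_call_for` and the `node_exists` flag are hypothetical and are not part of any attached code.

```python
# Hypothetical lookup table: HTTP method -> name of the ZooKeeper C client call.
METHOD_TO_ZK = {
    "HEAD": "zoo_exists",
    "GET": "zoo_get",
    "DELETE": "zoo_delete",
}


def zk_call_for(method, node_exists):
    """Return the name of the ZooKeeper call an HTTP method would map to.

    PUT is the only method whose target depends on whether the node exists:
    it updates an existing node and creates a missing one.
    """
    method = method.upper()
    if method == "PUT":
        return "zoo_set" if node_exists else "zoo_create"
    try:
        return METHOD_TO_ZK[method]
    except KeyError:
        raise ValueError(f"no ZooKeeper mapping for HTTP method {method!r}")
```

Methods outside the four listed (POST, for instance) are rejected here, in keeping with the suggestion to avoid bags of arguments sent via POST.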

      Chris Darroch added a comment -

      Sorry, just to throw in a couple of additional thoughts; using httpd modules means all you need is a conventional Apache httpd instance. Requests look like (using the current minimalistic shmap/socache modules):

      GET /node1/node2 HTTP/1.0
      
      DELETE /node1/node2 HTTP/1.0
      
      PUT /node1/node3 HTTP/1.0
      Content-Length: 5
      
      12345
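
For anyone who wants to script requests like the ones above, the following is a minimal Python sketch that assembles the raw HTTP/1.0 request bytes. The helper name `build_request` is hypothetical, and the sketch assumes nothing about the server beyond the request shapes shown in this comment.

```python
def build_request(method, path, body=b""):
    """Assemble a bare HTTP/1.0 request as bytes.

    A Content-Length header is added only when a body is supplied,
    matching the PUT example in the comment above.
    """
    lines = [f"{method} {path} HTTP/1.0"]
    if body:
        lines.append(f"Content-Length: {len(body)}")
    # A blank line terminates the header section.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


get_req = build_request("GET", "/node1/node2")
delete_req = build_request("DELETE", "/node1/node2")
put_req = build_request("PUT", "/node1/node3", b"12345")
```

The resulting bytes could be written to a plain TCP socket connected to the httpd instance.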
      
      Patrick Hunt added a comment -

      Thanks for the input Chris. I think I'm going to stick with Java though.

      At the same time it would be interesting to have what you've done available in contrib. If you'd
      be willing to put together a patch for contrib we could include it. What types of use cases does
      this mod_* address? Any example of use?

      Patrick Hunt added a comment -

      A spec detailing how to bind http to zk.

      Chris Darroch added a comment -

      I took a quick look at the spec; nice! I like that it follows the spirit of the HTTP spec in the way I envisioned. A couple of minor thoughts:

      I think it would be nice to avoid POST entirely, if only because POST is so over-used that it no longer has any particular meaning except "send stuff to server". Suppose you used the trailing slash in the URI to distinguish meanings? Having hacked on mod_dir at various points this occurs to me as a possibility. For example:

      PUT /foo -> zoo_set(/foo)
      PUT /foo/ -> zoo_create(/foo)

      The same would work to overload GET:

      GET /foo -> zoo_get(/foo)
      GET /foo/ -> zoo_get_children(/foo)

      I especially like the GET cases because they correspond nicely to how URIs are mapped to files and directories.
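
The trailing-slash idea above could be sketched roughly as follows in Python. The `route` function is hypothetical, and treating the root URI "/" as a non-slash form is an arbitrary assumption made only for this sketch.

```python
# Lookup keyed on (HTTP method, URI ends with a slash?).
TRAILING_SLASH_TABLE = {
    ("PUT", False): "zoo_set",
    ("PUT", True): "zoo_create",
    ("GET", False): "zoo_get",
    ("GET", True): "zoo_get_children",
}


def route(method, uri):
    """Map (method, uri) to (zk_call_name, znode_path) using the trailing slash."""
    # Assumption: the bare root "/" counts as the non-slash form.
    has_slash = uri.endswith("/") and uri != "/"
    path = uri.rstrip("/") or "/"
    return TRAILING_SLASH_TABLE[(method.upper(), has_slash)], path
```

Note that both forms resolve to the same znode path; only the selected call differs.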

      As for running code, I'm working (slowly) on a mod_zookeeper and you'll be the first to know if I get it to a releasable state. The basic purpose is similar to mod_dbd: provide connections that other modules can use however they choose.

      Patrick Hunt added a comment -

      Chris thanks for reviewing this. In particular I think it's important both for users as well as implementors (interop) to have a clear spec.

      /foo and /foo/ were considered, but imo this is pretty hacky and prone to error (both user and potentially intermediate actors like proxies...) Perhaps my search background showing through - url canonicalization often ignores the trailing slash.

      /foo and /foo/* were also considered, this is a bit better imo (you might also support things like /foo/*/bar, yikes! )

      At the end of the day I think it makes more sense to be explicit. It's not as "clean", but it allows us to follow
      the HTTP spec, and it's more obvious what's going on when reading the client code that uses the REST service (even if you haven't read the spec, "view=children" is going to be more obvious than a trailing/non-trailing slash).
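
To illustrate the explicit approach, here is a small Python sketch that selects the view from a query parameter and canonicalizes away any trailing slash. The `select_view` name and the default view name "data" are assumptions made for this sketch; only "view=children" comes from the discussion above.

```python
from urllib.parse import parse_qs, urlsplit


def select_view(url):
    """Return (znode_path, view) for a request URL.

    The trailing slash is stripped, so it carries no meaning; the view
    is chosen solely by the explicit "view" query parameter.
    """
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    # Assumption: requests without a "view" parameter ask for the node data.
    view = query.get("view", ["data"])[0]
    path = parts.path.rstrip("/") or "/"
    return path, view
```

With this shape, a proxy that adds or drops a trailing slash does not change which operation is performed.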

      Chris Darroch added a comment -

      Certainly the use of explicit CGI arguments will improve readability; no question about that. I was mostly thinking in terms of HTTP and WebDAV (I see there's also a WebDAV-related issue, ZOOKEEPER-37) and things like WebDAV-to-JSR-170 connectors, with which I've worked a little.

      If you're thinking about WebDAV at all, that implies node creation and updates with PUT, most likely. (Arguably you might use MKCOL for node creation, I suppose.) Content hierarchies like ZooKeeper's, where nodes are both "files" (i.e., contain data) and "folders" (i.e., have child nodes) fit somewhat imperfectly with the filesystem assumptions that underlie the PUT and MKCOL methods. That's neither here nor there, I suppose.

      Patrick Hunt added a comment -

      Thanks for further input Chris, really appreciate you giving insight. I'm not an expert at REST so
      please feel free.

      WRT WebDAV: Jersey has WebDAV support in development. I think it makes sense to have
      WebDAV client support (it's in Windows, for example) at some point. I'm focusing on REST initially in
      order to cover as many dev environments (scripting in particular) and as much developer experience
      as possible (devs typically have more experience with REST than with WebDAV).

      Also, I'm not sure how well zk fits into webdav given that we store data on znodes that can
      act as either directories or files. Perhaps we just need a "phantom" file that represents the data
      associated with a non-leaf znode.

      I did think about MKCOL initially, actually I was thinking of using it even in this JIRA, but in the end
      decided not to in order to stick within the confines of std REST.

      Patrick Hunt added a comment -

      New version of the spec and updated jersey implementation.

      As before, unpack the archive in the src/contrib directory. See the README for a quickstart.

      Patrick Hunt added a comment -

      Updated version of spec and rest contrib.

      This version has a full set of tests and is ready for review, then commit to svn.

      Hadoop QA added a comment -

      -1 overall. Here are the results of testing the latest attachment
      http://issues.apache.org/jira/secure/attachment/12406569/rest.tar.gz
      against trunk revision 769079.

      +1 @author. The patch does not contain any @author tags.

      -1 tests included. The patch doesn't appear to include any new or modified tests.
      Please justify why no tests are needed for this patch.

      -1 patch. The patch command could not apply the patch.

      Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-vesta.apache.org/51/console

      This message is automatically generated.

      Patrick Hunt added a comment -

      This latest update is small:

      1) updated the response structures with uris of referenced resources
      this makes it easier for callers to access resources (don't need to cons up the
      uri themselves) for example when processing children

      2) updated the python script to cleanup and use results of 1) (nice!)

      3) updated spec to detail the new response fields.
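
As a sketch of why point 1) helps callers, the Python below walks a children response and collects the URIs directly from it. The field names (`children`, `name`, `uri`), the `example.invalid` host, and the sample payload are all hypothetical; the actual response fields are defined in the attached SPEC.txt.

```python
import json

# Hypothetical response body; real field names come from SPEC.txt.
SAMPLE_RESPONSE = """
{
  "path": "/foo",
  "uri": "http://example.invalid/foo",
  "children": [
    {"name": "a", "uri": "http://example.invalid/foo/a"},
    {"name": "b", "uri": "http://example.invalid/foo/b"}
  ]
}
"""


def child_uris(response_text):
    """Return the URI of every child listed in a children response."""
    doc = json.loads(response_text)
    # The caller never builds a URI itself; it only follows what the server sent.
    return [child["uri"] for child in doc.get("children", [])]


uris = child_uris(SAMPLE_RESPONSE)
```

A client can then issue a GET against each returned URI without knowing how the service lays out its URL namespace.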

      Mahadev konar added a comment -

      this looks pretty good

      just two minor nits

      • can you add licence header to SPEC.txt?
      • also some javadoc to ZookeeperService.java?
      Patrick Hunt added a comment -

      Added license and javadoc, these are the only changes.

      Mahadev konar added a comment -

      I just committed this. Great to have this. Thanks, Pat.

      Hudson added a comment -

      Integrated in ZooKeeper-trunk #303 (See http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/303/)
      . REST access to ZooKeeper (phunt via mahadev)


        People

        • Assignee:
          Patrick Hunt
          Reporter:
          Patrick Hunt
        • Votes:
          0
          Watchers:
          1
