Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Environment:

      Ubuntu 9.10 64bit

    • Skill Level:
      Committers Level (Medium to Hard)

      Description

      It would be nice if CouchDB had a comprehensive offering for varying levels of access to documents and databases.

      Here are some ideas:

      o User lists are stored in the database, per database.
      o Roles and role membership are stored in the database, per database.
      o ACLs are stored in the database, per database.
      o CouchDB can use ACLs to store and simplify permissions for internal functionality (manage the db, manage users, add roles, add users to roles, etc...)
      o CouchApps can take advantage of the ACLs to support login/logout and arbitrary business rules as needed.
      o A simple API can be made to conduct role, ACL and ownership checks.

      I suppose there is some theory and discussion behind deciding whether users, roles or both are stored in ACL rules. Also worth discussing is whether the checks are performed automatically by CouchDB, whether views perform the checks before emitting data, or both.

      Building all this into CouchDB would give it a mechanism for developing complex applications, including ones that must enforce privacy and other visibility constraints.

        Activity

        Chris Anderson added a comment - - edited

        The current update authorization model is very solid, and probably won't be changing.

        There are some good ideas in the ticket regarding read authorization.

        Our big missing piece is per-database reader ACLs. It's not clear if these should be stored in local docs (non-replicating) or normal docs (so they replicate.)

        My guess is that we want them to replicate, as many app installations will span nodes.

        We probably want them in a document that only admins can edit, and I don't think we want the ACLs in _design documents. So maybe we need a new type of document. How does _security/foo sound?

        Currently the db_admins role list is checked against the userCtx roles as well as the username, which means we are dealing with a flat namespace. I've got some notes about the account branch that deal with this stuff, and I'll be posting them soon as well.
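
        To make the flat-namespace concern concrete, here is a minimal sketch. The helper isDbAdmin and the sample data are hypothetical and are not CouchDB internals; the only thing taken from the comment is that one list is matched against both the user's name and the user's roles.

```javascript
// Sketch only: isDbAdmin is an illustrative helper, not CouchDB's implementation.
// One flat list is compared against the user's name AND the user's roles.
function isDbAdmin(userCtx, dbAdmins) {
  const identities = [userCtx.name, ...(userCtx.roles || [])];
  return identities.some((id) => id !== null && dbAdmins.includes(id));
}

const dbAdmins = ["editors"];

// A user who holds the role "editors" matches, as intended.
console.log(isDbAdmin({ name: "bob", roles: ["editors"] }, dbAdmins));

// A user who merely registered the name "editors" matches too,
// because names and roles share one namespace.
console.log(isDbAdmin({ name: "editors", roles: [] }, dbAdmins));
```

        Prefixing entries (for example "user:" and "role:") is one way to separate the two, but that is a design choice and not something the ticket settles.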

        Iain Sproat added a comment -

        I'd like to see authorization based replication - i.e. only being able to replicate parts of the database for which the user is authorized.

        Chris Anderson added a comment -

        We already have this, in the sense that replication uses the normal HTTP API. So if a user is not an admin, they will not be able to replicate _design documents to the target.

        Similarly, if the target has a validation function that says all docs must have a foo field, then any docs that are missing a foo field will not be replicated.

        Because CouchDB has no read-authorization model, there isn't an equivalent for reads. When we add the ability to control read access to databases, users will only be able to replicate from databases they can read.
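
        For reference, a validation function of the kind described above might look roughly like this. The signature (newDoc, oldDoc, userCtx, secObj) and the thrown forbidden object follow CouchDB's validate_doc_update convention; letting deletions through and the helper accepts are assumptions added for illustration.

```javascript
// Sketch of a validate_doc_update function for the target database.
// CouchDB rejects the write when the function throws {forbidden: ...}.
function validate(newDoc, oldDoc, userCtx, secObj) {
  // Assumption: deletions are allowed, since a deletion stub has no foo field.
  if (newDoc._deleted) {
    return;
  }
  if (!Object.prototype.hasOwnProperty.call(newDoc, "foo")) {
    throw { forbidden: "All docs must have a foo field." };
  }
}

// Hypothetical helper to try the function outside CouchDB.
function accepts(doc) {
  try {
    validate(doc, null, { name: null, roles: [] }, {});
    return true;
  } catch (err) {
    return false;
  }
}

console.log(accepts({ _id: "a", foo: 1 }));
console.log(accepts({ _id: "b" }));
```

        Since the replicator writes through the same HTTP API, a doc rejected here is simply skipped on the target.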

        Alexander Trauzzi added a comment -

        The first part of this note is an aside which could be separated out into another discussion or issue if desired...

        It's important not to conflate the functionality of reading the DB from an application point of view (JSON request coming from a browser) and reading the DB as a function of infrastructure (a scheduled replication).

        I should note that I see the use in allowing replications to happen in the context of a user per database, but not per server. This allows for local copies of DB for higher availability, but they are only populated with data the user can see.
        Ultimately, replications and DBs will have to function in different contexts. Control over what replications a server performs is still a system-wide administrative task to be done with absolute authority.

        Example:
        1. http://www.giantsocialcdcollectionsite.net is running couchdb.
        2. I make a user there.
        3. I start to populate my CD collection and share information with others.
        4. I'm going on a plane and want my CD collection list with me.
        5. I install CouchDB on my favorite OS via my favorite means.
        6. As I am an administrator on my laptop, I elect to replicate http://www.giantsocialcdcollectionsite.net.
        7. http://www.giantsocialcdcollectionsite.net requires that I authenticate so that it can filter my replication.
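
        Steps 6 and 7 could be sketched as follows. The request body uses the source, target and create_target fields of CouchDB's existing POST /_replicate endpoint; the database name "cds" and the credentials are placeholders, and the per-user filtering on the source side is the behaviour being proposed here, not something that exists today.

```javascript
// Sketch: build the JSON body that the laptop's CouchDB would receive
// on POST /_replicate. buildReplicationRequest is a hypothetical helper.
function buildReplicationRequest(sourceBase, dbName, user, password) {
  const source = new URL(dbName, sourceBase);
  // Credentials let the remote site identify the user (step 7).
  source.username = user;
  source.password = password;
  return {
    source: source.toString(),
    target: dbName,        // local database on the laptop
    create_target: true,   // create it if it does not exist yet
  };
}

const body = buildReplicationRequest(
  "http://www.giantsocialcdcollectionsite.net/",
  "cds",      // placeholder database name
  "alex",     // placeholder user
  "secret"    // placeholder password
);
console.log(JSON.stringify(body, null, 2));
```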

        I think you can see how incredibly neat that ends up being... Of course, some people will then want ways to obfuscate the _design docs, but that's getting into the nitty-gritty.

        Replication should function with indifference towards the DBs, users and CouchApps being replicated and have little to nothing to do with them. It should function behind the scenes as part of a broader, more general systems design.
        That's why things like per-DB roles and users may be necessary to prevent the server and the databases from bleeding into each other.

        To me, replication is not something my users should even have to know about. You write one app and if you need to scale out, you simply set up replication.

        =

        Anyway...
        This issue that I have made here is over the permissions of authenticated users of the DB and what data they can and cannot read:

        Right now, there is no way to secure a request to http://mycouchdb/mydb/myprivatedata on CouchDB without involving cumbersome workarounds. I am loath to think of the impact on the project if this is not addressed.

        Ben Liddicott added a comment - - edited

        See also https://github.com/pouchdb/express-pouchdb/issues/262

        0. Summary

        Convention + Filtered views + intersections = access control.

        Start with a standardised-by-convention honour-system access control scheme, which can be
        implemented client-side as an honour system, or enforced by a proxy. Add a system view
        which gives the sd for each document.

        Add filtered views (COUCHDB-707) to provide server-side support. Then add intersections
        (joins) with other filtered views so access control information is indexed.

        Finally add ability to optionally apply access control filters transparently to all views.

        1. Honour System access control.

        Extend the _security document to name groups in addition to admin and member.
        Specifically add "reader" (implicit read of all non-system documents) and "restricted"
        (no implicit access), and allow adding any number of arbitrary groups.

        Extend the userCtx to contain the property "principals" being a list of DB groups the user is in.
        The user and the roles generate the list of principals at request time. Therefore only principals
        need to be consulted for access control.

        e.g. the userCtx:

        { "db":"mailbox", "name":"bob", "roles":[ "cartographers" ], "principals":[ "user:bob", "role:cartographers", "group:project X Admins", "db:non-admin", "db:restricted" ] }

        Each document can have a new system property, _sd. This can be consulted to discover desired access.
        An absent _sd means anyone may access. An empty _sd means only admins. Otherwise as described by the sd.
        Levels are read-only, and full access (including delete).

        _sd:

        { "user:bob":"rw", "role:cartographers":"r", "group:project X Admins":"rw" }

        We now have a client-side honour system access control.
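
        A client (or a proxy) could apply the rules above along these lines. This is only a sketch: canAccess is a hypothetical helper, and "db:admin" as the name of the admin principal is an assumption, since the proposal only shows "db:non-admin".

```javascript
// Sketch of the honour-system check described in section 1.
// level is "r" for read or "w" for write/delete.
function canAccess(doc, userCtx, level) {
  const principals = userCtx.principals || [];
  // Assumption: admins carry a "db:admin" principal and may always access.
  if (principals.includes("db:admin")) {
    return true;
  }
  const sd = doc._sd;
  // An absent _sd means anyone may access.
  if (sd === undefined) {
    return true;
  }
  // Otherwise some principal must be granted the level. An empty _sd
  // grants nothing, so only admins get through.
  return principals.some(
    (p) => typeof sd[p] === "string" && sd[p].includes(level)
  );
}

const mapDoc = {
  _id: "map1",
  _sd: { "user:bob": "rw", "role:cartographers": "r", "group:project X Admins": "rw" },
};
const alice = { name: "alice", principals: ["user:alice", "role:cartographers"] };

console.log(canAccess(mapDoc, alice, "r")); // granted via role:cartographers
console.log(canAccess(mapDoc, alice, "w")); // no principal of alice has "w"
```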

        2. Allow filter functions to be used for views and for document reads.
        This would allow users to specify a filter function which implemented
        access control to see only parts of the view they wish to see.

        The filter function would be applied before the reduce step. Queries without
        a filter function would proceed as at present. Queries with a filter function
        will

        On the honour system, this would be slow for views because it requires
        reading the document to retrieve the sd, unless the sd were output in the value.

        Need to ensure multiple filter functions can be used together.
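
        As an in-memory sketch of what section 2 asks for: filters run on the map rows before the reduce step, and several filters can be combined. The names queryView, readableByBob and sum, and the row shape with the sd carried in the value, are all illustrative assumptions and not an existing CouchDB API.

```javascript
// Sketch: apply zero or more filter functions to view rows, then reduce.
// With no filters the query behaves as it does at present.
function queryView(rows, filters = [], reduce = null) {
  // A row survives only if every filter accepts it.
  const visible = rows.filter((row) => filters.every((f) => f(row)));
  return reduce ? reduce(visible.map((row) => row.value)) : visible;
}

// The sd is emitted in the value so the filter need not load the document.
const rows = [
  { id: "a", key: "2010", value: { amount: 5, sd: { "user:bob": "rw" } } },
  { id: "b", key: "2010", value: { amount: 7, sd: { "user:eve": "rw" } } },
];

const readableByBob = (row) => "user:bob" in row.value.sd;
const sum = (values) => values.reduce((acc, v) => acc + v.amount, 0);

console.log(queryView(rows, [readableByBob], sum)); // sum over bob's rows only
console.log(queryView(rows, [], sum));              // unfiltered sum
```

        Filtering before the reduce matters: a reduce computed over all rows would leak totals that include documents the user cannot see.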

        3. Allow intersection queries when querying views. (Yes, a bit like a join.)

        Specify multiple views (potentially each with their own filter).

        The output of the first view is the only one returned, but it is filtered by only accepting the document IDs
        which are produced by the other views.
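
        The intersection in section 3 might behave like this sketch, where intersectViews is a hypothetical helper working on already-fetched rows of the form {id, key, value}:

```javascript
// Sketch: return rows of the first view, keeping only those whose
// document ID also appears in every other view's output.
function intersectViews(primaryRows, ...otherViews) {
  const idSets = otherViews.map(
    (viewRows) => new Set(viewRows.map((row) => row.id))
  );
  return primaryRows.filter((row) => idSets.every((ids) => ids.has(row.id)));
}

const byArtist = [
  { id: "cd1", key: "Artist A", value: null },
  { id: "cd2", key: "Artist B", value: null },
  { id: "cd3", key: "Artist C", value: null },
];
const readable = [
  { id: "cd1", key: "user:bob", value: "rw" },
  { id: "cd3", key: "user:bob", value: "r" },
];

console.log(intersectViews(byArtist, readable).map((row) => row.id));
```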

        4. Create a new system view _doc_sd which emits an entry for each document _sd entry.
        If the _sd is empty, it will emit the implicit equivalents
        "documentID","db:readers","r"

        5. Now we can (if configured) implement access control by implicitly querying _doc_sd
        for intersection with the userCtx.groups as keys.
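
        Sections 4 and 5 together could be sketched like this. Everything here is hypothetical: docSdMap stands in for the proposed _doc_sd system view (a real CouchDB map function would call a global emit instead of receiving it as a parameter), and the sketch reads "empty" in section 4 as "absent" so that it stays consistent with section 1, where an empty _sd means admins only.

```javascript
// Sketch of a map function for the proposed _doc_sd view.
// Emits one row per _sd entry: key = principal, value = access level.
function docSdMap(doc, emit) {
  if (doc._sd === undefined) {
    // Assumption: no _sd at all yields the implicit reader entry.
    emit("db:readers", "r");
    return;
  }
  for (const [principal, level] of Object.entries(doc._sd)) {
    emit(principal, level);
  }
}

// Build the view rows in memory, attaching the document ID as CouchDB would.
function buildDocSdIndex(docs) {
  const index = [];
  for (const doc of docs) {
    docSdMap(doc, (key, value) => index.push({ id: doc._id, key, value }));
  }
  return index;
}

// Section 5: query the index with the user's principals as keys and
// collect the IDs of documents the user may read.
function readableIds(index, principals) {
  const keys = new Set(principals);
  return new Set(
    index
      .filter((row) => keys.has(row.key) && row.value.includes("r"))
      .map((row) => row.id)
  );
}

const docs = [
  { _id: "map1", _sd: { "role:cartographers": "r" } },
  { _id: "note1", _sd: { "user:eve": "rw" } },
  { _id: "public1" },
  { _id: "locked1", _sd: {} },
];
const ids = readableIds(buildDocSdIndex(docs), [
  "user:bob",
  "role:cartographers",
  "db:readers",
]);
console.log([...ids]);
```

        The resulting ID set is what an intersection query would then use to filter any other view.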

        6. Write access control can, as now, be implemented by the validation function.

        7. Protection of namespaces from pollution by downlevel users can be provided by
        validation functions. These would consult a _namespace_sd document which would contain
        prefix to _sd mappings.
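
        A rough sketch of section 7, with loud assumptions: the shape of _namespace_sd (a plain object mapping ID prefixes to sd objects), the longest-prefix-wins rule, and passing the mapping in as an argument are all invented for illustration. Today's validation functions cannot load other documents, so this is the proposed behaviour, not something that works now.

```javascript
// Sketch: find the sd that governs a document ID, longest prefix first.
function namespaceSdFor(docId, namespaceSd) {
  const prefixes = Object.keys(namespaceSd).filter((p) => docId.startsWith(p));
  if (prefixes.length === 0) {
    return undefined; // ID is outside every protected namespace
  }
  const longest = prefixes.reduce((a, b) => (b.length > a.length ? b : a));
  return namespaceSd[longest];
}

// Sketch of the check a validation function would perform on each write.
function validateNamespace(newDoc, userCtx, namespaceSd) {
  const sd = namespaceSdFor(newDoc._id, namespaceSd);
  if (sd === undefined) {
    return;
  }
  const allowed = (userCtx.principals || []).some((p) =>
    (sd[p] || "").includes("w")
  );
  if (!allowed) {
    throw { forbidden: "No write access to this namespace." };
  }
}

// Hypothetical content of a _namespace_sd document.
const namespaceSd = {
  "config/": { "group:project X Admins": "rw" },
  "maps/": { "role:cartographers": "rw" },
};
const bob = { name: "bob", principals: ["user:bob", "role:cartographers"] };

validateNamespace({ _id: "maps/europe" }, bob, namespaceSd); // passes silently
```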


          People

          • Assignee:
            Unassigned
            Reporter:
            Alexander Trauzzi
          • Votes:
            3
            Watchers:
            4

            Dates

            • Created:
              Updated:

              Development