CASSANDRA-4898

Authentication provider in Cassandra itself

    Details

      Description

      I've been working on an implementation for both IAuthority2 and IAuthenticator that uses Cassandra itself to store the necessary credentials. I'm planning on open sourcing this shortly.

      Is there any interest in this? It tries to provide reasonable security, for example by storing passwords with PBKDF2 using a configurable iteration count and by managing all the rights available in IAuthority2.
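As a rough illustration of the PBKDF2 storage mentioned above, here is a minimal Java sketch. It is not the code from the proposed implementation: the class name `Pbkdf2Sketch`, the colon-separated `iterations:keyBits:salt:hash` layout, the Base64 encoding, and the salt and key sizes are all assumptions made for this example. Only the JDK's `SecretKeyFactory` and `PBEKeySpec` are real APIs.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Hypothetical helper, not part of Cassandra or the proposed provider.
final class Pbkdf2Sketch {
    static final int SALT_BYTES = 16;  // assumed salt size
    static final int KEY_BITS = 160;   // assumed derived-key length

    // Derives a key from the password with the given salt and iteration count.
    static byte[] derive(char[] password, byte[] salt, int iterations, int keyBits) {
        PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, keyBits);
        try {
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1")
                                   .generateSecret(spec)
                                   .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 is not available", e);
        } finally {
            spec.clearPassword();
        }
    }

    // Produces "iterations:keyBits:salt:hash" (assumed layout) with a fresh random salt.
    static String hash(char[] password, int iterations) {
        byte[] salt = new byte[SALT_BYTES];
        new SecureRandom().nextBytes(salt);
        byte[] key = derive(password, salt, iterations, KEY_BITS);
        Base64.Encoder enc = Base64.getEncoder();
        return iterations + ":" + KEY_BITS + ":"
             + enc.encodeToString(salt) + ":" + enc.encodeToString(key);
    }

    // Re-derives the key using the stored parameters and compares in constant time.
    static boolean verify(char[] password, String stored) {
        String[] parts = stored.split(":");
        if (parts.length != 4) {
            return false;
        }
        int iterations = Integer.parseInt(parts[0]);
        int keyBits = Integer.parseInt(parts[1]);
        byte[] salt = Base64.getDecoder().decode(parts[2]);
        byte[] expected = Base64.getDecoder().decode(parts[3]);
        return MessageDigest.isEqual(expected, derive(password, salt, iterations, keyBits));
    }
}
```

Because the iteration count is stored next to each hash, the configured count can be raised later without invalidating existing credentials.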

      My main goal isn't security or confidentiality of the data; it's that I don't want multiple consumers of the cluster to accidentally break things. Only certain users can write data; others can read it back out and process it further.

      I'm planning on releasing this soon under an open source license (probably the same as Cassandra itself). Would there be interest in incorporating it as a new reference implementation instead of the properties file implementation perhaps? Or can I better maintain it separately? I would love if people from the community would want to review it, since I have been dabbling in the Cassandra source code only for a short while now.

      During the development of this I've encountered a few bumps and I wonder whether they could be addressed or not.

      = Moment when validateConfiguration() runs =

      Is there a deliberate reason that validateConfiguration() is executed before all information about keyspaces, column families etc. is available? In the current form I therefore can't validate whether column families etc. are available for authentication since they aren't loaded yet.

      I wanted to use this to make bootstrapping relatively easy. My approach would be to enable authentication only if the required keyspace is available. This allows you to configure the cluster, import the authentication data for an admin user to bootstrap further, and then restart every node in the cluster.

      Basically the questions here are, can the moment when validateConfiguration() runs for an authentication provider be changed? Is this approach to bootstrapping reasonable or do people have better ideas?

      = AbstractReplicationStrategy has package visible constructor =

      I've added a strategy that basically says that data should be available on all nodes. The amount of data used for authentication is very limited. Replicating it to every node is therefore not a problem, and it means every node has all the data available locally for verifying requests.

      I wanted to put this strategy into its own package inside the authentication module, but since the constructor of AbstractReplicationStrategy has no explicit visibility modifier, it's package-private and only available inside the same package.

      I'm not sure whether implementing a strategy to replicate data to all nodes is a sane idea and whether my implementation of this strategy is correct. What do you people think of this? Would people want to review the implementation?

        Activity

        Jonathan Ellis added a comment -

        IMO this would be a good candidate for a new reference implementation.

        Note that IAuth2 is kind of broken (oops) and we'll be redoing that in CASSANDRA-4874 and CASSANDRA-4875. I imagine most of your work though will be around internals. (I would suggest using QueryProcessor.processInternal to keep the pain level down – see examples in SystemTable.)

        > I'm not sure whether implementing a strategy to replicate data to all nodes is a sane idea

        This is overkill.

        Suggest instead creating a system_auth keyspace, default to SimpleStrategy/RF=1. Users can then change this using normal tools.

        Jonathan Ellis added a comment -

        > Is there a deliberate reason that validateConfiguration() is executed before all information about keyspaces, column families etc. is available?

        I think the idea there is more "validate that this instance has been configured in a sane way," not "validate CF/KS permissions."

        Dirkjan Bussink added a comment -

        Well, on the validateConfiguration() issue, the thing here is more that it might be useful to check whether the KS and CF's for authentication exist.

        On the other hand, if an implementation like this would be provided as standard, there could always be a setup of default authentication credentials.

        Regarding the all node strategy, how exactly does this work? Are permissions cached for a certain client? I could imagine it being a relatively big overhead if the permission check needs to be made for example for each single write over the same connection with the same credentials.

        Currently I'm using StorageProxy.read (reading password and permissions) and StorageProxy.mutate (handling grants), and I'm not going crazy because of that. These are, btw, the column families I'm using atm (with the custom strategy, though of course that isn't necessary): https://gist.github.com/7231bc6d23331ecfb07b

        The only thing that needs some explanation there is probably the serialization format. The password is a combination of PBKDF2 iterations, key length, salt and hash. The permissions value is a long composed of two 32-bit masks, one for the grant rights and one for the access rights. Since Permission explicitly states that the ordinal values can be used, I chose to do that.
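To make the permission encoding described above concrete, here is a minimal Java sketch of packing two 32-bit masks into one long using enum ordinals. The `Perm` enum is a stand-in for Cassandra's Permission, and its constants, the class name `PermissionMaskSketch`, and the choice to put grant rights in the high half are assumptions for illustration. The comment does not say which half holds which mask.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch; the real serializer in the proposed provider may differ.
final class PermissionMaskSketch {
    // Stand-in for Cassandra's Permission enum; the constants are illustrative.
    enum Perm { READ, WRITE, CREATE, ALTER, DROP }

    // Sets one bit per permission, at the position given by its ordinal.
    static int toMask(Set<Perm> perms) {
        int mask = 0;
        for (Perm p : perms) {
            mask |= 1 << p.ordinal();
        }
        return mask;
    }

    // Rebuilds the permission set from a 32-bit mask.
    static EnumSet<Perm> fromMask(int mask) {
        EnumSet<Perm> out = EnumSet.noneOf(Perm.class);
        for (Perm p : Perm.values()) {
            if ((mask & (1 << p.ordinal())) != 0) {
                out.add(p);
            }
        }
        return out;
    }

    // Assumed layout: grant rights in the high 32 bits, access rights in the low 32 bits.
    static long pack(Set<Perm> grant, Set<Perm> access) {
        return ((long) toMask(grant) << 32) | (toMask(access) & 0xFFFFFFFFL);
    }

    static EnumSet<Perm> grantRights(long packed) {
        return fromMask((int) (packed >>> 32));
    }

    static EnumSet<Perm> accessRights(long packed) {
        return fromMask((int) packed);
    }
}
```

One consequence of an ordinal-based encoding is that reordering or removing enum constants changes the meaning of stored values, so it relies on the ordinals staying stable.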

        Jonathan Ellis added a comment -

        > Well, on the validateConfiguration() issue, the thing here is more that it might be useful to check whether the KS and CF's for authentication exist.

        I'd rather add hooks to dropping keyspaces so we can clear things out then.

        > I could imagine it being a relatively big overhead if the permission check needs to be made

        We're discussing adding permission caching in CASSANDRA-4295, your help would be welcome.
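The overhead concern can be pictured with a small time-based cache. This is only a sketch of the general idea and not the design from CASSANDRA-4295: the class name `PermissionCacheSketch`, the loader callback, and the injectable clock are assumptions for illustration. The validity window plays the role that a setting such as permissions_validity_in_ms would play.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical cache: reloads an entry once it is older than validityMs.
final class PermissionCacheSketch<K, V> {
    private static final class Entry<V> {
        final V value;
        final long loadedAt;

        Entry(V value, long loadedAt) {
            this.value = value;
            this.loadedAt = loadedAt;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long validityMs;
    private final Function<K, V> loader;  // e.g. a read from the auth keyspace
    private final LongSupplier clock;     // injectable so expiry can be tested

    PermissionCacheSketch(long validityMs, Function<K, V> loader, LongSupplier clock) {
        this.validityMs = validityMs;
        this.loader = loader;
        this.clock = clock;
    }

    // Returns the cached value, reloading it when it is missing or expired.
    // Concurrent callers may occasionally load the same key twice; that is
    // acceptable for a sketch but worth tightening in real code.
    V get(K key) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry == null || now - entry.loadedAt >= validityMs) {
            entry = new Entry<>(loader.apply(key), now);
            entries.put(key, entry);
        }
        return entry.value;
    }
}
```

With a cache like this, repeated writes over the same connection hit the backing store at most once per validity window, at the cost of revoked permissions remaining effective until the entry expires.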

        > These are btw the column families I'm using atm

        Should really update this to CQL3 (see http://www.datastax.com/dev/blog/cql3-for-cassandra-experts and http://www.datastax.com/dev/blog/thrift-to-cql3). Then you can use Map or Set collections for the permissions instead of a custom serializer.
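As a sketch of what the collection-based approach could look like, the following Java snippet pairs a hypothetical CQL3 table definition with the conversion between an enum set and the set of names that would be stored in a `set<text>` column. The table and column names, the `Perm` enum, and the class name `PermissionSetSketch` are all assumptions for illustration and are not taken from the linked gist or from Cassandra itself.

```java
import java.util.EnumSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of storing permissions as a CQL3 set instead of a bit mask.
final class PermissionSetSketch {
    // Stand-in for Cassandra's Permission enum; the constants are illustrative.
    enum Perm { READ, WRITE, CREATE, ALTER, DROP }

    // Assumed schema: one row per (user, resource) holding the granted permission names.
    static final String CREATE_TABLE =
          "CREATE TABLE permissions ("
        + " username text,"
        + " resource text,"
        + " permissions set<text>,"
        + " PRIMARY KEY (username, resource))";

    // Converts permissions to the names that would be written to the set column.
    static Set<String> toNames(Set<Perm> perms) {
        Set<String> names = new TreeSet<>();
        for (Perm p : perms) {
            names.add(p.name());
        }
        return names;
    }

    // Converts names read from the set column back into permissions.
    static EnumSet<Perm> fromNames(Set<String> names) {
        EnumSet<Perm> out = EnumSet.noneOf(Perm.class);
        for (String n : names) {
            out.add(Perm.valueOf(n));
        }
        return out;
    }
}
```

Storing names rather than ordinals keeps the stored data readable from ordinary query tools and avoids depending on the order of the enum constants.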

        Dirkjan Bussink added a comment -

        I've published the current work here:

        https://github.com/nedap/cassandra-auth

        The biggest issue with switching to CQL3 internally at the moment is that the processInternal API is different in 1.1.x and 1.2.x / trunk it seems. I haven't switched this over to CQL3 for that reason, since our current cluster that we run is running 1.1.5 at the moment.

        The same goes for moving it to using Map or Set collections.

        I don't think this would be much work. I'll probably update it anyway when we switch to 1.2.x, which probably won't be long after the release.

        Jonathan Ellis added a comment -

        One wrinkle I forgot about: CASSANDRA-4648 made executeInternal local-only. So we'd need to re-add a method to execute a CQL query that may have non-local answers. Should be pretty easy to pull that code out of the pre-4648 class though.

        Aleksey Yeschenko added a comment - edited

        https://github.com/iamaleksey/cassandra/compare/4898

        Besides adding CQL3-based IAuthenticator and IAuthorizer implementations, this branch also:

        • removes SimpleAuth examples since they are no longer actually good examples after the interfaces changed
        • makes small backwards-compatible changes to IAuthenticator and IAuthorizer declared exceptions
        • drops limitations on protectedResources except for schema modification

        What I'm not 100% sure about is naming - especially for the authorizer. The only thing that matters about IAuthorizer implementations is where they keep permissions data, so CassandraAuthorizer makes sense, but I'm afraid it's too generic. Haven't come up with a better name though.

        What needs to be done - after/if the names are confirmed:

        • add an entry to NEWS about the new implementations and about the alterable system_auth ks (CASSANDRA-5112)
        • add a NEWS entry for permissions_validity_in_ms (CASSANDRA-4295)
        • possibly a comment in cassandra.yaml about the available implementations?
        • dtests for the whole thing - now that we've got working baked-in implementations, we can and should test all the auth-related cql3 statements
        Jonathan Ellis added a comment -

        LGTM. Okay with the class names.

        Aleksey Yeschenko added a comment -

        Thanks, committed.

        Created CASSANDRA-5258 in order to not forget about the dtests.


          People

          • Assignee:
            Aleksey Yeschenko
            Reporter:
            Dirkjan Bussink
            Reviewer:
            Jonathan Ellis
