This issue relates to providing document-level access control for Solr index data.
A related JIRA issue is: SOLR-1834. I thought it would be best if I created a separate JIRA issue, rather than tack on to SOLR-1834, as the approach here is somewhat different, and I didn't want to confuse things or step on Anders' good work.
There have been lots of discussions about document-level access in Solr using LCF, custom comoponents and the like. Access Control is one of those subjects that quickly spreads to lots of 'ratholes' to dive into. Even if not everyone agrees with the approaches taken here, it does, at the very least, highlight some of the salient issues surrounding access control in Solr, and will hopefully initiate a healthy discussion on the range of related requirements, with the aim of finding the optimum balance of requirements.
The approach taken here is document and schema agnostic - i.e. the access control is independant of what is or will be in the index, and no schema changes are required. This version doesn't include LDAP/AD integration, but could be added relatively easily (see Ander's very good work on this in SOLR-1834). Note that, at the moment, this version doesn't deal with /update, /replication etc., it's currently a /select thing at the moment (but it could be used for these).
This approach uses a SearchComponent subclass called SolrACLSecurity. Its configuration is read in from solrconfig.xml in the usual way, and the allow/deny configuration is split out into a config file called acl.xml.
acl.xml defines a number of users and groups (and 1 global for 'everyone'), and assigns 0 or more <acl-allow> and/or <acl-deny> elements.
When the SearchComponent is initialized, user objects are created and cached, including an 'allow' list and a 'deny' list.
When a request comes in, these lists are used to build filter queries ('allows' are OR'ed and 'denies' are NAND'ed), and then added to the query request.
Because the allow and deny elements are simply subsearch queries (e.g. <acl-allow>somefield:secret</acl-allow>, this mechanism will work on any stored data that can be queried, including already existing data.
One of the sticky problems with access control is how to determine who's asking for data. There are many approaches, and to stay in the generic vein the current mechanism uses http parameters for this.
For an initial search, a client includes a username=somename parameter and a hash=pwdhash hash of its password. If the request sends the correct parameters, the search is granted and a uuid parameter is returned in the response header. This uuid can then be used in subsequent requests from the client. If the request is wrong, the SearchComponent fails and will increment the user's failed login count (if a valid user was specified). If this count exceeds the configured lockoutThreshold, no further requests are granted until the lockoutTime has elapsed.
This mechanism protects against some types of attacks (e.g. CLRF, dictionary etc.), but it really needs container HTTPS as well (as would most other auth implementations). Incorporating SSL certificates for authentication and making the authentication mechanism pluggable would be a nice improvement (i.e. separate authentication from access control).
Another issue is how internal searchers perform autowarming etc. The solution here is to use a local key called 'SolrACLSecurityKey'. This key is local and [should be] unique to that server. firstSearcher, newSearcher et al then include this key in their parameters so they can perform autowarming without constraint. Again, there are likely many ways to achieve this, this approach is but one.
The attached rar holds the source and associated configuration. This has been tested on the 1.4 release codebase (search in the attached solrconfig.xml for SolrACLSecurity to find the relevant sections in this file).
I hope this proves helpful for people who are looking for this sort of functionality in Solr, and more generally to address how such a mechanism could ultimately be integrated into a future Solr release.