Details

    • Type: Sub-task Sub-task
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.2
    • Component/s: None
    • Labels:
      None

      Description

      Solr needs an interface that makes it easy for different authorization systems to be plugged into it. Here's what I plan on doing:

      Define an interface SolrAuthorizationPlugin with one single method isAuthorized. This would take in a SolrRequestContext object and return an SolrAuthorizationResponse object. The object as of now would only contain a single boolean value but in the future could contain more information e.g. ACL for document filtering etc.

      The reason why we need a context object is so that the plugin doesn't need to understand Solr's capabilities e.g. how to extract the name of the collection or other information from the incoming request as there are multiple ways to specify the target collection for a request. Similarly request type can be specified by qt or /handler_name.

      Flow:
      Request -> SolrDispatchFilter -> isAuthorized(context) -> Process/Return.

      public interface SolrAuthorizationPlugin {
        public SolrAuthorizationResponse isAuthorized(SolrRequestContext context);
      }
      
      public  class SolrRequestContext {
        UserInfo; // Will contain user context from the authentication layer.
        HTTPRequest request;
        Enum OperationType; // Correlated with user roles.
        String[] CollectionsAccessed;
        String[] FieldsAccessed;
        String Resource;
      }
      
      
      public class SolrAuthorizationResponse {
        boolean authorized;
      
        public boolean isAuthorized();
      }
      

      User Roles:

      • Admin
      • Collection Level:
      • Query
      • Update
      • Admin

      Using this framework, an implementation could be written for specific security systems e.g. Apache Ranger or Sentry. It would keep all the security system specific code out of Solr.

      1. SOLR-7275.patch
        32 kB
        Anshum Gupta
      2. SOLR-7275.patch
        35 kB
        Anshum Gupta
      3. SOLR-7275.patch
        27 kB
        Anshum Gupta
      4. SOLR-7275.patch
        27 kB
        Anshum Gupta
      5. SOLR-7275.patch
        26 kB
        Anshum Gupta
      6. SOLR-7275.patch
        26 kB
        Anshum Gupta
      7. SOLR-7275.patch
        25 kB
        Noble Paul
      8. SOLR-7275.patch
        25 kB
        Anshum Gupta
      9. SOLR-7275.patch
        24 kB
        Anshum Gupta
      10. SOLR-7275.patch
        24 kB
        Anshum Gupta
      11. SOLR-7275.patch
        22 kB
        Anshum Gupta
      12. SOLR-7275.patch
        21 kB
        Anshum Gupta
      13. SOLR-7275.patch
        18 kB
        Anshum Gupta
      14. SOLR-7275.patch
        19 kB
        Anshum Gupta
      15. SOLR-7275.patch
        18 kB
        Anshum Gupta
      16. SOLR-7275.patch
        16 kB
        Anshum Gupta
      17. SOLR-7275.patch
        20 kB
        Anshum Gupta

        Issue Links

          Activity

          Hide
          Jan Høydahl added a comment -

          Linking up SOLR-7236, which is an attempt to gather planning of security interfaces for Solr in an umbrella issue. This topic deserves broad high-level discussions before starting a custom implementation.

          Show
          Jan Høydahl added a comment - Linking up SOLR-7236 , which is an attempt to gather planning of security interfaces for Solr in an umbrella issue. This topic deserves broad high-level discussions before starting a custom implementation.
          Hide
          Anshum Gupta added a comment -

          Jan, I am working on what I've described above as I think it's a straight framework that would allow for easy pluggability of multiple authorization implementations and would keep the implementation details for the security layer (Sentry/Ranger/...) outside of Solr for the most part.

          Show
          Anshum Gupta added a comment - Jan, I am working on what I've described above as I think it's a straight framework that would allow for easy pluggability of multiple authorization implementations and would keep the implementation details for the security layer (Sentry/Ranger/...) outside of Solr for the most part.
          Hide
          Dennis Gove added a comment -

          I like this concept but I think the response can be expanded to add a bit more functionality. It would be nice if the pluggable security layer could respond in such a way as to not wholly reject a request but to instead restrict what is returned from a request. It could accomplish this by providing additional filters to apply to a request.

          public class SolrAuthorizationResponse {
            boolean authorized;
            String additionalFilterQuery;
          
            ...
          }
          

          By adding additionalFilterQuery, this would give the security layer an opportunity to say, "yup, you're authorized but you can't see records matching this filter" or "yup, you're authorized but you can only see records also matching this filter". It provides a way to add fine-grained control of data access but keep that control completely outside of SOLR (as it would live in the pluggable security layer).

          Additionally, it allows the security layer to add fine-grained control *without notifying the user they are being restricted* as this lives wholly in the SOLR <---> security layer communication. There are times when telling the user their request was rejected due to it returning records they're not privileged to see actually gives the user some information you may not want them to know - the fact that these restricted records even exist. Instead, by adding filters and just not returning records the user isn't privileged for, the user is non-the-wiser that they were restricted at all.

          Show
          Dennis Gove added a comment - I like this concept but I think the response can be expanded to add a bit more functionality. It would be nice if the pluggable security layer could respond in such a way as to not wholly reject a request but to instead restrict what is returned from a request. It could accomplish this by providing additional filters to apply to a request. public class SolrAuthorizationResponse { boolean authorized; String additionalFilterQuery; ... } By adding additionalFilterQuery, this would give the security layer an opportunity to say, "yup, you're authorized but you can't see records matching this filter" or "yup, you're authorized but you can only see records also matching this filter". It provides a way to add fine-grained control of data access but keep that control completely outside of SOLR (as it would live in the pluggable security layer). Additionally, it allows the security layer to add fine-grained control * without notifying the user they are being restricted * as this lives wholly in the SOLR <---> security layer communication. There are times when telling the user their request was rejected due to it returning records they're not privileged to see actually gives the user some information you may not want them to know - the fact that these restricted records even exist. Instead, by adding filters and just not returning records the user isn't privileged for, the user is non-the-wiser that they were restricted at all.
          Hide
          Noble Paul added a comment -

          Yes. That is totally possible

          Show
          Noble Paul added a comment - Yes. That is totally possible
          Hide
          Jan Høydahl added a comment -

          By adding additionalFilterQuery, this would give the security layer an opportunity to say, "yup, you're authorized but you can't see records matching this filter"

          It does not feel clean to mix document level security filtering with authorization. The authz filter should either grant or deny a certain user access to perform a certain operation on a certain resource. There are existing means of filtering docs based on user, which will live perfectly fine in parallel with any new authz solution. Let's keep things simple.

          Show
          Jan Høydahl added a comment - By adding additionalFilterQuery, this would give the security layer an opportunity to say, "yup, you're authorized but you can't see records matching this filter" It does not feel clean to mix document level security filtering with authorization. The authz filter should either grant or deny a certain user access to perform a certain operation on a certain resource. There are existing means of filtering docs based on user, which will live perfectly fine in parallel with any new authz solution. Let's keep things simple.
          Hide
          Anshum Gupta added a comment - - edited

          Dennis Gove that's the intention behind returning a Response object instead of a plain boolean. As I'm not adding any of that in this phase, I didn't add those variables but yes, the intention is to use this object to return ACLs etc. from the authorization layer.

          I have the first patch ready, but I'm just in the process of cleaning it up a bit. I'll try and post it over the weekend.

          Show
          Anshum Gupta added a comment - - edited Dennis Gove that's the intention behind returning a Response object instead of a plain boolean. As I'm not adding any of that in this phase, I didn't add those variables but yes, the intention is to use this object to return ACLs etc. from the authorization layer. I have the first patch ready, but I'm just in the process of cleaning it up a bit. I'll try and post it over the weekend.
          Hide
          Anshum Gupta added a comment -

          Here's the first patch. This introduces the following:
          1. SolrAuthorizationPlugin interface - The interface that would need to be implemented for custom security plugins e.g. Ranger/Sentry/...
          2. Configuration mechanism for security - /security.json in zoo keeper.
          3. SolrRequestContext - HttpHeader, UserPrincipal etc. I'm working on extracting more context from the request e.g. Collection, handler, etc. and populate those in here.

          Usage:
          To try this out, you need to add /security.json node in zk, with the following data format

          {"class":"solr.SimpleSolrAuthorizationPlugin"}
          

          Also, access rules (black list for now) goes into /simplesecurity.json

          {"blacklist":["user1","user2"]}
          

          This uses the http param (uname) to filter out/authorize requests.
          The following request would then start returning 401:
          http://localhost:8983/solr/techproducts/select?q=*:*&wt=json&uname=user1

          NOTE: The authorization plugin doesn't really do anything about inter-shard communication (and doesn't propagate the user principal), it can be used only for black listing right now. You could write a plugin that sets up IP based rules or I could add those rules to the plugin that would be shipped out of the box to support white listing of User info + IP information.

          To summarize, I'm still working on the following:
          1. Extract more information and populate the context object.
          2. Have a watch on the access rules file. I'm still debating between a watch vs having an explicit RELOAD like call that updates the access rules.
          3. Support IP and/or user based whitelist.

          Show
          Anshum Gupta added a comment - Here's the first patch. This introduces the following: 1. SolrAuthorizationPlugin interface - The interface that would need to be implemented for custom security plugins e.g. Ranger/Sentry/... 2. Configuration mechanism for security - /security.json in zoo keeper. 3. SolrRequestContext - HttpHeader, UserPrincipal etc. I'm working on extracting more context from the request e.g. Collection, handler, etc. and populate those in here. Usage: To try this out, you need to add /security.json node in zk, with the following data format { "class" : "solr.SimpleSolrAuthorizationPlugin" } Also, access rules (black list for now) goes into /simplesecurity.json { "blacklist" :[ "user1" , "user2" ]} This uses the http param (uname) to filter out/authorize requests. The following request would then start returning 401: http://localhost:8983/solr/techproducts/select?q=*:*&wt=json&uname=user1 NOTE: The authorization plugin doesn't really do anything about inter-shard communication (and doesn't propagate the user principal), it can be used only for black listing right now. You could write a plugin that sets up IP based rules or I could add those rules to the plugin that would be shipped out of the box to support white listing of User info + IP information. To summarize, I'm still working on the following: 1. Extract more information and populate the context object. 2. Have a watch on the access rules file. I'm still debating between a watch vs having an explicit RELOAD like call that updates the access rules. 3. Support IP and/or user based whitelist.
          Hide
          Noble Paul added a comment - - edited

          A few comments

          • Do the initialization and other things of Authorization plugin in CoreContainer
          • This just does not feel right. CoreContainer SHOULD NOT be stored in PluginInfo. PluginInfo is supposed to be an immutable data structure that contains just data
            coreContainer = (CoreContainer) info.initArgs.get("cc");
            

          why multiple json files? /security.json and /simplesecurity.json ?

          Let us standardize one on one json file. Why should a component use SolrZkClient directly to read configuration?

          Show
          Noble Paul added a comment - - edited A few comments Do the initialization and other things of Authorization plugin in CoreContainer This just does not feel right. CoreContainer SHOULD NOT be stored in PluginInfo. PluginInfo is supposed to be an immutable data structure that contains just data coreContainer = (CoreContainer) info.initArgs.get("cc"); why multiple json files? /security.json and /simplesecurity.json ? Let us standardize one on one json file. Why should a component use SolrZkClient directly to read configuration?
          Hide
          Ishan Chattopadhyaya added a comment -

          The authorization plugin doesn't really do anything about inter-shard communication (and doesn't propagate the user principal),

          If a pre-authorized request is forwarded to another node, it shouldn't need to go via the authorization process again (e.g. a SolrDispatchFilter forwarded request, aka SDF.remoteQuery()). Towards that, don't you think the authorization plugin framework should have a way to let internode requests propagate some authorization info from the original user request?

          Show
          Ishan Chattopadhyaya added a comment - The authorization plugin doesn't really do anything about inter-shard communication (and doesn't propagate the user principal), If a pre-authorized request is forwarded to another node, it shouldn't need to go via the authorization process again (e.g. a SolrDispatchFilter forwarded request, aka SDF.remoteQuery()). Towards that, don't you think the authorization plugin framework should have a way to let internode requests propagate some authorization info from the original user request?
          Hide
          Noble Paul added a comment -

          Authorization is usually a low cost operation. I would say, lets us not optimize for it now.

          Show
          Noble Paul added a comment - Authorization is usually a low cost operation. I would say, lets us not optimize for it now.
          Hide
          Ishan Chattopadhyaya added a comment -

          but yes, the intention is to use this object to return ACLs etc. from the authorization layer.

          Noble Paul Even basides avoiding the re-authorization, don't you think ACLs/user principal etc. should to be propagated to other nodes during an internode request?

          Show
          Ishan Chattopadhyaya added a comment - but yes, the intention is to use this object to return ACLs etc. from the authorization layer. Noble Paul Even basides avoiding the re-authorization, don't you think ACLs/user principal etc. should to be propagated to other nodes during an internode request?
          Hide
          Noble Paul added a comment -

          It depends.

          If you are going to authenticate to the other nodes as the node itself then the original principal does not even matter. If you replacing to authenticate as the original user then all of it should be sent across

          Show
          Noble Paul added a comment - It depends. If you are going to authenticate to the other nodes as the node itself then the original principal does not even matter. If you replacing to authenticate as the original user then all of it should be sent across
          Hide
          Shalin Shekhar Mangar added a comment -

          Thanks Anshum. I have a comment and a few questions:

          1. Since this issue is about creating a pluggable authorization module, please create a different linked issue for a default/basic implementation. We should not be discussing/reviewing any specific implementation here.
          2. We need to define the contract between this new plugin type and Solr better. For example:
            1. When is it created?
            2. When is it shutdown?
            3. How is it configured? How is the configuration changed?
            4. Is it per-collection or for the whole cluster?
            5. Can we have multiple such plugins?
            6. How are configuration changes propagated to different nodes? Is the plugin re-initialized or re-created?
          Show
          Shalin Shekhar Mangar added a comment - Thanks Anshum. I have a comment and a few questions: Since this issue is about creating a pluggable authorization module, please create a different linked issue for a default/basic implementation. We should not be discussing/reviewing any specific implementation here. We need to define the contract between this new plugin type and Solr better. For example: When is it created? When is it shutdown? How is it configured? How is the configuration changed? Is it per-collection or for the whole cluster? Can we have multiple such plugins? How are configuration changes propagated to different nodes? Is the plugin re-initialized or re-created?
          Hide
          Anshum Gupta added a comment - - edited

          Thanks for looking at it Noble.

          Do the initialization and other things of Authorization plugin in CoreContainer

          Sure, I'll move it.

          why multiple json files? /security.json and /simplesecurity.json ?

          They are used for 2 different reasons, the /security.json is the one used for/by Solr's framework whereas /simplesecurity.json could also be called anything else tomorrow and is used by a specific implementation. Doesn't make any sense to merge the 2 and Solr be bothered about a specific plugin data.

          Why should a component use SolrZkClient directly to read configuration?

          Is there a reason not to? The client would need information about zk and also might need to read information about it, I don't see a reason why we should create a new zkclient there.

          Show
          Anshum Gupta added a comment - - edited Thanks for looking at it Noble. Do the initialization and other things of Authorization plugin in CoreContainer Sure, I'll move it. why multiple json files? /security.json and /simplesecurity.json ? They are used for 2 different reasons, the /security.json is the one used for/by Solr's framework whereas /simplesecurity.json could also be called anything else tomorrow and is used by a specific implementation. Doesn't make any sense to merge the 2 and Solr be bothered about a specific plugin data. Why should a component use SolrZkClient directly to read configuration? Is there a reason not to? The client would need information about zk and also might need to read information about it, I don't see a reason why we should create a new zkclient there.
          Hide
          Anshum Gupta added a comment -

          That was my thought at this point. I'd want to propagate ACL/principal information when we get to say, Document level security but as I'm not handling that right now, let's just move ahead and build that in when we get to it.

          Show
          Anshum Gupta added a comment - That was my thought at this point. I'd want to propagate ACL/principal information when we get to say, Document level security but as I'm not handling that right now, let's just move ahead and build that in when we get to it.
          Hide
          Anshum Gupta added a comment -

          Thanks for looking at this Shalin.

          please create a different linked issue for a default/basic implementation.

          Sure, that makes sense. I'll split it and create a new sub-issue.

          We need to define the contract between this new plugin type and Solr better.

          I did try to answer most of those questions in the write up above but I'm happy to explain that again.

          • This framework is a cluster level configuration (and hence the need to specify and set up the implementation before starting a node).
          • It gets created in the init for SDF.
          • The configuration goes into /security.json (implementation to use) and plugin details go into plugin-specific file/mechanism e.g. you could write your own plugin that has hard-coded list for IPs or usernames or combination etc.
          • You can only have 1 such plugin used by a cluster at any given point (there's no check for that and the node would end up using the implementation defined in /security.json when it comes up.
          • As this happens at CoreContainer level, I didn't add anything to change the implementation type for plugin but I'm working on configuration changes. The change would need to be handled by the plugin writer e.g. in case of the OTB plugin, which depends on access rules stored in zk, it needs to watch that file for changes and update the blacklist accordingly. For cases involving 3rd party security mechanisms e.g. Apache Ranger/Sentry, the config changes would be handled by those plugins. When a request comes in after the access rules are updated, the plugin should be able to use the new rules from Ranger, without any need to update anything in Solr.
          • I'm moving this to be initialized and kept in the corecontainer, and also adding a shutdown hook for the plugins. The hook would get invoked during the corecontainer shutdown.
          Show
          Anshum Gupta added a comment - Thanks for looking at this Shalin. please create a different linked issue for a default/basic implementation. Sure, that makes sense. I'll split it and create a new sub-issue. We need to define the contract between this new plugin type and Solr better. I did try to answer most of those questions in the write up above but I'm happy to explain that again. This framework is a cluster level configuration (and hence the need to specify and set up the implementation before starting a node). It gets created in the init for SDF. The configuration goes into /security.json (implementation to use) and plugin details go into plugin-specific file/mechanism e.g. you could write your own plugin that has hard-coded list for IPs or usernames or combination etc. You can only have 1 such plugin used by a cluster at any given point (there's no check for that and the node would end up using the implementation defined in /security.json when it comes up. As this happens at CoreContainer level, I didn't add anything to change the implementation type for plugin but I'm working on configuration changes. The change would need to be handled by the plugin writer e.g. in case of the OTB plugin, which depends on access rules stored in zk, it needs to watch that file for changes and update the blacklist accordingly. For cases involving 3rd party security mechanisms e.g. Apache Ranger/Sentry, the config changes would be handled by those plugins. When a request comes in after the access rules are updated, the plugin should be able to use the new rules from Ranger, without any need to update anything in Solr. I'm moving this to be initialized and kept in the corecontainer, and also adding a shutdown hook for the plugins. The hook would get invoked during the corecontainer shutdown.
          Hide
          Noble Paul added a comment -

          whereas /simplesecurity.json could also be called anything else tomorrow and is used by a specific implementation. Doesn't make any sense to merge the 2 and Solr be bothered about a specific plugin data.

          I fail to see why it is not possible, If Solr already supports configuration of almost a dozen plugins with standardized formats , what is so special about an authorization plugin? This is akin to having solrconfig_myplugin.xml for every custom plugin I write

          Is there a reason not to? The client would need information about zk and also might need to read information about it, I don't see a reason why we should create a new zkclient there.

          There should be no need for the client to read stuff from ZK. Take the analogy of our current set of plugins. Do they ever try to read solrconfig.xml/ or configoverlay.json?

          Show
          Noble Paul added a comment - whereas /simplesecurity.json could also be called anything else tomorrow and is used by a specific implementation. Doesn't make any sense to merge the 2 and Solr be bothered about a specific plugin data. I fail to see why it is not possible, If Solr already supports configuration of almost a dozen plugins with standardized formats , what is so special about an authorization plugin? This is akin to having solrconfig_myplugin.xml for every custom plugin I write Is there a reason not to? The client would need information about zk and also might need to read information about it, I don't see a reason why we should create a new zkclient there. There should be no need for the client to read stuff from ZK. Take the analogy of our current set of plugins. Do they ever try to read solrconfig.xml/ or configoverlay.json?
          Hide
          Ishan Chattopadhyaya added a comment - - edited

          +1 to a single /security.json file.
          I propose the following format for /security.json, to fold in both SOLR-7274 and SOLR-7275 (this issue) zk configs. This would need support for nested objects in ZkStateReader. The "configs" could be passed in into the plugins' init() and hence it will no longer be necessary to pass in a ZkClient/ZkController into the plugin's init().

          {
            "authentication": {
              "pluginClass": "org.blahblah",
              "pluginName": "kerberos",
              "configs": {
                "prop1": "val1",
                "prop2": "val2"
              }
            },
            "authorization": {
              "pluginClass": "...",
              "pluginName": "ranger",
              "configs": {
                "prop1": "val1",
                "prop2": "val2"
              }
            }
          }
          
          Show
          Ishan Chattopadhyaya added a comment - - edited +1 to a single /security.json file. I propose the following format for /security.json, to fold in both SOLR-7274 and SOLR-7275 (this issue) zk configs. This would need support for nested objects in ZkStateReader. The "configs" could be passed in into the plugins' init() and hence it will no longer be necessary to pass in a ZkClient/ZkController into the plugin's init(). { "authentication": { "pluginClass": "org.blahblah", "pluginName": "kerberos", "configs": { "prop1": "val1", "prop2": "val2" } }, "authorization": { "pluginClass": "...", "pluginName": "ranger", "configs": { "prop1": "val1", "prop2": "val2" } } }
          Hide
          Anshum Gupta added a comment - - edited

          I'm just trying to keep custom plugin config for security separate from other configuration. About merging authc and authz configs, that was on my mind and I plan to do it when I'm integrating the changes here with SOLR-7274.

          Let's consider an example of a user wanting to use some proprietary non-json format data in a custom security plugin, to store access rules. There wouldn't be a way to do that. I am all for exploring more options if there are any as long as they don't stop users from doing their own thing.

          I can have a straight mechanism to just read the authorization part of /security.json and pass that map to the plugin during init instead of the plugin reading from a file directly, but then instead of the security plugin deciding if it wants to keep a watch on the file, Solr would always keep a watch (when authz is enabled). In cases where access rules don't reside in zk and are in a 3rd party system, we don't want to keep a watch. Allowing the plugin to make that choice might be a better way to move.

          I'm about to separate out the implementation of default/OTB plugin from this JIRA and I guess things would be clearer for everyone to understand after that happens.

          Show
          Anshum Gupta added a comment - - edited I'm just trying to keep custom plugin config for security separate from other configuration. About merging authc and authz configs, that was on my mind and I plan to do it when I'm integrating the changes here with SOLR-7274 . Let's consider an example of a user wanting to use some proprietary non-json format data in a custom security plugin, to store access rules. There wouldn't be a way to do that. I am all for exploring more options if there are any as long as they don't stop users from doing their own thing. I can have a straight mechanism to just read the authorization part of /security.json and pass that map to the plugin during init instead of the plugin reading from a file directly, but then instead of the security plugin deciding if it wants to keep a watch on the file, Solr would always keep a watch (when authz is enabled). In cases where access rules don't reside in zk and are in a 3rd party system, we don't want to keep a watch. Allowing the plugin to make that choice might be a better way to move. I'm about to separate out the implementation of default/OTB plugin from this JIRA and I guess things would be clearer for everyone to understand after that happens.
          Hide
          Noble Paul added a comment -

          Allowing the plugin to make that choice might be a better way to move.

          Notifying the plugin of conf changes is not same as reloading the plugin. When a data change callback is received , the plugin can decide what to do with the the callbck. How the API is designed is upto you.

          Solr components watching ZK nodes is a big red flag. We should not do it

          Show
          Noble Paul added a comment - Allowing the plugin to make that choice might be a better way to move. Notifying the plugin of conf changes is not same as reloading the plugin. When a data change callback is received , the plugin can decide what to do with the the callbck. How the API is designed is upto you. Solr components watching ZK nodes is a big red flag. We should not do it
          Hide
          Anshum Gupta added a comment - - edited

          Patch for just the framework. It includes the following changes:

          • All security config now goes into /security.json. Here's a sample:
            {"authorization":
              {"class":"solr.SimpleSolrAuthorizationPlugin",
              "deny":["user1","user2"]
              }
            }
            

            Everything from the authorization section would be passed on to the authorization plugin.

          • After a discussion with Noble, it made more sense to restrict the access at this point to what's required and if we feel the need to expose zkStateReader, zkClient or the coreContainer itself to the plugin, we could do that at a later time. For now, all configuration for the plugin should just go into the authorization section.
          • SimpleSolrAuthorizationPlugin is no longer in the direct scope of this issue, but I created a new issue for that. I'll put up the updated patch, that doesn't use the coreContainer or access zk directly right after posting this patch.
          • I tried to have the SolrAuthorizationPlugin extend MapInitializedPlugin but that was more invasive than required as it takes a Map<String, String> whereas we need a Map<String, Object> here. I'll see if there's a way to do that without being invasive or duplicating a lot of the code so it still just extends PluginInfoInitialized.

          Here are the steps to test this patch (you would need the patch from SOLR-7402 along with this patch to actually see this work, as this is just the framework):
          1. Start zk.
          2. Using zk CLI,

          set /security.json {"authorization":{"class":"solr.SimpleSolrAuthorizationPlugin","deny":["user1","user2"]}}

          Change the userlist to whatever you want to use to test.
          3. Start SolrCloud, make sure you use the zk server/ensemble you just setup.
          e.g.

          bin/solr start -z localhost:2181 -c

          To write your own plugin:

          1. Extend SolrAuthorizationPlugin, this requires implementing
            1. init(PluginInfo info) : This is called when the node comes up. It is called once and creates the plugin object. This is where you can define your connections to 3rd party security systems etc.
            2. authorize(SolrRequestContext context) and optionally overriding
          2. Follow the steps above (testing this patch) and specify the FQN of the implementation class in the authorization{{class}}. e.g.
            {"authorization":{"class":"solr.MyCustomSolrAuthorizationPlugin","connect_string":"hostname:port/somepath"}}
            

          Still todo:

          • Work on the context object.
          • Cleanup/shutdown hook.
          • Reloading/watching for changes in the /security.json file or documenting how to change it.
          • Tests
          Show
          Anshum Gupta added a comment - - edited Patch for just the framework. It includes the following changes: All security config now goes into /security.json. Here's a sample: { "authorization" : { "class" : "solr.SimpleSolrAuthorizationPlugin" , "deny" :[ "user1" , "user2" ] } } Everything from the authorization section would be passed on to the authorization plugin. After a discussion with Noble, it made more sense to restrict the access at this point to what's required and if we feel the need to expose zkStateReader, zkClient or the coreContainer itself to the plugin, we could do that at a later time. For now, all configuration for the plugin should just go into the authorization section. SimpleSolrAuthorizationPlugin is no longer in the direct scope of this issue, but I created a new issue for that. I'll put up the updated patch, that doesn't use the coreContainer or access zk directly right after posting this patch. I tried to have the SolrAuthorizationPlugin extend MapInitializedPlugin but that was more invasive than required as it takes a Map<String, String> whereas we need a Map<String, Object> here. I'll see if there's a way to do that without being invasive or duplicating a lot of the code so it still just extends PluginInfoInitialized. Here are the steps to test this patch (you would need the patch from SOLR-7402 along with this patch to actually see this work, as this is just the framework): 1. Start zk. 2. Using zk CLI, set /security.json {"authorization":{"class":"solr.SimpleSolrAuthorizationPlugin","deny":["user1","user2"]}} Change the userlist to whatever you want to use to test. 3. Start SolrCloud, make sure you use the zk server/ensemble you just setup. e.g. bin/solr start -z localhost:2181 -c To write your own plugin: Extend SolrAuthorizationPlugin, this requires implementing init(PluginInfo info) : This is called when the node comes up. It is called once and creates the plugin object. This is where you can define your connections to 3rd party security systems etc. authorize(SolrRequestContext context) and optionally overriding Follow the steps above (testing this patch) and specify the FQN of the implementation class in the authorization{{class}}. e.g. { "authorization" :{ "class" : "solr.MyCustomSolrAuthorizationPlugin" , "connect_string" : "hostname:port/somepath" }} Still todo: Work on the context object. Cleanup/shutdown hook. Reloading/watching for changes in the /security.json file or documenting how to change it. Tests
          Hide
          Ishan Chattopadhyaya added a comment - - edited

          Quick thought:

          +      //Initialize the Authorization module
          +      if(cores.getZkController().getZkClient().exists(SOLR_SECURITY_CONF_PATH, true)) {
          +        byte[] data = cores.getZkController().getZkClient()
          +            .getData(SOLR_SECURITY_CONF_PATH, null, new Stat(), true);
          +        Map securityConf = (Map) ZkStateReader.fromJSON(data) ;
          +        Map authorizationConf = (Map) securityConf.get("authorization");
          +        log.info("Initializing authorization plugin: " + authorizationConf.get("class"));
          +        authorizationPlugin = cores.getResourceLoader().newInstance((String) authorizationConf.get("class"), 
          +            SolrAuthorizationPlugin.class);
          

          Maybe we should move this to the ZkStateReader, so that we can do something like this here:

          zkcontroller.getSecurityProps().getAuthorizationProps() or zkcontroller.getSecurityProps().getAuthenticationProps()
          
          Show
          Ishan Chattopadhyaya added a comment - - edited Quick thought: + //Initialize the Authorization module + if(cores.getZkController().getZkClient().exists(SOLR_SECURITY_CONF_PATH, true)) { + byte[] data = cores.getZkController().getZkClient() + .getData(SOLR_SECURITY_CONF_PATH, null, new Stat(), true); + Map securityConf = (Map) ZkStateReader.fromJSON(data) ; + Map authorizationConf = (Map) securityConf.get("authorization"); + log.info("Initializing authorization plugin: " + authorizationConf.get("class")); + authorizationPlugin = cores.getResourceLoader().newInstance((String) authorizationConf.get("class"), + SolrAuthorizationPlugin.class); Maybe we should move this to the ZkStateReader, so that we can do something like this here: zkcontroller.getSecurityProps().getAuthorizationProps() or zkcontroller.getSecurityProps().getAuthenticationProps()
          Hide
          Noble Paul added a comment -

          Use the MapInitializedPlugin instead of PluginInfoInitialized . PluginInfo is created with xml in mind. For a component that is exclusively loaded from JSON , it does not make sense to use PluginInfo

          Show
          Noble Paul added a comment - Use the MapInitializedPlugin instead of PluginInfoInitialized . PluginInfo is created with xml in mind. For a component that is exclusively loaded from JSON , it does not make sense to use PluginInfo
          Hide
          Anshum Gupta added a comment -

          I was trying to use the MapInitializedPlugin but that involves changing MapPluginLoader.init(..) which forces me to change DOMUtil.toMapExcept. which is used at a lot of places. I'll create a new GenericMapInitializedPlugin and use that instead.
          Thanks for bringing this up.

          Show
          Anshum Gupta added a comment - I was trying to use the MapInitializedPlugin but that involves changing MapPluginLoader.init(..) which forces me to change DOMUtil.toMapExcept . which is used at a lot of places. I'll create a new GenericMapInitializedPlugin and use that instead. Thanks for bringing this up.
          Hide
          Anshum Gupta added a comment -

          Updated patch. This one doesn't implement any Plugin interface but extends Closeable. There wasn't really a need to implement MapInitializedPlugin or any other variant as it wasn't 1. helpful, 2. used by anyone else.

          After speaking to Noble, seems like we're on the path to actually deprecate those.

          Other things that this patch changes:
          1. Moved the authorizationPlugin init to CoreContainer.
          2. Added the close hook for plugins, it's called during coreContainer.shutdown().
          3. Moved the security.json reading to ZkStateReader and made it accessible as a map via a method.

          Show
          Anshum Gupta added a comment - Updated patch. This one doesn't implement any Plugin interface but extends Closeable. There wasn't really a need to implement MapInitializedPlugin or any other variant as it wasn't 1. helpful, 2. used by anyone else. After speaking to Noble, seems like we're on the path to actually deprecate those. Other things that this patch changes: 1. Moved the authorizationPlugin init to CoreContainer. 2. Added the close hook for plugins, it's called during coreContainer.shutdown(). 3. Moved the security.json reading to ZkStateReader and made it accessible as a map via a method.
          Hide
          Anshum Gupta added a comment -

          Moving the refactoring code to it's own issue and linking it from here.

          Show
          Anshum Gupta added a comment - Moving the refactoring code to it's own issue and linking it from here.
          Hide
          Ishan Chattopadhyaya added a comment -
          +      if (!authResponse.isAuthorized()) {
          +        sendError((HttpServletResponse) response, 401, "Unauthorized request");
          

          Usually, a 401 means that the request needs to be repeated with appropriate authentication/authorization headers. 403 means that the authentication/authorization headers that are needed were provided, but still the server refused to fulfill the request.
          I think 403 (Forbidden) is more appropriate here.

          Show
          Ishan Chattopadhyaya added a comment - + if (!authResponse.isAuthorized()) { + sendError((HttpServletResponse) response, 401, "Unauthorized request"); Usually, a 401 means that the request needs to be repeated with appropriate authentication/authorization headers. 403 means that the authentication/authorization headers that are needed were provided, but still the server refused to fulfill the request. I think 403 (Forbidden) is more appropriate here.
          Hide
          Anshum Gupta added a comment -

          From http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html:

          401:
          The client MAY repeat the request with a suitable Authorization header field (section 14.8). If the request already included Authorization credentials, then the 401 response indicates that authorization has been refused for those credentials.

          403:
          The server understood the request, but is refusing to fulfill it. Authorization will not help and the request SHOULD NOT be repeated. If the request method was not HEAD and the server wishes to make public why the request has not been fulfilled, it SHOULD describe the reason for the refusal in the entity.

          As per what I undersand, I think 401 still makes more sense in this case.

          Show
          Anshum Gupta added a comment - From http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html: 401: The client MAY repeat the request with a suitable Authorization header field (section 14.8). If the request already included Authorization credentials, then the 401 response indicates that authorization has been refused for those credentials. 403: The server understood the request, but is refusing to fulfill it. Authorization will not help and the request SHOULD NOT be repeated. If the request method was not HEAD and the server wishes to make public why the request has not been fulfilled, it SHOULD describe the reason for the refusal in the entity. As per what I undersand, I think 401 still makes more sense in this case.
          Hide
          Ishan Chattopadhyaya added a comment -

          From the same source, I can see this for 401:

          The request requires user authentication. The response MUST include a WWW-Authenticate header field (section 14.47) containing a challenge applicable to the requested resource.

          So, that means a 401 response must also contain a WWW-Authenticate header with a challenge, which seems inappropriate in this situation. Hence, I still feel 403 is more appropriate.

          Show
          Ishan Chattopadhyaya added a comment - From the same source, I can see this for 401: The request requires user authentication. The response MUST include a WWW-Authenticate header field (section 14.47) containing a challenge applicable to the requested resource. So, that means a 401 response must also contain a WWW-Authenticate header with a challenge, which seems inappropriate in this situation. Hence, I still feel 403 is more appropriate.
          Hide
          Don Bosco Durai added a comment -

          Yes, for authorization, we should return 403 (post authentication). 401 is more appropriate for authentication flow

          Show
          Don Bosco Durai added a comment - Yes, for authorization, we should return 403 (post authentication). 401 is more appropriate for authentication flow
          Hide
          Anshum Gupta added a comment -

          Thanks for that input Don and Ishan, I read up a bit more and it certainly now makes more sense to use 403 for authorization and 401 for authentication. I'm changing things a bit to have the SolrAuthrorizationReponse contain an int HttpStatus code and Solr would just return that back if it's not an SC_OK or SC_ACCEPTED. That would allow the authorization plugin to act independently and decide what to return. I'll just put up the patch with that update in a bit.

          Show
          Anshum Gupta added a comment - Thanks for that input Don and Ishan, I read up a bit more and it certainly now makes more sense to use 403 for authorization and 401 for authentication. I'm changing things a bit to have the SolrAuthrorizationReponse contain an int HttpStatus code and Solr would just return that back if it's not an SC_OK or SC_ACCEPTED. That would allow the authorization plugin to act independently and decide what to return. I'll just put up the patch with that update in a bit.
          Hide
          Don Bosco Durai added a comment -

          Anshum Gupta thanks. When you are doing the change, can you give set method to the member attribute or make it public? With the current code, I need to extend the class to return the value.

          public class SolrAuthorizationResponse {
          boolean authorized;

          public boolean isAuthorized()

          { return authorized; }

          }

          Show
          Don Bosco Durai added a comment - Anshum Gupta thanks. When you are doing the change, can you give set method to the member attribute or make it public? With the current code, I need to extend the class to return the value. public class SolrAuthorizationResponse { boolean authorized; public boolean isAuthorized() { return authorized; } }
          Hide
          Anshum Gupta added a comment -

          Updated patch. This doesn't incorporate the context bit as that depends on committing of SOLR-7484 which I plan to commit in a bit.
          It changes the public SolrAuthorizationResponse and also how the statusCode set for the SolrAuthorizationResponse impacts the processing in SDF.

          Show
          Anshum Gupta added a comment - Updated patch. This doesn't incorporate the context bit as that depends on committing of SOLR-7484 which I plan to commit in a bit. It changes the public SolrAuthorizationResponse and also how the statusCode set for the SolrAuthorizationResponse impacts the processing in SDF.
          Hide
          Anshum Gupta added a comment -

          Updated patch. This doesn't incorporate the context bit as that depends on committing of SOLR-7484 which I plan to commit in a bit.
          It changes the public SolrAuthorizationResponse and also how the statusCode set for the SolrAuthorizationResponse impacts the processing in SDF.

          The last patch was wrong and really meant for SOLR-7484. I'll just delete that from here.

          I'm just wrapping up the patch to work with the current trunk after the SDF refactoring too.

          Show
          Anshum Gupta added a comment - Updated patch. This doesn't incorporate the context bit as that depends on committing of SOLR-7484 which I plan to commit in a bit. It changes the public SolrAuthorizationResponse and also how the statusCode set for the SolrAuthorizationResponse impacts the processing in SDF. The last patch was wrong and really meant for SOLR-7484 . I'll just delete that from here. I'm just wrapping up the patch to work with the current trunk after the SDF refactoring too.
          Hide
          Anshum Gupta added a comment - - edited

          Patch updated to trunk. Working on integrating the context object.

          Show
          Anshum Gupta added a comment - - edited Patch updated to trunk. Working on integrating the context object.
          Hide
          Anshum Gupta added a comment -

          Right now, this also lacks a mechanism to Reload / Reinit without restarting the node. Perhaps it'd be a good idea to have an API to do that.

          Show
          Anshum Gupta added a comment - Right now, this also lacks a mechanism to Reload / Reinit without restarting the node. Perhaps it'd be a good idea to have an API to do that.
          Hide
          Anshum Gupta added a comment -

          Need to clean up things here but this patch now returns some context e.g. userPrincipal, request header and collection request list. Working on adding more things and tests.

          Show
          Anshum Gupta added a comment - Need to clean up things here but this patch now returns some context e.g. userPrincipal, request header and collection request list. Working on adding more things and tests.
          Hide
          Anshum Gupta added a comment -

          Updated patch that returns a request type and the list of collections involved in the request. It also works for aliases i.e. returns the collection list instead of alias name.
          The request type right now is left to unknown for cases other than ADMIN requests. In case of admin requests, it's set to 'ADMIN'.

          Show
          Anshum Gupta added a comment - Updated patch that returns a request type and the list of collections involved in the request. It also works for aliases i.e. returns the collection list instead of alias name. The request type right now is left to unknown for cases other than ADMIN requests. In case of admin requests, it's set to 'ADMIN'.
          Hide
          Anshum Gupta added a comment -

          This patch also extracts collection name in case of a call targeted towards a core e.g.:
          http://hostname:port/solr/collection1_shard1_replica1/select?q=*:*

          The collection info in the context would contain the name of the collection that the core collection1_shard1_replica1 belongs to, assuming it's not actually the name of a collection.

          Show
          Anshum Gupta added a comment - This patch also extracts collection name in case of a call targeted towards a core e.g.: http://hostname:port/solr/collection1_shard1_replica1/select?q=*:* The collection info in the context would contain the name of the collection that the core collection1_shard1_replica1 belongs to, assuming it's not actually the name of a collection.
          Hide
          Anshum Gupta added a comment -

          Changes visibility and cleans up code, removing unused methods.
          I've also changed all fields in SolrRequestContext to be final as they are initialized in the constructor and shouldn't be changed, in the process also getting rid of all the setters.

          Show
          Anshum Gupta added a comment - Changes visibility and cleans up code, removing unused methods. I've also changed all fields in SolrRequestContext to be final as they are initialized in the constructor and shouldn't be changed, in the process also getting rid of all the setters.
          Hide
          Anshum Gupta added a comment - - edited

          The request context now provides Collection name in

          1. Regular /select etc. requests
          2. Collection API requests
          3. Core specific requests.
          4. Collection Alias requests

          Working on adding a test. Will add it tonight or tomorrow and should be done with this.

          Show
          Anshum Gupta added a comment - - edited The request context now provides Collection name in Regular /select etc. requests Collection API requests Core specific requests. Collection Alias requests Working on adding a test. Will add it tonight or tomorrow and should be done with this.
          Hide
          Noble Paul added a comment -

          some cleanup and simplification

          Show
          Noble Paul added a comment - some cleanup and simplification
          Hide
          Anshum Gupta added a comment -

          Thanks for looking at this Noble. I've fixed a few things in this patch. Functionally, this patch also extracts collection list in case of a /select request that looks like:

          /solr/collectionname/select?q=foo:bar&collection=anothercollection,yetanothercollection

          Show
          Anshum Gupta added a comment - Thanks for looking at this Noble. I've fixed a few things in this patch. Functionally, this patch also extracts collection list in case of a /select request that looks like: /solr/collectionname/select?q=foo:bar&collection=anothercollection,yetanothercollection
          Hide
          Anshum Gupta added a comment -

          Adding back a way to get the 1. Resource 2. RequestType from the context. Seems like it was removed in an update a couple of patches back.

          Show
          Anshum Gupta added a comment - Adding back a way to get the 1. Resource 2. RequestType from the context. Seems like it was removed in an update a couple of patches back.
          Hide
          Anshum Gupta added a comment -

          Patch that adds request type info for /select [READ] and /update [WRITE] requests.

          Show
          Anshum Gupta added a comment - Patch that adds request type info for /select [READ] and /update [WRITE] requests.
          Hide
          Anshum Gupta added a comment -

          This patch filters out authz and context creation for *.png and *.html requests.
          There were a lot of those coming in for the new Admin UI.

          Show
          Anshum Gupta added a comment - This patch filters out authz and context creation for *.png and *.html requests. There were a lot of those coming in for the new Admin UI.
          Hide
          Anshum Gupta added a comment - - edited

          Patch with test. I think this is good to go now.
          Any feedback would be appreciated.

          Show
          Anshum Gupta added a comment - - edited Patch with test. I think this is good to go now. Any feedback would be appreciated.
          Hide
          Anshum Gupta added a comment -

          Accidentally added the SimpleSolrAuthorizationPlugin to the last patch. Removing it. Also added 2 more static file extensions to ignore for authz purpose.

          Show
          Anshum Gupta added a comment - Accidentally added the SimpleSolrAuthorizationPlugin to the last patch. Removing it. Also added 2 more static file extensions to ignore for authz purpose.
          Hide
          Noble Paul added a comment -

          We need to tackle the modification of security.json pretty soon. But that can be dealt separately

          The security.json needs to be watched and the plugin needs to be notified of the change. That should not prevent us from committing this

          Show
          Noble Paul added a comment - We need to tackle the modification of security.json pretty soon. But that can be dealt separately The security.json needs to be watched and the plugin needs to be notified of the change. That should not prevent us from committing this
          Hide
          Anshum Gupta added a comment -

          Thanks for the feedback Noble.

          Right, as of now, a node restart would be required for security.json to be re-read. I'll create another issue for that and as I understand, you don't have an objection to committing this, right?

          Show
          Anshum Gupta added a comment - Thanks for the feedback Noble. Right, as of now, a node restart would be required for security.json to be re-read. I'll create another issue for that and as I understand, you don't have an objection to committing this, right?
          Hide
          ASF subversion and git services added a comment -

          Commit 1679316 from Anshum Gupta in branch 'dev/trunk'
          [ https://svn.apache.org/r1679316 ]

          SOLR-7275: Authorization framework for Solr. It defines an interface and a mechanism to create, load and use an Authorization plugin.

          Show
          ASF subversion and git services added a comment - Commit 1679316 from Anshum Gupta in branch 'dev/trunk' [ https://svn.apache.org/r1679316 ] SOLR-7275 : Authorization framework for Solr. It defines an interface and a mechanism to create, load and use an Authorization plugin.
          Hide
          ASF subversion and git services added a comment -

          Commit 1679317 from Anshum Gupta in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1679317 ]

          SOLR-7275: Authorization framework for Solr. It defines an interface and a mechanism to create, load and use an Authorization plugin.(merge from trunk)

          Show
          ASF subversion and git services added a comment - Commit 1679317 from Anshum Gupta in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1679317 ] SOLR-7275 : Authorization framework for Solr. It defines an interface and a mechanism to create, load and use an Authorization plugin.(merge from trunk)
          Hide
          ASF subversion and git services added a comment -

          Commit 1679319 from Noble Paul in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1679319 ]

          SOLR-7275: compile error

          Show
          ASF subversion and git services added a comment - Commit 1679319 from Noble Paul in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1679319 ] SOLR-7275 : compile error
          Hide
          Anshum Gupta added a comment -

          Damn! Thanks for fixing this. Just discovered I ran the test using Java8.

          Show
          Anshum Gupta added a comment - Damn! Thanks for fixing this. Just discovered I ran the test using Java8.
          Hide
          ASF subversion and git services added a comment -

          Commit 1679497 from Anshum Gupta in branch 'dev/trunk'
          [ https://svn.apache.org/r1679497 ]

          SOLR-7275: Setting requestType for context object in case of a /get request

          Show
          ASF subversion and git services added a comment - Commit 1679497 from Anshum Gupta in branch 'dev/trunk' [ https://svn.apache.org/r1679497 ] SOLR-7275 : Setting requestType for context object in case of a /get request
          Hide
          ASF subversion and git services added a comment -

          Commit 1679604 from Anshum Gupta in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1679604 ]

          SOLR-7275: Setting requestType for context object in case of a /get request(merge from trunk)

          Show
          ASF subversion and git services added a comment - Commit 1679604 from Anshum Gupta in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1679604 ] SOLR-7275 : Setting requestType for context object in case of a /get request(merge from trunk)
          Hide
          Anshum Gupta added a comment -

          Bulk close for 5.2.0.

          Show
          Anshum Gupta added a comment - Bulk close for 5.2.0.

            People

            • Assignee:
              Anshum Gupta
              Reporter:
              Anshum Gupta
            • Votes:
              2 Vote for this issue
              Watchers:
              11 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development