SOLR-7789

Introduce a ConfigSet management API at /admin/configs

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.4, 6.0
    • Component/s: None
    • Labels: None

      Description

      SOLR-5955 describes a feature to automatically create a ConfigSet, based on another one, from a collection API call (i.e. one step collection creation). Discussion there yielded SOLR-7742, Immutable ConfigSet support. To close the loop, we need support for a ConfigSet management API.

      The simplest ConfigSet API could have one operation:
      create a new ConfigSet, based on an existing one, possibly modifying the ConfigSet properties. Note you need to be able to modify the ConfigSet properties at creation time, because otherwise the Immutable property could never be changed.

      Another logical operation to support is ConfigSet deletion; that may be more complicated to implement than creation because you need to handle the case where a collection is already using the configuration.

      1. SOLR-7789.patch
        257 kB
        Gregory Chanan
      2. SOLR-7789.patch
        214 kB
        Gregory Chanan
      3. SOLR-7789.patch
        181 kB
        Gregory Chanan
      4. SOLR-7789.patch
        186 kB
        Gregory Chanan
      5. SOLR-7789.patch
        186 kB
        Gregory Chanan

        Issue Links

          Activity

          Gregory Chanan added a comment -

          Here is a patch that implements a ConfigSets API for SolrCloud with two entry points:

          • CREATE: Create a ConfigSet, based on an existing ConfigSet
            Parameters:
            Key                       Type    Required  Default  Description
            name                      String  Yes                ConfigSet to be created
            baseConfigSet             String  Yes                ConfigSet to copy as a base
            configSetProp.name=value  String  No                 ConfigSet property from base to override
          • DELETE: Delete a ConfigSet
            Parameters:
            Key                       Type    Required  Default  Description
            name                      String  Yes                ConfigSet to be deleted
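For illustration, the two operations above map onto plain HTTP requests against /admin/configs. A minimal sketch of building those request URLs; the host, ConfigSet names, and the "immutable" property used here are hypothetical examples, not values from the patch:

```java
// Sketch: building request URLs for the ConfigSets API described above.
// Host and ConfigSet names are illustrative examples only.
public class ConfigSetApiUrls {

    // CREATE: copy baseConfigSet to a new ConfigSet named `name`, optionally
    // overriding one ConfigSet property via configSetProp.<prop>=<value>
    static String createUrl(String solrUrl, String name, String baseConfigSet,
                            String prop, String value) {
        return solrUrl + "/admin/configs?action=CREATE"
                + "&name=" + name
                + "&baseConfigSet=" + baseConfigSet
                + (prop == null ? "" : "&configSetProp." + prop + "=" + value);
    }

    // DELETE: remove the ConfigSet named `name`
    static String deleteUrl(String solrUrl, String name) {
        return solrUrl + "/admin/configs?action=DELETE&name=" + name;
    }

    public static void main(String[] args) {
        System.out.println(createUrl("http://localhost:8983/solr",
                "myConfig", "baseConfig", "immutable", "false"));
        System.out.println(deleteUrl("http://localhost:8983/solr", "myConfig"));
    }
}
```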

          The implementation is based on the Collections API and in fact uses the same OverseerProcessor at this time. The handling code for the ConfigSets API and the Collections API is completely separate, so it should be easy to move the former to its own OverseerProcessor in the future. Using one OverseerProcessor seemed reasonable given that the above operations should be fast.

          Some notes:
          1) DELETE does not check that a ConfigSet is not referenced. This is equivalent to the existing method of deleting a ConfigSet via the zkcli. The exception to this rule is task exclusivity: a ConfigSet deletion won't go through if there is an ongoing operation referencing that ConfigSet (e.g. as a base ConfigSet in a CREATE). There is a bit of complexity in implementing that efficiently (simulating a read/write lock for exclusivity so we don't block on multiple ConfigSets being created with the same base ConfigSet).
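The exclusivity rule described above can be sketched as a per-ConfigSet read/write lock: CREATEs sharing a base take the read side and don't block each other, while a DELETE of that ConfigSet takes the write side and waits for ongoing references. This is an illustrative simplification, not the patch's actual implementation:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Simplified sketch of the exclusivity rule described above (not the patch's code):
// many CREATEs may reference the same base ConfigSet concurrently, but a DELETE
// of that ConfigSet must wait until nothing references it.
public class ConfigSetLocks {
    private final Map<String, ReadWriteLock> locks = new ConcurrentHashMap<>();

    private ReadWriteLock lockFor(String configSet) {
        return locks.computeIfAbsent(configSet, k -> new ReentrantReadWriteLock());
    }

    // CREATE takes a read lock on its base ConfigSet: creations sharing a base don't block each other.
    public void withBaseConfigSet(String base, Runnable create) {
        ReadWriteLock lock = lockFor(base);
        lock.readLock().lock();
        try { create.run(); } finally { lock.readLock().unlock(); }
    }

    // DELETE takes the write lock: it waits out any ongoing operation referencing the ConfigSet.
    public void deleteConfigSet(String name, Runnable delete) {
        ReadWriteLock lock = lockFor(name);
        lock.writeLock().lock();
        try { delete.run(); } finally { lock.writeLock().unlock(); }
    }
}
```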

          2) The name for the ConfigSet properties file is always assumed to be "configsetprops.json". This is necessary to handle deletions correctly; if we allowed the user to specify in the API call what the file is called, they could point at a different or non-existent file to avoid the immutable check. This is a little strange because the CoreDescriptor lets you override the name of this file, but I guess this is equivalent to solrconfig.xml and schema.xml being special in ZK even though they can also be overridden in the CoreDescriptor.
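The immutable check implied above can be sketched as follows; the parsing of configsetprops.json is elided and a Map stands in for the parsed JSON, so the helper names here are illustrative, not Solr's API:

```java
import java.util.Map;

// Sketch of the immutable check implied above: the properties file name is fixed
// ("configsetprops.json"), so a caller cannot point the check at a different file.
// JSON parsing is elided; a Map stands in for the parsed properties.
public class ImmutableCheck {
    public static final String CONFIGSET_PROPS = "configsetprops.json"; // fixed, never caller-supplied

    static boolean isImmutable(Map<String, Object> configSetProps) {
        // Absent property means mutable; only an explicit true blocks modification.
        Object v = configSetProps.get("immutable");
        return Boolean.TRUE.equals(v) || "true".equals(v);
    }

    static void checkDeletable(Map<String, Object> configSetProps) {
        if (isImmutable(configSetProps)) {
            throw new IllegalStateException("ConfigSet is immutable; cannot delete");
        }
    }
}
```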

          3) The handling of the Overseer.getConfigSetQueue isn't as nice as I would like. Because we use the same OverseerProcessor, we need some way of figuring out which MessageHandler to dispatch to; we do this by prefixing the action in ZK with "configsets:". Both the ConfigSetsHandler and the OverseerConfigSetsMessageHandler need to be aware of this. Ideally, the queue returned from Overseer.getConfigSetQueue would just handle this when an item was inserted into the queue, so only the Overseer has to worry about how the messages are actually handled. This was difficult to do given the DistributedQueue API is just byte arrays. I think the correct thing to do here is change at least the offer API to be JSON based – all the non-test calls already immediately convert JSON to bytes anyway – but this seemed like a larger change for this patch.
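The prefix-based dispatch described above can be sketched like this; the interface and handler names are simplified stand-ins for the real OverseerMessageHandler classes:

```java
// Sketch of the dispatch scheme described above (simplified, not the patch's code):
// ConfigSet operations share the collection queue, so the action written to ZK is
// prefixed with "configsets:" and the processor strips the prefix to pick a handler.
public class PrefixDispatch {
    public static final String CONFIGSETS_ACTION_PREFIX = "configsets:";

    interface MessageHandler { String handle(String operation); }

    static String dispatch(String action, MessageHandler collectionHandler,
                           MessageHandler configSetHandler) {
        if (action.startsWith(CONFIGSETS_ACTION_PREFIX)) {
            // strip the prefix so the handler sees a plain operation name
            return configSetHandler.handle(action.substring(CONFIGSETS_ACTION_PREFIX.length()));
        }
        return collectionHandler.handle(action);
    }
}
```

Both the sender (ConfigSetsHandler) and the receiver (OverseerConfigSetMessageHandler) have to agree on the prefix, which is the coupling the comment above calls out as not ideal.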

          Some testing notes:
          T1) I wanted to be able to test the case where the CREATE failed half way through, so we end up with partial state in ZK that needed to be cleaned up. I found the easiest way to do this was to override the ZKDatabase that the ZKTestServer uses. So, you can now get/set the ZKDatabase in the ZKTestServer and pass your own ZKTestServer to the MiniSolrCloudCluster.

          T2) There's also an "exclusivity" test to ensure that actions on the same ConfigSet don't trample each other. This is done by concurrently doing DELETEs/CREATEs and ensuring the exceptions indicate that the operations ran sequentially.

          Mark Miller added a comment -

          I'm having a little trouble applying the latest patch because it's out of date.

          Do you know what svn revision it is based on?

          Gregory Chanan added a comment -

          Here's a rebased version of the patch on top of
          git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1696834 13f79535-47bb-0310-9956-ffa450edef68

          Mark Miller added a comment -

          I still can't apply this patch cleanly against 1696834. Do you have an svn version of the patch?

          Gregory Chanan added a comment -

          Sorry about that, I don't know what the issue is. I'll generate an svn version.

          I'm not sure how much we care about this, but one issue I realized with the patch is that it is a little aggressive about cleaning up failed CREATE attempts. If someone is concurrently using the zkcli to add a config and using CREATE via the API (so exclusivity checking doesn't apply), the following can happen:

          • CREATE call checks that config doesn't exist
          • zkcli adds config
          • CREATE tries to create and fails
          • CREATE removes config as part of failure cleanup

          First, we should recommend that people don't use the zkcli and the ConfigSet API concurrently (I would argue people shouldn't use the zkcli at all). But we could be a little smarter about this case, e.g. track whether the CREATE call actually wrote anything and only clean up if something was actually written.
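The smarter cleanup suggested above can be sketched as follows; the Store interface is a hypothetical stand-in for the ZK operations involved, not Solr's API:

```java
// Sketch of the safer cleanup suggested above: only delete the target ConfigSet
// on failure if this CREATE actually wrote something. Names are illustrative;
// Store is a stand-in for the ZooKeeper operations involved.
public class CreateWithCleanup {
    interface Store {
        boolean exists(String path);
        void write(String path);
        void delete(String path);
    }

    static void create(Store zk, String configSet, Runnable restOfCreate) {
        boolean wroteSomething = false;
        try {
            if (zk.exists(configSet)) throw new IllegalStateException("already exists");
            zk.write(configSet);    // first write for this ConfigSet
            wroteSomething = true;
            restOfCreate.run();     // copying base files etc.; may fail partway
        } catch (RuntimeException e) {
            // Clean up only what this call created; a config added concurrently
            // (e.g. via zkcli) between the exists-check and the write survives.
            if (wroteSomething) zk.delete(configSet);
            throw e;
        }
    }
}
```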

          Gregory Chanan added a comment -

          Generated an svn patch, let me know if this works, Mark.

          Mark Miller added a comment -

          Clean apply, thanks!

          Gregory Chanan added a comment -

          Looks like there are some small semantic conflicts with SOLR-6760, I'm working to address.

          Gregory Chanan added a comment -

          Commented in SOLR-6760 with a plan for how to address: https://issues.apache.org/jira/browse/SOLR-6760?focusedCommentId=14707457&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14707457
          Gregory Chanan added a comment -

          Attached a new version of the patch with two changes:
          1) renamed OverseerCollectionQueue to OverseerTaskQueue, as discussed in SOLR-6760
          2) only cleans up CREATE if the CREATE call actually wrote something to ZK

          Shalin Shekhar Mangar added a comment -

          I am trying to accomplish the following:
          #1 Add another OverseerMessageHandler (OverseerConfigSetMessageHandler) to handle ConfigSet-related operations.
          #2 From the perspective of the non-overseer (i.e. the ConfigSetsHandler), it looks like the operations are written to a separate queue from the collection queue, i.e. getOverseerConfigSetQueue()
          #3 Since the ConfigSet operations are most likely rare and fast, it made sense to just use the existing collections queue "under the covers" and handle the dispatch separately. The naming here breaks the illusion of #2, i.e. if I return an OverseerCollectionQueue it's pretty obvious to the non-overseer what's going on.

          Why should there be a separate queue for such operations? i.e. Why should ZkController have a method called getOverseerConfigSetQueue() at all? It should just use the collection queue. Similarly, why does this API need a new ConfigSetsHandler and not use CollectionsHandler?

          Short term: rename OverseerCollectionQueue to something more generic...DistributedTaskQueue? DistributedAsyncAwareQueue? There's nothing in there that is actually collection specific (which is why it works for the ConfigSet operations)

          The only reason why it is called a OverseerCollectionQueue is to indicate that it is the queue for OverseerCollectionProcessor.

          TBH, all this refactoring of OverseerCollectionProcessor confuses me very much. It looks like you want the APIs and tasks run in the overseer to be pluggable but I haven't noticed you saying that anywhere. In the absence of them being truly pluggable, the current API has become more complicated than it was before. We do not need complex dispatch mechanisms in the collection processor. If you think that it is wrong to perform tasks that do not involve a Collection then it could simply be renamed to OverseerTaskProcessor and we can be done. Btw we have had APIs such as clusterprops and request_status in the overseer collection processor and I don't think it is confusing anyone.

          Gregory Chanan added a comment -

          Thanks for the comments, Shalin.

          Why should there be a separate queue for such operations? i.e. Why should ZkController have a method called getOverseerConfigSetQueue() at all? It should just use the collection queue. Similarly, why does this API need a new ConfigSetsHandler and not use CollectionsHandler?

          Let's start with the second question. There is a ConfigSetsHandler because it operates on a separate end point – it doesn't make sense to send ConfigSet-related commands to /admin/collections, it makes more sense from the end user perspective to send ConfigSet-related commands to /admin/configs. Given separate end points, separate handlers also make sense.

          On the first question, the concern is a little more subtle. The point I am trying to make is that how the Overseer processes ConfigSet operations is an implementation detail of the Overseer. The ConfigSetHandler (which is not part of the Overseer) just needs an interface in order to tell the Overseer that it wants a ConfigSet operation processed; it shouldn't be concerned with the implementation details. Right now we happen to use the same queue under the covers, but maybe we find in the future that's a bad idea (e.g. users have different QoS expectations between ConfigSet and Collection operations and we add ConfigSet operations that are long lived and block important Collection operations). If the interface between the Overseer and the ConfigSet handler doesn't refer to collections, we don't need to change anything outside of the Overseer if we change the processing in the future.
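The interface argument above can be sketched like this; the names are illustrative, not Solr's actual API. Each handler submits through its own sink, and whether the sinks share one queue stays an Overseer implementation detail:

```java
import java.util.Queue;

// Sketch of the interface argument above: handlers submit operations without
// knowing which queue the Overseer uses. Names are illustrative, not Solr's API.
public class OverseerFacade {
    interface OperationSink { void submit(String message); }

    // Today both sinks happen to be backed by the same queue; swapping in a
    // dedicated ConfigSet queue later would not change the handler-facing interface.
    static OperationSink collectionOps(Queue<String> sharedQueue) {
        return m -> sharedQueue.add(m);
    }

    static OperationSink configSetOps(Queue<String> sharedQueue) {
        return m -> sharedQueue.add("configsets:" + m);
    }
}
```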

          The only reason why it is called a OverseerCollectionQueue is to indicate that it is the queue for OverseerCollectionProcessor.

          That can be indicated with a variable/method name, not the type name.

          TBH, all this refactoring of OverseerCollectionProcessor confuses me very much. It looks like you want the APIs and tasks run in the overseer to be pluggable but I haven't noticed you saying that anywhere. In the absence of them being truly pluggable, the current API has become more complicated than it was before. We do not need complex dispatch mechanisms in the collection processor.

          I don't wish to make the API/tasks pluggable so I understand your concern. That being said, there is a middle ground between API/tasks being pluggable and putting everything in the collection processor. All I'm arguing for is clean interfaces. Take SOLR-7855 as an example; because we had Overseer/role related commands in the collection processor it made refactoring much more difficult. Doing what you suggest in my opinion would have the same effect.

          We do not need complex dispatch mechanisms in the collection processor. If you think that it is wrong to perform tasks that do not involve a Collection then it could simply be renamed to OverseerTaskProcessor and we can be done.

          I don't think the dispatching here is complex and it is completely contained in the Overseer. I'm not sure what you mean by OverseerTaskProcessor, this seems like a separate issue. After SOLR-7855 we have an OverseerCollectionMessageHandler to handle overseer collection messages. If you are suggesting throwing the ConfigSet related commands in there (from OverseerConfigSetMessageHandler), you've just moved the dispatch code somewhere else.

          Btw we have had APIs such as clusterprops and request_status in the overseer collection processor and I don't think it is confusing anyone.

          I gave the example above of the Overseer/role-related commands making code hard to refactor. I agree with you that clusterprops and request_status aren't particularly confusing – that doesn't mean we can't do better.

          Shalin Shekhar Mangar added a comment -

          There is a ConfigSetsHandler because it operates on a separate end point – it doesn't make sense to send ConfigSet-related commands to /admin/collections, it makes more sense from the end user perspective to send ConfigSet-related commands to /admin/configs. Given separate end points, separate handlers also make sense.

          Okay, that is reasonable.

          The point I am trying to make is that how the Overseer processes ConfigSet operations is an implementation detail of the Overseer. The ConfigSetHandler (which is not part of the Overseer) just needs an interface in order to tell the Overseer that it wants a ConfigSet operation processed, it shouldn't be concerned with the implementation details.

          In my mind, it makes the opposite impression. The fact that we have a ZkController.getOverseerCollectionQueue and a different ZkController.getOverseerConfigSetQueue() suggests that the two queues are different, but they are not, at least not yet. So why does this feature try to suggest that their implementation might be different at all? These things are easily changed so let's refactor it when we actually need to have different queues.

          Right now we happen to use the same queue under the covers, but maybe we find in the future that's a bad idea (e.g. users have different QoS expectations between ConfigSet and Collection operations and we add ConfigSet operations that are long lived and block important Collection operations). If the interface between the Overseer and the ConfigSet handler doesn't refer to collections, we don't need to change anything outside of the Overseer if we change the processing in the future.

          There are already different QoS expectations within the existing operations. For example, operations for different collections never block each other and operations such as cluster status, overseer status and request status never block. However, they are all managed by the same overseer and it can continue to be the case. Yes, what operations block what is not formally defined or enforced, which is something that can use some love.

          The only reason why it is called a OverseerCollectionQueue is to indicate that it is the queue for OverseerCollectionProcessor.

          That can be indicated with a variable/method name, not the type name.

          Sure it could be if it is a generic queue but it is only used for the overseer collection processor and I don't see the need for another queue in Solr right now.

          I don't wish to make the API/tasks pluggable so I understand your concern. That being said, there is a middle ground between API/tasks being pluggable and putting everything in the collection processor. All I'm arguing for is clean interfaces. Take SOLR-7855 as an example; because we had Overseer/role related commands in the collection processor it made refactoring much more difficult. Doing what you suggest in my opinion would have the same effect

          I understand what you are saying. I did the same for Overseer in SOLR-6554 which grouped all related operations and moved them into their own classes (ClusterStateMutator, SliceMutator etc). In fact, I'd argue that SOLR-7855 didn't go far enough – it'd be great to have individual operations completely separate from the message processor such that they can be easily unit tested. I am very much on board with that.

          I'm just a bit confused why we have an interface if we have just one implementation (YAGNI!) e.g. OverseerMessageHandler and OverseerMessageHandlerSelector. Similarly, OverseerCollectionProcessor doesn't add much over OverseerProcessor except for the static getOverseerMessageHandlerSelector method.

          I don't think the dispatching here is complex and it is completely contained in the Overseer. After SOLR-7855 we have an OverseerCollectionMessageHandler to handle overseer collection messages. If you are suggesting throwing the ConfigSet related commands in there (from OverseerConfigSetMessageHandler), you've just moved the dispatch code somewhere else.

          I was referring to the OverseerMessageHandlerSelector actually. I assumed that you foresee more than one implementation in the future which would make the dispatching more complex, hence the comment. So to dispatch a request, at level 1 you have the OverseerMessageHandlerSelector, at level 2 you have an OverseerMessageHandler, and at level 3 you have the processMessage inside the OverseerMessageHandler which sends the request to the right handler method. This is the complexity that I was referring to. Perhaps we can get rid of OverseerMessageHandlerSelector?
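The dispatch chain being questioned above can be sketched in miniature; the interface names mirror the discussion, but the code is illustrative only:

```java
// Sketch of the dispatch levels described above: a selector picks a
// MessageHandler (level 1), and that handler's processMessage routes the
// message to the right operation (levels 2 and 3). Illustrative only.
public class DispatchLevels {
    interface OverseerMessageHandler { String processMessage(String message); }
    interface HandlerSelector { OverseerMessageHandler selectOverseerMessageHandler(String message); }

    static String dispatch(HandlerSelector selector, String message) {
        OverseerMessageHandler handler = selector.selectOverseerMessageHandler(message); // level 1
        return handler.processMessage(message);                                          // levels 2 and 3
    }
}
```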

          Sorry for the vague comment earlier. I don't want to block you any more than I already have. We can always refactor this in the future. Thank you for cleaning up the mess in OverseerCollectionProcessor!

          Gregory Chanan added a comment -

          In my mind, it makes the opposite impression. The fact that we have a ZkController.getOverseerCollectionQueue and a different ZkController.getOverseerConfigSetQueue() suggests that the two queues are different but they are not, at least not yet. So why does this feature try to suggest that their implementation might be different at all? These things are easily changed so let's refactor it when we actually need to have different queues.

          I think part of the difference of opinion here is you are viewing the interface as "this is the collection queue and this is the configset queue" – given that the interface is a queue your line of thinking makes sense, but perhaps having queue as part of the interface is leaking too many implementation details already. I'm viewing the interface as "this is where I send Collection operation requests and this is where I send ConfigSet operation requests." There's no same vs separate queue discussion if that is the interface. If that were the interface, you would be arguing for a single "this is where I send Overseer operation requests."

          To be honest, I don't think this makes a huge difference either way right now. The central issue, imo, is how the RequestHandler differentiates which messages are intended for which MessageHandler. In the naive way of just throwing everything in the OCP (not saying you are arguing for that), you'd have conflicts between, say, Collection.CREATE and ConfigSet.CREATE and would need to make sure all the names don't conflict (a mess). So, you'd either need to differentiate the message at the Overseer interface level "this is where I send Collection requests and this is where I send ConfigSet operation requests" (this is essentially what I've chosen) or each handler puts enough content in the message so that the overseer can differentiate itself. The latter could certainly be the right way to go – I just didn't choose it because 1) it would require changing the existing message format and we'd have to deal with backwards incompatibility 2) we'd need to invent some grouping concept of operations (what are CollectionActions vs ConfigSetActions – groups of actions? action sets?). If we solve #1 and #2 I certainly have no objection to having a single "this is where I send Overseer operation requests" interface.
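
          The second option (each handler putting an explicit group marker in the message so a single Overseer entry point can route it) could be sketched roughly as follows. All names here are illustrative stand-ins, not Solr's actual classes:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: illustrative stand-ins, not Solr's actual classes.
class OverseerDispatchSketch {
    interface MessageHandler {
        String processMessage(String operation);
    }

    private final Map<String, MessageHandler> handlersByGroup = new HashMap<>();

    void register(String group, MessageHandler handler) {
        handlersByGroup.put(group, handler);
    }

    // A message like "collection:CREATE" names its group explicitly, so
    // Collection.CREATE and ConfigSet.CREATE can never collide even though
    // everything flows through one entry point.
    String dispatch(String message) {
        String[] parts = message.split(":", 2);
        if (parts.length < 2) {
            throw new IllegalArgumentException("message has no group prefix: " + message);
        }
        MessageHandler handler = handlersByGroup.get(parts[0]);
        if (handler == null) {
            throw new IllegalArgumentException("no handler for group: " + parts[0]);
        }
        return handler.processMessage(parts[1]);
    }
}
```

          This is just the "single Overseer interface" shape; it illustrates why point #2 above (a grouping concept for operations) would have to be invented before it could replace the per-API queues.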

          There are already different QoS expectations within the existing operations. For example, operations for different collections never block each other and operations such as cluster status, overseer status and request status never block. However, they are all managed by the same overseer and it can continue to be the case. Yes, what operations block what is not formally defined or enforced, which is something that can use some love.

          Sure, that's just a hypothetical. I think what I wrote above is the central issue.

          I understand what you are saying. I did the same for Overseer in SOLR-6554 which grouped all related operations and moved them into their own classes (ClusterStateMutator, SliceMutator etc). In fact, I'd argue that SOLR-7855 didn't go far enough – it'd be great to have individual operations completely separate from the message processor such that they can be easily unit tested. I am very much on board with that.
          I'm just a bit confused why we have an interface if we have just one implementation (YAGNI!) e.g. OverseerMessageHandler and OverseerMessageHandlerSelector. Similarly, OverseerCollectionProcessor doesn't add much over OverseerProcessor except for the static getOverseerMessageHandlerSelector method.

          Well there are two OverseerMessageHandlers now. I see below your concern is mainly with the Selector.

          I was referring to the OverseerMessageHandlerSelector actually. I assumed that you foresee more than one implementation in the future which would make the dispatching more complex, hence the comment. So to dispatch a request, at level 1, you have the OverseerMessageHandlerSelector and at level 2, you have an OverseerMessageHandler and at level 3, you have the processMessage inside the OverseerMessageHandler which sends the request to the right handler method. This is the complexity that I was referring to. Perhaps, we can get rid of OverseerMessageHandlerSelector?

          The OverseerMessageHandlerSelector was really just some scaffolding to help me with the refactor. I don't have an objection to making a single non-interface implementation of it. Or even a canonical implementation if we adopt the "single this is where I send Overseer operation requests" interface (e.g. it could just do some generic logic mapping the expanded info in the message with the set of available message handlers). That should be another jira though.

          Mark Miller added a comment -

          This all looks pretty nice and well tested to me.

          RE the API for accessing the queue, I first leaned towards Shalin's thinking, but Greg has sold me. What queue you get is really an implementation detail. As long as this is commented nicely somewhere, I think it makes sense.

          General Notes:

          I think having separate end points for /collections and /configs and separating the code the best we can as well, rather than dumping it all in one class, makes sense to me.

          Sometimes we use ConfigSet and sometimes ConfigSets? Do we have the right distinction? Can we call it out explicitly somewhere?

          More comments giving overview and rationale of class breakup? As Shalin noted, the class hierarchy is a bit denser to get a handle on.

          Not the biggest fan of adding more tests with easymock - this stuff is such a pain to ramp up on for most committers when refactoring or making changes. I guess what can you do though. Many committers have expressed dislike with having to deal with the collections API test like this, but it’s hard to argue around once someone presents working, useful tests. Some deeper things are just difficult to build simple mocks for.

          OverseerCollectionConfigSetProcessor

          • Typo - * 1) collection-related Overseer messagess
          • Should we just pass the zkclient and simplify this constructor a bit? These boilerplate-type classes have such long constructors.

          Overseer.getCollectionQueue(zkStateReader.getZkClient(), stats),
          Overseer.getRunningMap(zkStateReader.getZkClient()),
          Overseer.getCompletedMap(zkStateReader.getZkClient()),
          Overseer.getFailureMap(zkStateReader.getZkClient())
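
          That suggestion amounts to collapsing the four pre-built collaborators into a single zkClient argument; a minimal sketch, with stand-in class names and ZooKeeper paths that are illustrative rather than confirmed:

```java
// Sketch only: ZkClient and DistributedQueue are stand-ins, and the
// ZooKeeper paths are illustrative.
class ProcessorDepsSketch {
    static class ZkClient { }

    static class DistributedQueue {
        final ZkClient zk;
        final String path;
        DistributedQueue(ZkClient zk, String path) {
            this.zk = zk;
            this.path = path;
        }
    }

    final DistributedQueue workQueue;
    final DistributedQueue runningMap;
    final DistributedQueue completedMap;
    final DistributedQueue failureMap;

    // One zkClient argument; the constructor derives the four collaborators
    // itself instead of receiving them pre-built by the caller.
    ProcessorDepsSketch(ZkClient zk) {
        this.workQueue = new DistributedQueue(zk, "/overseer/collection-queue-work");
        this.runningMap = new DistributedQueue(zk, "/overseer/collection-map-running");
        this.completedMap = new DistributedQueue(zk, "/overseer/collection-map-completed");
        this.failureMap = new DistributedQueue(zk, "/overseer/collection-map-failure");
    }
}
```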

          Tests

          I looked a bit at what new tests I could easily overwhelm.

          TestConfigSetsAPIExclusivity

          • Beasting Test fails:
             [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestConfigSetsAPIExclusivity -Dtests.method=testAPIExclusivity -Dtests.seed=586D6E5E92126B0D -Dtests.slow=true -Dtests.locale=ar_LB -Dtests.timezone=America/Nassau -Dtests.asserts=true -Dtests.file.encoding=UTF-8
             [junit4] FAILURE  190s | TestConfigSetsAPIExclusivity.testAPIExclusivity <<<
             [junit4]    > Throwable #1: java.lang.AssertionError: Unexpected exception: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://127.0.0.1:50330/solr: create the configset time out:180s expected:<0> but was:<1>
             [junit4]    > 	at __randomizedtesting.SeedInfo.seed([586D6E5E92126B0D:251877A5CBB6D244]:0)
             [junit4]    > 	at org.apache.solr.cloud.TestConfigSetsAPIExclusivity.testAPIExclusivity(TestConfigSetsAPIExclusivity.java:95)
             [junit4]    > 	at java.lang.Thread.run(Thread.java:745)
             [junit4]   2> 189719 INFO  (SUITE-TestConfigSetsAPIExclusivity-seed#[586D6E5E92126B0D]-worker) [    ] o.a.s.SolrTestCaseJ4 ###deleteCore
          
             [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestConfigSetsAPIExclusivity -Dtests.method=testAPIExclusivity -Dtests.seed=A74936D70D0532AD -Dtests.slow=true -Dtests.locale=ar_MA -Dtests.timezone=America/Resolute -Dtests.asserts=true -Dtests.file.encoding=UTF-8
             [junit4] FAILURE  188s | TestConfigSetsAPIExclusivity.testAPIExclusivity <<<
             [junit4]    > Throwable #1: java.lang.AssertionError: Unexpected exception: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://127.0.0.1:40187/solr: delete the configset time out:180s expected:<0> but was:<1>
             [junit4]    > 	at __randomizedtesting.SeedInfo.seed([A74936D70D0532AD:DA3C2F2C54A18BE4]:0)
             [junit4]    > 	at org.apache.solr.cloud.TestConfigSetsAPIExclusivity.testAPIExclusivity(TestConfigSetsAPIExclusivity.java:95)
             [junit4]    > 	at java.lang.Thread.run(Thread.java:745)
          

          Everything else looks okay to me.

          Gregory Chanan added a comment -

          Sometimes we use ConfigSet and sometimes ConfigSets? Do we have the right distinction? Can we call it out explicitly somewhere?

          I tried to follow the Collection(s) example, sometimes it is Collection (e.g. OverseerCollectionProcessor), sometimes it is Collections (CollectionsHandler). I agree that this can be cleaned up, but would rather do it in a future patch.

          More comments giving overview and rationale of class breakup? As Shalin noted, the class hierarchy is a bit denser to get a handle on.

          Good idea, I'll probably just get rid of the Selector interface as Shalin suggested and see where we stand there.

          Not the biggest fan of adding more tests with easymock - this stuff is such a pain to ramp up on for most committers when refactoring or making changes. I guess what can you do though. Many committers have expressed dislike with having to deal with the collections API test like this, but it’s hard to argue around once someone presents working, useful tests. Some deeper things are just difficult to build simple mocks for.

          There's actually no EasyMock additions in this patch – that's just showing up in the diff for some reason because some of the files are renamed. I'll make sure to use "svn move" or whatever is the correct command when I commit.

          Typo - * 1) collection-related Overseer messagess

          Will fix.

          Should we just pass the zkclient and simplify this constructor a bit? These boilerplate type class have such long constructors.

          Overseer.getCollectionQueue(zkStateReader.getZkClient(), stats),
          Overseer.getRunningMap(zkStateReader.getZkClient()),
          Overseer.getCompletedMap(zkStateReader.getZkClient()),
          Overseer.getFailureMap(zkStateReader.getZkClient())

          That's a good idea, was going to leave for another JIRA but we could do it here.

          Beasting Test fails:

          Not too concerning because they are timeouts, most likely just means we need to scale down the number of iterations. I'll try out the beasting script.

          Yonik Seeley added a comment -

          Not the biggest fan of adding more tests with easymock - this stuff is such a pain to ramp up on for most committers when refactoring or making changes.

          I'll +1 that in general (I know Greg said he's not adding additional easymock tests here)... there are probably niches where it's the best fit, but in the general case it seems prone to over-test, making it difficult to refactor.

          Gregory Chanan added a comment -

          Should we just pass the zkclient and simplify this constructor a bit? These boilerplate type class have such long constructors.

          Overseer.getCollectionQueue(zkStateReader.getZkClient(), stats),
          Overseer.getRunningMap(zkStateReader.getZkClient()),
          Overseer.getCompletedMap(zkStateReader.getZkClient()),
          Overseer.getFailureMap(zkStateReader.getZkClient())

          Actually, I think this is a little more complicated. The runningMap/completedMap/failureMap are used by the CollectionsHandler to check for async tasks. If the collection queue is collection specific the maps should be as well. It probably makes sense to just group this stuff together in a single object, like "OverseerTaskQueueState" or "OverseerTaskQueueAsyncState" that just holds these objects. Then there can be just a "getCollectionQueueState" and "getConfigSetQueueState" or whatever. That would simplify the constructor above as well. But I think this should be in another jira.
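
          A hypothetical holder along the lines of that "OverseerTaskQueueState" idea could look like this (names and the generic signature are illustrative, not actual Solr classes):

```java
// Hypothetical grouping object: one holder bundles a work queue with its
// async running/completed/failure maps, so each API could get a single
// accessor (e.g. getCollectionQueueState) instead of four.
class OverseerTaskQueueStateSketch<Q, M> {
    final Q workQueue;
    final M runningMap;
    final M completedMap;
    final M failureMap;

    OverseerTaskQueueStateSketch(Q workQueue, M runningMap, M completedMap, M failureMap) {
        this.workQueue = workQueue;
        this.runningMap = runningMap;
        this.completedMap = completedMap;
        this.failureMap = failureMap;
    }
}
```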

          Gregory Chanan added a comment -

          Here's a new version of the patch. Changes:

          • rebased to latest trunk
          • Fixed typo pointed out by Mark
          • Added comment to OverseerCollectionMessageHandler
          • Renamed OverseerProcessor -> OverseerTaskProcessor to go along with the OverseerTaskQueue name.

          I plan on committing this soon if I don't hear any objections and will file follow on jiras for the suggestions above.

          Gregory Chanan added a comment -

          Oh, and I forgot to mention: I was able to reproduce the failures with the beasting script, and just turning down the number of iterations allowed it to pass with 100 iterations, 8 concurrent.

          ASF subversion and git services added a comment -

          Commit 1698043 from gchanan@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1698043 ]

          SOLR-7789: Introduce a ConfigSet management API

          ASF subversion and git services added a comment -

          Commit 1698072 from gchanan@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1698072 ]

          SOLR-7789: fix jira number in CHANGES.txt

          ASF subversion and git services added a comment -

          Commit 1698079 from gchanan@apache.org in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1698079 ]

          SOLR-7789: Introduce a ConfigSet management API

          Gregory Chanan added a comment -

          Thanks for the reviews Mark and Shalin. Committed to 5.4 and trunk.

          I tried to incorporate all the discussion into SOLR-7985; if there's something you think I missed, please feel free to add it there.

          Mark Miller added a comment -

          I just hit one of those fails on a normal 'ant test' run. I think we may have to raise this 180 second timeout to something a little more hearty.

             [junit4] FAILURE  181s J1  | TestConfigSetsAPIExclusivity.testAPIExclusivity <<<
             [junit4]    > Throwable #1: java.lang.AssertionError: Unexpected exception: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://127.0.0.1:57409/solr: create the configset time out:180s expected:<0> but was:<2>
             [junit4]    > 	at __randomizedtesting.SeedInfo.seed([36DBA95D885A778C:4BAEB0A6D1FECEC5]:0)
             [junit4]    > 	at org.apache.solr.cloud.TestConfigSetsAPIExclusivity.testAPIExclusivity(TestConfigSetsAPIExclusivity.java:95)
             [junit4]    > 	at java.lang.Thread.run(Thread.java:745)
             [junit4]   2> 361809 INFO  (SUITE-TestConfigSetsAPIExclusivity-seed#[36DBA95D885A778C]-worker) [    ] o.a.s.SolrTestCaseJ4 ###deleteCore
          
          
          
          Gregory Chanan added a comment -

          I'll increase the timeout to 5 minutes and lower the number of trials. If this hits us again I'll change the tests to make timeouts okay if a decent percentage of requests succeed.
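
          That percentage-based fallback could be as simple as the following check (the threshold is illustrative, not a value from the patch):

```java
// Sketch of the fallback: rather than failing the test on any timeout,
// accept the run when at least a minimum fraction of the concurrent
// create/delete requests succeeded.
class SuccessRateCheckSketch {
    static boolean acceptable(int succeeded, int attempted, double minSuccessFraction) {
        return attempted > 0 && (double) succeeded / attempted >= minSuccessFraction;
    }
}
```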

          ASF subversion and git services added a comment -

          Commit 1698436 from gchanan@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1698436 ]

          SOLR-7789: Increase ConfigSets API timeout

          ASF subversion and git services added a comment -

          Commit 1698437 from gchanan@apache.org in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1698437 ]

          SOLR-7789: Increase ConfigSets API timeout

          Upayavira added a comment -

          Is there a LIST option? I'd love to use this in the admin UI and would rather avoid having to do Zookeeper lookups in order to make it work. All I need (initially) is to be able to get a list of configset names.

          Noble Paul added a comment -

          I wish there were a few examples in the description, so that I don't have to go through the tests to figure out how it looks to a user.

          Mark Miller added a comment -

          Uh, the first comment is pretty detailed. Stretch that imagination of yours.

          Noble Paul added a comment -

          It does not specify what the endpoint is.

          Mark Miller added a comment -

          It's /configs. It's mentioned a few times in the comments and is in the patch.
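Putting the endpoint together with the parameters from the issue description (action=CREATE, name, baseConfigSet, and configSetProp.name=value overrides), a CREATE call can be sketched as follows. The helper function, host, and ConfigSet names are illustrative, and configSetProp.immutable is just one example of a property override:

```shell
# Build a ConfigSets API CREATE request URL from the parameters documented
# in the issue description. Helper name and arguments are hypothetical.
configset_create_url() {
  base="$1"; name="$2"; base_configset="$3"
  echo "${base}/admin/configs?action=CREATE&name=${name}&baseConfigSet=${base_configset}&configSetProp.immutable=false"
}

# Against a running SolrCloud instance you would issue something like:
# curl "$(configset_create_url http://localhost:8983/solr myConfigSet myBaseConfigSet)"
```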

          Noble Paul added a comment -

          It's there in the comments (and the patch). But there is no harm in putting it right there in the description. The description is editable for a reason.

          Gregory Chanan added a comment -

          Is there a LIST option? I'd love to use this in the admin UI and would rather avoid having to do Zookeeper lookups in order to make it work. All I need (initially) is to be able to get a list of configset names.

          There is not currently, but that's a good idea. I filed SOLR-7995 to add it.

          I wish there are a few examples added in the description so that I don't have to go through the tests to figure out how it looks like to a user

          I'm happy to turn the first comment into a page on the cwiki. I modeled that comment after the https://cwiki.apache.org/confluence/display/solr/Collections+API page so it would be easy to do. Is it okay to add the page now, even though there is no release with this feature yet? That seems like a more fruitful place to have discussions about making it easier to understand for end users.
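For reference, the LIST operation requested above was filed as SOLR-7995 and followed the same endpoint style; a hedged sketch (the action name and response-format parameter are assumptions based on the follow-up issue, not this patch):

```shell
# Hypothetical helper: build a ConfigSets API LIST request URL
# (operation tracked in SOLR-7995, not part of this patch).
configset_list_url() {
  echo "$1/admin/configs?action=LIST&wt=json"
}

# e.g. curl "$(configset_list_url http://localhost:8983/solr)"
```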

          Noble Paul added a comment -

          There is no wrong time to add a wiki doc.

          Upayavira added a comment -

          Yes, the wiki is always the source for the next, as yet unreleased, reference guide.

          Upayavira added a comment -

          and thanks for creating SOLR-7995!

          Gregory Chanan added a comment -

          FYI I created a cwiki page on the API here: https://cwiki.apache.org/confluence/display/solr/ConfigSets+API


            People

            • Assignee: Gregory Chanan
            • Reporter: Gregory Chanan
            • Votes: 1
            • Watchers: 11
