Solr / SOLR-3425

CloudSolrServer can't create cores when using the zkHost based constructor

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.1
    • Component/s: SolrCloud
    • Labels:
      None

      Description

      When cores are created programmatically against a running SolrCloud instance, CloudSolrServer uses the slices' node information to feed the underlying LBHttpSolrServer. Core creation therefore fails, because a new collection has no slices yet (it is still to be created).
      This happens with the CloudSolrServer constructor that takes the ZK host as its only parameter. It can be avoided by using the constructor that also takes the list of Solr URLs, in which case the underlying LBHttpSolrServer is actually used to make the core creation request.
      However, it would be good to use the live-node information from the ZK host to automatically issue a core creation command on one of the underlying Solr hosts, without having to specify the full list of URLs beforehand.

      The scenario: one wants to create a collection with N shards, so the client sends N core creation requests for the same collection. SolrCloud should then take care of choosing the host on which to issue each core creation request and of updating the cluster state.

      1. SOLR-3425-test.patch
        3 kB
        Tommaso Teofili

        Issue Links

          Activity

          Steve Rowe made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Tommaso Teofili made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Tommaso Teofili added a comment -

          This issue was opened before the Collections API was designed, so things have changed over time. Apart from that, as long as SOLR-4140 is in place, the use case I opened this issue for is covered, so I'll close it.
          Thanks to Per Steffensen for helping out.

          Per Steffensen made changes -
          Link This issue relates to SOLR-4140 [ SOLR-4140 ]
          Per Steffensen added a comment - - edited

          You talk about using CloudSolrServer and/or LBHttpSolrServer for access to the Core Admin API. This is a little strange because:

          • A Core Admin API request should be sent to a specific Solr node in order for you to control where the shard is created (or removed from)
          • CloudSolrServer and LBHttpSolrServer are all about sending requests that can end up going to a (random) node among several nodes
            With the quick-fix code above (Tommaso Teofili 01/May/12) you will end up having your shard created on a random node among all live nodes, which is very rarely what you want.

          So as long as we are talking about accessing the Core Admin API, you probably always want to use HttpSolrServer, which is aimed at sending the request to a specific node.
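
          To illustrate the point about targeting one node, here is a minimal sketch that builds a Core Admin CREATE request URL for an explicitly chosen node, using only the JDK. The class and method names, the host name and the core/collection/shard names are hypothetical; the `action`, `name`, `collection` and `shard` parameters are the Core Admin parameters as I understand them for SolrCloud, so check them against the release you run.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Sketch: builds a Core Admin CREATE URL aimed at one explicitly chosen Solr node. */
final class CoreAdminUrl {

    /**
     * @param nodeBaseUrl base URL of the node that should host the core,
     *                    e.g. "http://host1:8983/solr" (hypothetical host)
     */
    static String createCore(String nodeBaseUrl, String coreName,
                             String collection, String shard) {
        // The caller picks the node, so the shard lands exactly where intended.
        return nodeBaseUrl + "/admin/cores?action=CREATE"
                + "&name=" + encode(coreName)
                + "&collection=" + encode(collection)
                + "&shard=" + encode(shard);
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(createCore(
                "http://host1:8983/solr", "mycollection_shard1", "mycollection", "shard1"));
    }
}
```

          The URL can then be sent to that node with any HTTP client, or the same parameters can be expressed through HttpSolrServer as suggested above. Nothing here load-balances, which is the point.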

          But when talking about creating an entire collection consisting of many shards, it is certainly meaningful to use CloudSolrServer. To create entire collections (without having to create each shard yourself using the Core Admin API) we now have the Collection API in 4.0.0. The Collection API can be used through CloudSolrServer, except...

          • You can't create your first collection through CloudSolrServer
          • You can't create a collection through a CloudSolrServer whose default-collection is the collection you want to create
          • etc

          ...basically because CloudSolrServer wants an existing collection (pointed to by its default-collection or a collection-name provided in the actual request) before it can do anything.
          This will be fixed with SOLR-4140, but is not in 4.0.0.

          Other things not in Collection API of 4.0.0

          • You can't have more than one shard per collection on a single node - fixed in SOLR-4114
          • You can't specify which Solr nodes the shards for a new collection are allowed to be spread across - they are just spread across all live Solr nodes - fixed in SOLR-4120
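
          As a rough sketch of what a Collection API request looks like, the following JDK-only code builds a CREATE URL that can be sent to any live node. The class name, host and collection name are hypothetical. The `action`, `name`, `numShards` and `replicationFactor` parameters are the Collection API parameters as I know them; `maxShardsPerNode` is taken from later Solr documentation, and treating it as the parameter related to SOLR-4114 is an assumption on my part.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: builds a Collection API CREATE URL; any live node can receive it. */
final class CollectionsApiUrl {

    static String createCollection(String anyNodeBaseUrl, String name, int numShards,
                                   int replicationFactor, Map<String, String> extraParams) {
        // LinkedHashMap keeps the parameters in insertion order for a stable URL.
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("action", "CREATE");
        params.put("name", name);
        params.put("numShards", Integer.toString(numShards));
        params.put("replicationFactor", Integer.toString(replicationFactor));
        // Optional parameters (assumed names), e.g. maxShardsPerNode.
        params.putAll(extraParams);

        StringBuilder url = new StringBuilder(anyNodeBaseUrl).append("/admin/collections");
        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator)
               .append(encode(entry.getKey()))
               .append('=')
               .append(encode(entry.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> extra = new LinkedHashMap<String, String>();
        extra.put("maxShardsPerNode", "2"); // assumed parameter name
        System.out.println(createCollection("http://host1:8983/solr", "mycollection", 4, 1, extra));
    }
}
```

          Unlike a Core Admin request, this one does not say where each shard goes; the cluster decides, which is why a load-balancing client is a reasonable fit here.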

          Please state if you still believe something is missing or unclear. Otherwise, you might want to state that your "problems" are solved with the 4.0.0 Collection API (maybe plus one or more of SOLR-4140, SOLR-4114 and SOLR-4120, which will hopefully be in "the next release") by closing this issue, SOLR-3425.

          Robert Muir made changes -
          Fix Version/s 4.1 [ 12321141 ]
          Fix Version/s 5.0 [ 12321664 ]
          Fix Version/s 4.0 [ 12322551 ]
          Tommaso Teofili added a comment -

          I had a simple patch for making this work; however, I'm not too sure what we need to do here (it made sense before, but with the new Collections API it may not be that urgent).
          Maybe we can defer this and take advantage of SOLR-3488 as soon as that's finished.

          Robert Muir added a comment -

          Mark/Tommaso, can you guys look at this issue? I notice it hasn't been touched in months. Is it going to make 4.0?

          Mark Miller made changes -
          Fix Version/s 5.0 [ 12321664 ]
          Mark Miller made changes -
          Assignee Mark Miller [ markrmiller@gmail.com ]
          Robert Muir made changes -
          Fix Version/s 4.0 [ 12322551 ]
          Fix Version/s 4.0-BETA [ 12322455 ]
          Robert Muir added a comment -

          rmuir20120906-bulk-40-change

          Hoss Man made changes -
          Fix Version/s 4.0 [ 12322455 ]
          Fix Version/s 4.0-ALPHA [ 12314992 ]
          Hoss Man added a comment -

          Bulk fixing the version info for 4.0-ALPHA and 4.0; all affected issues have "hoss20120711-bulk-40-change" in a comment.

          Tommaso Teofili added a comment -

          Mark, I'd commit this quick fix for now so that we solve the bug, and maybe we can start discussing a new collection management API in a different issue.

          Tommaso Teofili added a comment -

          "Perhaps we should just add new collection management api?"

          I think so; that would also help SolrCloud users understand the mind shift from cores to collections.

          For this particular problem, the dummy fix (not widely tested, but all the tests still pass) could be to add the following lines for filling the urlList variable:

              // enable automatic distributed core creation
              if (request instanceof CoreAdminRequest.Create) {
                for (String liveNodeHost : zkStateReader.getCloudState().getLiveNodes()) {
                  urlList.add(new StringBuilder("http://").append(liveNodeHost.replaceAll("_solr", "/solr/")).toString());
                }
              }
              else {
                for (Slice slice : slices.values()) {
              ...
          

          However, I don't like it very much. I think adding proper APIs would be better.
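
          For readers who want to try the node-name conversion from the quick fix in isolation, here is a minimal JDK-only sketch. It assumes live node names have the form host:port_context (for example 127.0.0.1:8983_solr), which is my reading of the snippet above. Note that it splits on the first underscore rather than using replaceAll("_solr", "/solr/") as the quick fix does, so it is a variation and not the patch itself; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: turns ZooKeeper live node names into Solr base URLs. */
final class LiveNodeUrls {

    /** Converts e.g. "127.0.0.1:8983_solr" into "http://127.0.0.1:8983/solr/". */
    static String toBaseUrl(String liveNodeName) {
        // Assumption: the first underscore separates host:port from the web context.
        int separator = liveNodeName.indexOf('_');
        if (separator < 0) {
            throw new IllegalArgumentException("Not a live node name: " + liveNodeName);
        }
        String hostAndPort = liveNodeName.substring(0, separator);
        String context = liveNodeName.substring(separator + 1);
        return "http://" + hostAndPort + "/" + context + "/";
    }

    /** Converts every live node name, preserving the input order. */
    static List<String> toBaseUrls(Iterable<String> liveNodeNames) {
        List<String> urls = new ArrayList<String>();
        for (String name : liveNodeNames) {
            urls.add(toBaseUrl(name));
        }
        return urls;
    }

    public static void main(String[] args) {
        System.out.println(toBaseUrls(Arrays.asList("127.0.0.1:8983_solr", "127.0.0.1:7574_solr")));
    }
}
```

          The resulting list corresponds to what the quick fix puts into urlList, so a load-balancing client fed with it would still pick an arbitrary live node for each request.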

          Mark Miller added a comment -

          I'd like to improve collection creation outside of SolrJ as well, but I think it also makes sense to improve it here. Do you have a proposal yet? Perhaps we should just add a new collection management API? Trying to wrap this stuff in a SolrCore's world gets kind of ugly.

          Tommaso Teofili made changes -
          Field Original Value New Value
          Attachment SOLR-3425-test.patch [ 12525167 ]
          Tommaso Teofili added a comment -

          Test case: set up a 2-node cluster as in example A on the SolrCloud wiki page (http://wiki.apache.org/solr/SolrCloud) and run the attached test.
          The testLBServerCoreCreation() test should pass, while testZKHostCoreCreation() should fail.

          Tommaso Teofili created issue -

            People

            • Assignee:
              Mark Miller
              Reporter:
              Tommaso Teofili
            • Votes:
              2
              Watchers:
              5
