Thanks for taking a look Varun!
Can we move the test to not use AbstractFullDistribZkTestBase and use SolrCloudTestCase instead? It should make the test a lot faster in my experience. The current test on my machine took 55s .
In testCreateCollectionCleanup() after we confirm that the collection doesn't exist via the LIST api call, can we try to create a collection without the bogus dataDir param to make sure it gets created successfully.
There are other tests that check for exactly that, I think we can skip that and save time here.
Maybe we can randomize what param we pass wrongly? We can try with a wrong configName , try with legacyCloud=false etc. It might trigger different artifacts which get left behind to give better test coverage?
wrong configName etc, fail fast i.e. without creating any core. If they don't they should. There are wrong config name issues that we could simulate e.g. create a collection, and during the process, force remove the config from zk so that a core creation fails but I think that isn't something that we should include with this issue.
A separate test which uses more than one replica so that we know all the cores were cleaned up properly?
Yes, but I think the deleteCollection would already be testing the same. We just reuse the code.
In OverseerCollectionMessageHandler there is one INFO and one WARN statement with a similar message. Maybe we should just put across a message like "failed to create collection. Cleaning up artifacts .." ?
warn ? Can you point me to it? Here's what I see:
log.error("Cleaning up collection [" + collectionName + "]." ); and
log.info("Cleaned up collection [" + collectionName + "].");
log.debug("Finished create command on all shards for collection: "
The cleanup delete operation does not pass along the async param which might have been used.
I thought about that but decided against passing that param. Is there a reason why you would want to pass that along? We aren't passing/appending the core responses and if the request was async, the user already received a response about request submission.
Slightly unrelated but in createCollection instead of catching Exception, catching IOException should be enough and we don't even need to catch SolrException ?
+1, but I didn't want to include it here. I ideally wanted to fail fast and throw an exception from here (see my previous patches) but that would change the behavior for end users so decided against it for now.