Optimistic locking as a superset of insert/update:
What I already had in mind:
- update only a specific version of the document by specifying its exact version: version=12345
- add a document only if it doesn't already exist (i.e. insert): version=-1
- add a document regardless: don't specify a version
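To make the three cases above concrete, here is a minimal in-memory sketch. It is not Solr code: the `Store` class, its `add` method, and the `VersionConflict` exception are hypothetical names used only for illustration, and the simple counter used to hand out versions is an assumption.

```python
class VersionConflict(Exception):
    """Raised when the supplied version does not match the stored state."""


class Store:
    """Hypothetical in-memory stand-in for an index; not Solr's API."""

    def __init__(self):
        self.docs = {}   # doc id -> (version, fields)
        self.clock = 0   # assumption: versions come from a simple counter

    def add(self, doc_id, fields, version=None):
        current = self.docs.get(doc_id)
        if version is None:
            pass  # no version given: add the document regardless
        elif version == -1:
            # version=-1: insert only if the document does not exist yet
            if current is not None:
                raise VersionConflict(f"{doc_id} already exists")
        else:
            # exact version: update only that specific version
            if current is None or current[0] != version:
                raise VersionConflict(f"{doc_id} is not at version {version}")
        self.clock += 1
        self.docs[doc_id] = (self.clock, fields)
        return self.clock
```

A caller would keep the version returned by one `add` and pass it to the next; a stale version then fails instead of silently overwriting a newer document.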
I still need a little time to evaluate to what extent the version can be used.
So now that I look at it again, it looks like what's missing is your "UPDATE" semantics which would only replace the record if it already existed (a weaker form of the first case... any positive version is OK). But I really wonder how useful those semantics are (only add a doc if it's overwriting an existing doc, regardless of what version or what data it contains?)
If there are use cases, we certainly should be able to do it.
The only-insert-if-not-exists is needed by us. The only-update-if-exists is mostly for consistency with what we know from RDBMSs. Basically it simulates what happens when you do the following in SQL and you have a unique constraint on the id column: 1) will fail with a unique-key constraint error if the document already exists, and 2) will not create the row/doc if it does not already exist.
1) INSERT INTO docs (id, column2, column3,...) VALUES (id-value, value2, value3,...)
2) UPDATE docs SET column2=value2, column3=value3, ... WHERE id=id-value
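The two behaviours can be checked against a real SQL engine. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table layout (a single extra column) and the sample values are assumptions made only to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Unique constraint on the id column, as in the description above.
conn.execute("CREATE TABLE docs (id TEXT UNIQUE NOT NULL, column2 TEXT)")

# First insert succeeds because no row with this id exists yet.
conn.execute("INSERT INTO docs (id, column2) VALUES (?, ?)", ("doc1", "first"))

# 1) A second INSERT with the same id fails with a constraint error.
try:
    conn.execute("INSERT INTO docs (id, column2) VALUES (?, ?)", ("doc1", "second"))
    insert_failed = False
except sqlite3.IntegrityError:
    insert_failed = True

# 2) An UPDATE for an id that does not exist changes nothing and creates nothing.
cursor = conn.execute("UPDATE docs SET column2 = ? WHERE id = ?", ("value", "missing"))
rows_updated = cursor.rowcount
row_count = conn.execute("SELECT COUNT(*) FROM docs").fetchone()[0]
```

Note that the UPDATE does not raise an error in this case; it only reports zero affected rows, so the caller has to check the count to notice that nothing was updated.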
RDBMS people are used to an update operation that does not create a row/document if it has already been deleted. I will consider not making that feature. It is only there to give a consistent experience compared to what you are used to from RDBMSs, and actually, seen from a distance, I think an "update" operation that creates stuff if it does not exist is not logical (it simply does not follow from the word "update").
Right now I believe the solution will be that you will have the following URL extensions:
a) .../solr/.../update, the one already existing in Solr with unchanged semantics
b) .../solr/.../database/update, that updates if the document already exists and does nothing if it does not already exist. And when versioning is activated (SOLR-3178) it only updates if the correct version is given, and gives a VersionConflict error if the document exists but the version is not correct.
c) .../solr/.../database/insert, that creates a new document if document does not already exist. Fails with DocumentAlreadyExists error if document already exists.
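A rough sketch of how b) and c) would behave next to a), written against a plain dict rather than Solr. The function names, the `versioning` flag, and the way versions are incremented are all assumptions; only the error names VersionConflict and DocumentAlreadyExists come from the proposal.

```python
class VersionConflict(Exception):
    pass


class DocumentAlreadyExists(Exception):
    pass


def update(index, doc_id, fields):
    """a) existing semantics: add or overwrite unconditionally."""
    old = index.get(doc_id)
    index[doc_id] = {"version": (old["version"] + 1) if old else 1, "fields": fields}


def database_update(index, doc_id, fields, version=None, versioning=True):
    """b) update only if the document exists; returns False if it did nothing."""
    existing = index.get(doc_id)
    if existing is None:
        return False  # document does not exist: do nothing
    if versioning and version != existing["version"]:
        raise VersionConflict(f"{doc_id}: expected version {existing['version']}")
    index[doc_id] = {"version": existing["version"] + 1, "fields": fields}
    return True


def database_insert(index, doc_id, fields):
    """c) create only if the document does not exist yet."""
    if doc_id in index:
        raise DocumentAlreadyExists(doc_id)
    index[doc_id] = {"version": 1, "fields": fields}
```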
Then you can keep using Solr exactly as you are used to, and you can start using the new "database semantics" features if you want that. I might create an optional config for DirectUpdateHandler2 where you can deactivate the stuff behind a). This can be used when you don't trust clients to use a) correctly in a setup where you want to ensure consistency under high concurrent load.
As far as what _version_ is, it's new and used for SolrCloud to handle reorders of updates to replicas (among other things).
The leader shard decides what the version of a document should be (versions only increase), and forwards the doc with the version to the replicas.
If a replica receives the same doc with a lower version, it knows that it can safely drop it because it already has a newer version.
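The reorder handling described here can be illustrated with a small simulation. This is only a sketch of the idea, not SolrCloud's implementation: the `Leader` and `Replica` classes and the counter-based version assignment are hypothetical.

```python
import itertools


class Leader:
    """Hypothetical leader that stamps each update with an increasing version."""

    def __init__(self):
        self._versions = itertools.count(1)  # assumption: a simple counter

    def assign(self, doc_id, fields):
        return {"id": doc_id, "_version_": next(self._versions), "fields": fields}


class Replica:
    """Hypothetical replica that drops updates older than what it holds."""

    def __init__(self):
        self.docs = {}

    def receive(self, update):
        current = self.docs.get(update["id"])
        if current is not None and update["_version_"] < current["_version_"]:
            return False  # lower version: a newer one is already stored, drop it
        self.docs[update["id"]] = update
        return True


leader = Leader()
first = leader.assign("doc1", {"title": "old"})
second = leader.assign("doc1", {"title": "new"})

replica = Replica()
applied_second = replica.receive(second)  # arrives first due to reordering
applied_first = replica.receive(first)    # arrives late and is dropped
```

After both deliveries the replica still holds the document with the title "new", even though the updates arrived in the wrong order.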
Cool. I understand a little better now. So no (Wiki) documentation written yet?