it is hard to push too many small partly-done features.
As committers it's the opposite - it's sometimes hard to consume patches that do a lot of different things.
I could imagine. Im sorry that it is possibly going to be a little bit like that this time. But I believe I will provide great stuff.
It works with multi-document update request, providing fully detailed feedback to the client about which document-updates failed (and why) and which succeeded.
That's great! But it's also a separate feature that we've needed for a while (and I think there has been a JIRA issue open for it for a while).
Yes I agree. See SOLR-3382. The solution I will provide is generic in the way that is supports giving feedback to the client about "parts" of a request that failed. The client will know that the "parts" which are not reported to have failed did succeed. Usefull for any kind of request where parts of it can fail and other parts can succeed. The client provide a "partRef" for all parts, and partial errors from the server each contain a "partRef" in order to let the client know which part(s) failed. I only use it for multi-document (or batch-updates as I see you call it in come places) updates (here each document is a "part"), but I guess it could also be used for other things like
- Reporting which operations of a multi-operation (e.g. one containing both adds, deletes, commits, etc) update that failed.
- Reporting if a single update partly failed (e.g. if using replication and the update was successfull on leader but not on one of more of the replicas)
Looking shortly at you patch, I belive, that if two threads in the server JVM overlaps in a way that is unfortunate enough, then it is possible that data will not be stored or will be overwritten without an exception being thrown to indicate that to the client.
I'm confident in the synchronization/concurrency part - it's just reusing what was put in place for SolrCloud to handle reordering of updates to replicas and is very well tested (see TestRealTimeGet). But please let us know if you see a problem with it, as that would mean a problem with SolrCloud today (even without this patch).
Looking closer at your code I would like to withdraw my concern about your solution. I didnt see that you are doing both "get existing version and check against provided version" and "store updated version" inside the same block governed by two sync-locks
- A readlock on a ReadWriteLock which is writelocked during deleteByQueries (your "vinfo.lockForUpdate"), protecting against concurent deletes (by query)
- A lock on the uniqueKey field (or a hash of it in order to be efficient) of the document (your "synchronized (bucket)"), protecting against concurrent updates and deletes (by id)
So basically I have repeated that locking-pattern in DirectUpdateHandler2 where I chose to place the "unique key constaint"- and "versioning"- check. Guess my locking can be removed if it is really necessary to have this kind of locking "this far up in the callstack" (in DistributedUpdateProcessor.versionAdd). For the purpose of "unique key constraint"- and "versioning"-check I believe it is only necessary to have this kind of locking around the actual updates in DirectUpdateHandler2 (when leader and not peer_sync) - and the smaller you can make synchronized blocks the better. But maybe you have reasons (related to SolrCloud) where you need the locking "this far up in the callstack".
But besides that fact that the locking I did in DirectUpdateHandler2 can be removed I really believe that the actual checking of "unique key constraint"- and "version"-violation logically belong in DirectUpdateHandler2, so I guess I will still try to convince you to take that part of my patch in.
Besides that, I believe your solution just implements what I call semantics-mode "classic-consistency-hybrid" on http://wiki.apache.org/solr/Per%20Steffensen/Update%20semantics. I believe the "classic" mode and "consistency" mode are very nice to have in order to provide complete backward compabitility (classic mode) and a very strict consistency mode where it is not possible to "break the checks" by providing _version_ = 0 (consistency mode).
Finally I believe my solution is a little better with respect to "design" and "separation of concerns", by isolatating the rules and rule-checks in a separate class instead of just coding it directly inside DistributedUpdateProcessor which already deals with a lot of other concerns, but that is another discussion, and should not count as the deciding argument.
Feedback to client is even possible if using ConcurrentSolrServer, because it has been changed to provide a Future from "request" - a Future that will eventually provide the calling client with the result from the request.
This sounds like another great feature! It really deserves it's own issue so people can more easily provide review/feedback and not have it coupled to this issue.
Created the isolated issue - see SOLR-3383
Some other additions that can be handled later that I see:
SolrJ support for easier passing of version on a delete, constants for version, etc
Well basically I have moved VERSION_FIELD = "_version_" from VersionInfo to SolrInputDocument.
Use of "1" as a generic "exists" version (i.e. update document only if it already exists)
Well I started out by wanting that feature (update only if exists, but no exact version check), but you couldnt see the good reason for that. Since then I realized that you are probably right and have removed that exact possibility - of course in classic and classic-consistency-hybrid modes you can still do updates without providing the exact version, but there is error if the document does not exist.
If one document in a batch fails, don't automatically abort, and provide info back about which docs succeeded and which failed (your first point).
That is part of my solution - basically the thing making it necessary to add the ability to send many errors back to the client - SOLR-3382
That last one in particular needs some good discussion and design work and really deserves it's own issue.
Agree. Go discuss and comment on SOLR-3382
Having this current improvement committed in no way blocks any future improvements you may come up with (including deleting the code quoted above and using whatever method you have come up with for checking the versions), and it even uses the same API (or a subset of it) via version and 409. Progress, not perfection!
Agree, so I guess you could go commit, but that would just make my merge-task a little bigger, so I hope you will hold your horses a few days more.
Are there any technical objections to this patch?
Except for the things I mention here and there in this response, no. And those are all things that could be fixed later, going nicely hand in hand with your great philosophy of "progress, not perfection".
It really piggybacks on all of the SolrCloud work done to pass around and check versions, so it's really small.
Yes it is small and beautiful
The API is as follows:
a) If you provide a version > 0 with an add or a delete, the update will fail with a 409 unless the provided version matches exactly the latest in the index for the ID.
b) If you provide a version < 0 with an add or delete, the update will fail with a 409 if the ID exists in the index.
You always provide the same error (except the versions in the message). You do not distinguish between DocumentAlreadyExist, DocumentDoesNotExist and VersionConflict errors as I do. Of course the client would know that it is a DocumentAlreadyExists if he sent a negative version number and got 409 (at least as long as 409 is not used for other kind of conflicts that can also happen when version < 0) and that it is DocumentDoesNotExist or VersionConflict if he sent a positive version number and got 409 (at least ...). But he really doesnt know if it is DocumentDoesNotExist or VersionConflict unless he parses the error message, and that might be important to know because he might want to react differently - e.g.
- On DocumentDoesNotExist: Either do nothing or go do a create (version < 0) of the document
- On VersionConflict: Go refetch the document from Solr, merge in his changes (automatically or by the help from a user) and try update again.
Of course you also need to parse error-type in my solution to know exactly what the problem is, but you need no "rules" on client side (e.g. knowing that if you sent version > 0 and you got version-already-in-store < 0 back, it is DocumentDoesNotExist), and in SolrJ my solution automatically parses error-type and throw corresponding exceptions.
Ad b) Maybe you dont want an error if you try to delete a document which has already been deleted, but probably ok
That's the whole scope of this patch, and I believe is compatible with the larger scope of what Per will provide in the future.