Details
- Bug
- Status: Closed
- Minor
- Resolution: Fixed
- 9.0
- None
Description
The documentation is unclear about how auto commits actually work in SolrCloud. A mailing list reply by Erick Erickson proved to be enlightening.
Erick's reply verbatim:
Each node has its own timer that starts when it receives an update.
So in your situation, 60 seconds after any give replica gets it’s first
update, all documents that have been received in the interval will
be committed.
But note several things:
1> commits will tend to cluster for a given shard. By that I mean
they’ll tend to happen within a few milliseconds of each other
‘cause it doesn’t take that long for an update to get from the
leader to all the followers.
2> this is per replica. So if you host replicas from multiple collections
on some node, their commits have no relation to each other. And
say for some reason you transmit exactly one document that lands
on shard1. Further, say nodeA contains replicas for shard1 and shard2.
Only the replica for shard1 would commit.
3> Solr promises eventual consistency. In this case, due to all the
timing variables it is not guaranteed that every replica of a single
shard has the same document available for search at any given time.
Say doc1 hits the leader at time T and a follower at time T+10ms.
Say doc2 hits the leader and gets indexed 5ms before the
commit is triggered, but for some reason it takes 15ms for it to get
to the follower. The leader will be able to search doc2, but the
follower won’t until 60 seconds later.
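The timing in point 3 can be illustrated with a small simulation. This is only a toy sketch of the behaviour described in the reply, not Solr code: the `Replica` class, the `simulate` function and the `AUTO_COMMIT_MS` constant are hypothetical names, and the model assumes a single hard auto-commit timer per replica that starts on the first uncommitted update. The delay for doc2 is set to 20 ms rather than the 15 ms mentioned in the reply, so that its arrival falls clearly after the follower's commit instead of on the same millisecond.

```python
from dataclasses import dataclass, field
from typing import Optional

AUTO_COMMIT_MS = 60_000  # assumed auto-commit interval, the 60 s from the reply


@dataclass
class Replica:
    """Toy model of one replica with its own auto-commit timer (not Solr code)."""
    name: str
    pending: list = field(default_factory=list)      # docs received, not yet committed
    deadline: Optional[int] = None                   # time at which the running timer fires
    visible_at: dict = field(default_factory=dict)   # doc id -> time it became searchable

    def advance(self, now):
        """Fire the timer if its deadline is at or before `now`."""
        if self.deadline is not None and self.deadline <= now:
            for doc in self.pending:
                self.visible_at[doc] = self.deadline
            self.pending.clear()
            self.deadline = None

    def receive(self, doc, now):
        """Accept an update; the first uncommitted update starts the timer."""
        self.advance(now)
        if self.deadline is None:
            self.deadline = now + AUTO_COMMIT_MS
        self.pending.append(doc)


def simulate(arrivals):
    """arrivals: iterable of (time_ms, replica_name, doc_id) tuples."""
    replicas = {}
    for t, name, doc in sorted(arrivals):
        replicas.setdefault(name, Replica(name)).receive(doc, t)
    for replica in replicas.values():
        replica.advance(float("inf"))  # let any timer still running fire
    return {name: r.visible_at for name, r in replicas.items()}


# doc1 hits the leader at T=0 and the follower 10 ms later.
# doc2 hits the leader 5 ms before its commit and the follower 20 ms after that.
ARRIVALS = [
    (0, "leader", "doc1"),
    (10, "follower", "doc1"),
    (59_995, "leader", "doc2"),
    (60_015, "follower", "doc2"),
]
```

Under these assumptions, `simulate(ARRIVALS)` reports both documents searchable on the leader at 60000 ms, while on the follower doc1 becomes searchable at 60010 ms and doc2 only at 120015 ms, because doc2 arrives after the follower's timer has fired and so starts a new 60-second interval.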
Perhaps the subject deserves a section of its own, but for now I'll attach a patch that includes the gist of Erick's reply as a Tip in the "indexing in SolrCloud" section.
Attachments