Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
Consider a scenario with 2 cluster nodes (running DocumentNodeStore):
- cluster node A with cluster node id 1
- cluster node B with cluster node id 2
Now cluster node A does a merge that includes changes (e.g. a property) on root. Such a merge consists of two updates to the DocumentStore, followed by the usual background operations:
- step 1: the first update writes the actual property changes (it also covers any documents other than root, but that is not relevant here). Say this happens with revision "rn-0-1".
- step 2: the second update marks the revision as committed by adding the entry "rn-0-1" : "c" to "_revisions".
- step 3: after the merge, the usual backgroundWrite (on A) and backgroundRead (on B) follow.
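The two updates of the merge can be illustrated with a minimal sketch. It is not Oak's actual NodeDocument code: the class `RootDocumentSketch` and its methods are hypothetical names, and the document is modelled as two plain maps purely to show why a reader can observe the change before its commit entry exists.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the root document (assumption, not Oak's API).
class RootDocumentSketch {
    // property name -> (revision -> value)
    final Map<String, Map<String, String>> properties = new HashMap<>();
    // "_revisions": revision -> commit value, where "c" means committed
    final Map<String, String> revisions = new HashMap<>();

    // First update: write the property change under the given revision.
    void writeChange(String property, String revision, String value) {
        properties.computeIfAbsent(property, k -> new HashMap<>()).put(revision, value);
    }

    // Second update: mark the revision as committed.
    void markCommitted(String revision) {
        revisions.put(revision, "c");
    }

    // A revision without a commit value is still unmerged.
    boolean isCommitted(String revision) {
        return "c".equals(revisions.get(revision));
    }
}
```

Between `writeChange` and `markCommitted`, a reader sees the value for "rn-0-1" but no commit value for it, which corresponds to the unmerged state.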
At some point cluster node B reads the root node and does a getNodeAtRevision. The behavior slightly differs between the two cases:
- after step 1, B reads "rn-0-1" in an unmerged state (it has no commit value yet)
- after step 2, B reads the revision as committed but not yet visible to B
Either way, the value for this revision resolves to null. As a result, B greedily reads through the previous documents looking for a split-away property value, only to find nothing. If there are many previous documents, which is likely on root, this is a significant performance hit.
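The cost of that fallback can be sketched as follows. This is a simplified model under stated assumptions, not Oak's getNodeAtRevision: `PreviousDocScanSketch`, its fields, and the `visible` set (standing in for what B has picked up via backgroundRead) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical reader-side model of one property on the root document.
class PreviousDocScanSketch {
    // "_revisions" of the main document: revision -> commit value
    final Map<String, String> revisions = new HashMap<>();
    // values of the property in the main document: revision -> value
    final Map<String, String> localValues = new HashMap<>();
    // previous (split) documents, each holding revision -> value
    final List<Map<String, String>> previousDocs = new ArrayList<>();
    // revisions the reader can already see (assumed to be updated by backgroundRead)
    final Set<String> visible = new HashSet<>();
    // counts how many previous documents had to be read
    int previousDocsRead = 0;

    String read() {
        // Only a committed and visible revision in the main document resolves to a value.
        for (Map.Entry<String, String> e : localValues.entrySet()) {
            String rev = e.getKey();
            if ("c".equals(revisions.get(rev)) && visible.contains(rev)) {
                return e.getValue();
            }
        }
        // Otherwise fall back to scanning every previous document.
        for (Map<String, String> prev : previousDocs) {
            previousDocsRead++;
            for (Map.Entry<String, String> e : prev.entrySet()) {
                if (visible.contains(e.getKey())) {
                    return e.getValue();
                }
            }
        }
        return null;
    }
}
```

With many previous documents that do not contain the property, every read pays for the full scan until the revision becomes both committed and visible.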
This situation persists until a full backgroundWrite (on A) and backgroundRead (on B) have completed, i.e. until step 3 above is done.
Now backgroundRead requires the exclusive lock of the backgroundOperationLock as part of updating to a fresh root. If any other thread on B currently holds that lock, backgroundRead has to wait.
A regular merge, however, acquires the "read/non-exclusive" lock of backgroundOperationLock.
If there are a number of threads "ahead" of backgroundRead that all hold the read lock of backgroundOperationLock, backgroundRead has to wait until all of their commits are done.
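The lock interaction can be sketched with `java.util.concurrent.locks.ReentrantReadWriteLock`. This is an illustration under assumptions: that backgroundOperationLock behaves like a standard read/write lock, and the class `BackgroundLockSketch` with its methods is hypothetical. It uses `tryLock` instead of a blocking `lock` so the effect is observable without waiting.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical model of merges (read lock) versus backgroundRead (write lock).
class BackgroundLockSketch {
    final ReentrantReadWriteLock backgroundOperationLock = new ReentrantReadWriteLock();

    // A merge holds the non-exclusive read lock for its duration.
    void beginMerge() {
        backgroundOperationLock.readLock().lock();
    }

    void endMerge() {
        backgroundOperationLock.readLock().unlock();
    }

    // backgroundRead needs the exclusive write lock; returns false if it would have to wait.
    boolean tryBackgroundRead() {
        if (!backgroundOperationLock.writeLock().tryLock()) {
            return false;
        }
        try {
            // the root would be updated here
            return true;
        } finally {
            backgroundOperationLock.writeLock().unlock();
        }
    }
}
```

As long as at least one merge still holds the read lock, `tryBackgroundRead` returns false; a blocking acquisition would wait at that point, and slow merges extend the wait.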
So if B is in such a situation, and every merge that updates root goes through the previous documents, the result can be a significant overall delay.
(This issue can also happen on any other node, but it is only a problem if there are many previous documents. Root usually does have many, hence the problem shows up primarily on root.)