Description
As part of OAK-1429, a number of improvements were implemented, but one issue remains when a node state diff is performed on older revisions.
The DocumentNodeStore keeps a modified timestamp on each document and updates it whenever the document is explicitly modified or implicitly when a descendant document is updated. With this timestamp the store is able to tell when a subtree was last modified. The diff implementation gets inefficient when the two revisions to compare are older than the modified timestamp of a document tree. In this case the implementation tends to read many more nodes than were actually modified because it cannot exactly tell when a subtree was modified.
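The timestamp behaviour described above can be illustrated with a minimal sketch. It is not the DocumentNodeStore implementation: the class `ModifiedTimestampSketch`, its methods, and the use of plain `long` times instead of revisions are all hypothetical and only model the idea.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of the modified timestamp; not Oak code. */
class ModifiedTimestampSketch {

    // path -> time of the last modification in the subtree rooted at path
    private final Map<String, Long> modified = new HashMap<>();

    /** Records a change at path: explicit on the node, implicit on all ancestors. */
    void commit(String path, long time) {
        for (String p = path; ; p = parent(p)) {
            modified.merge(p, time, Long::max);
            if (p.equals("/")) {
                break;
            }
        }
    }

    static String parent(String path) {
        int idx = path.lastIndexOf('/');
        return idx <= 0 ? "/" : path.substring(0, idx);
    }

    /**
     * A diff starting at fromTime can skip a subtree only if the subtree was
     * last modified at or before fromTime. If the timestamp is newer than
     * both revisions, the change may or may not fall between them, so the
     * subtree has to be read.
     */
    boolean canSkip(String path, long fromTime) {
        Long m = modified.get(path);
        return m == null || m <= fromTime;
    }
}
```

For example, after a change to `/c` at time 50, a diff between times 20 and 30 cannot skip `/c` even though nothing below it changed in that range. This is the over-reading described above.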
Improvements from OAK-1394 and OAK-1429 helped quite a bit because the diff cache in the DocumentNodeStore is proactively filled by the commits. However, in addition to the observation listeners that perform diffs, there is also the async index update, which periodically performs a diff. These diffs usually go further back in time; they are the ones that are inefficient and that also have a negative impact on the diff cache.
A solution to this problem was already discussed in a recent Oak conf call: the DocumentNodeStore would keep a journal of commits and use it to answer node state diff calls. With this journal the store should also be able to diff efficiently across multiple commits. A number of options were discussed, such as whether to implement the journal with a local file or a capped MongoDB collection.
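A rough sketch of how such a journal could answer a diff across several commits follows. It assumes each journal entry stores the set of paths changed by one commit, keyed by a monotonically increasing revision number. The class `CommitJournalSketch` and its methods are hypothetical, and the storage question (local file or capped MongoDB collection) is left out by keeping everything in memory.

```java
import java.util.Collection;
import java.util.NavigableMap;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical in-memory commit journal; not Oak code. */
class CommitJournalSketch {

    // revision -> paths changed by the commit with that revision
    private final NavigableMap<Long, Set<String>> entries = new TreeMap<>();

    /** Appends the paths changed by a commit to the journal. */
    void record(long revision, Collection<String> changedPaths) {
        entries.computeIfAbsent(revision, r -> new TreeSet<>()).addAll(changedPaths);
    }

    /**
     * Returns the paths touched by commits in the range (from, to].
     * Requires from <= to. Only these candidate paths need to be compared;
     * a path that was changed and later reverted still shows up here.
     */
    SortedSet<String> changedBetween(long from, long to) {
        SortedSet<String> result = new TreeSet<>();
        for (Set<String> paths : entries.subMap(from, false, to, true).values()) {
            result.addAll(paths);
        }
        return result;
    }
}
```

With a structure like this, the cost of a diff depends on the number of commits between the two revisions rather than on the size of the subtrees whose modified timestamp is newer than both.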
Ideas for alternative solutions are welcome...
Issue Links
- relates to: OAK-1429 Slow event listeners do not scale as expected (Closed)