Details
- Type: New Feature
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
As discussed in OAK-2683, "hitting the observation queue limit" has multiple drawbacks. Quite a bit of work has been done to make diff generation faster. However, there is still a chance of the event queue filling up.
This issue is meant to implement a persistent event journal. The idea is:
- The NodeStore would push the diff into a persistent store via a synchronous observer
- Observers which are meant to handle such events asynchronously (by virtue of being wrapped in a BackgroundObserver) would instead pull the events from this persisted journal
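The push/pull split described above can be sketched roughly as follows. This is only an in-memory illustration, not Oak code: `EventJournal` and `PullingListener` are hypothetical names, the change record is a plain string, and persistence is left out entirely.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical journal: an append-only sequence of change records. */
class EventJournal {
    private final List<String> entries = new ArrayList<>();

    /** Would be called on the commit thread by the synchronous observer. */
    synchronized long append(String changeRecord) {
        entries.add(changeRecord);
        return entries.size() - 1; // position of the new entry
    }

    /** Returns up to max entries starting at position from. */
    synchronized List<String> read(long from, int max) {
        int start = (int) Math.min(from, entries.size());
        int end = Math.min(start + max, entries.size());
        return new ArrayList<>(entries.subList(start, end));
    }
}

/** Hypothetical listener that pulls at its own pace instead of being fed a queue. */
class PullingListener {
    private final EventJournal journal;
    private long position = 0;
    final List<String> handled = new ArrayList<>();

    PullingListener(EventJournal journal) {
        this.journal = journal;
    }

    /** Pulls at most batchSize entries and returns how many were processed. */
    int poll(int batchSize) {
        List<String> batch = journal.read(position, batchSize);
        handled.addAll(batch); // stand-in for diff generation and event delivery
        position += batch.size();
        return batch.size();
    }
}
```

Because each listener only holds a position, a slow listener does not force the writer to buffer anything on its behalf in memory.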
A - What is persisted
1 - Serialized Root States and CommitInfo
In this approach we just persist the root states in serialized form.
- DocumentNodeStore - This means storing the root revision vector
- SegmentNodeStore - Q1: What does the serialized form of a SegmentNodeStore root state look like? Possibly the RecordId of the "root" state
Note that with OAK-4528, DocumentNodeStore can rely on the persisted remote journal to determine the affected paths, which reduces the need to persist the complete diff locally.
The event generation logic would then "deserialize" the persisted root states and generate the diff as is currently done, via NodeState comparison
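A minimal sketch of what one journal entry could look like under this approach. Everything here is an assumption for illustration: `RootStateEntry` is a hypothetical class, the root references are treated as opaque strings (a revision vector or a RecordId rendered as text), and the `sessionId`, `userId` and `date` fields merely stand in for whatever parts of CommitInfo turn out to be serializable.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical journal entry: opaque before/after root references plus commit metadata. */
final class RootStateEntry {
    final String beforeRoot;
    final String afterRoot;
    final String sessionId;
    final String userId;
    final long date;

    RootStateEntry(String beforeRoot, String afterRoot, String sessionId, String userId, long date) {
        this.beforeRoot = beforeRoot;
        this.afterRoot = afterRoot;
        this.sessionId = sessionId;
        this.userId = userId;
        this.date = date;
    }

    /** One line per entry; string fields are Base64 encoded so separators cannot collide. */
    String serialize() {
        return String.join(" ", enc(beforeRoot), enc(afterRoot), enc(sessionId), enc(userId),
                Long.toString(date));
    }

    static RootStateEntry deserialize(String line) {
        String[] parts = line.split(" ", -1);
        if (parts.length != 5) {
            throw new IllegalArgumentException("Malformed journal entry: " + line);
        }
        return new RootStateEntry(dec(parts[0]), dec(parts[1]), dec(parts[2]), dec(parts[3]),
                Long.parseLong(parts[4]));
    }

    private static String enc(String s) {
        return Base64.getUrlEncoder().encodeToString(s.getBytes(StandardCharsets.UTF_8));
    }

    private static String dec(String s) {
        return new String(Base64.getUrlDecoder().decode(s), StandardCharsets.UTF_8);
    }
}
```

The reader side would still have to turn `beforeRoot` and `afterRoot` back into NodeState instances through the respective NodeStore, which is exactly the part Q1 leaves open for SegmentNodeStore.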
2 - Serialized commit diff and CommitInfo
In this approach we can save the diff in JSOP form. The diff only contains information about the affected paths, similar to what is currently stored in the DocumentNodeStore journal
CommitInfo
The commit info would also need to be serialized. So it needs to be ensured that whatever is stored there can be serialized or recalculated
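To make the "affected paths only" idea concrete, here is a small sketch that folds changed paths into a tree and prints it as a compact JSON-like string. `AffectedPaths` is a hypothetical helper, and the output layout is an assumption chosen for readability; it does not reproduce the exact JSOP format of the DocumentNodeStore journal.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical tree of changed paths; stores names only, no property values. */
class AffectedPaths {
    private final Map<String, AffectedPaths> children = new TreeMap<>();

    /** Records an absolute path such as /content/site/page as changed. */
    void add(String path) {
        AffectedPaths node = this;
        for (String name : path.split("/")) {
            if (name.isEmpty()) {
                continue; // skip the empty segment before the leading slash
            }
            node = node.children.computeIfAbsent(name, k -> new AffectedPaths());
        }
    }

    /** Serializes the tree as nested JSON objects, children sorted by name. */
    String toJson() {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, AffectedPaths> e : children.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            String name = e.getKey().replace("\\", "\\\\").replace("\"", "\\\"");
            sb.append('"').append(name).append("\":").append(e.getValue().toJson());
        }
        return sb.append('}').toString();
    }
}
```

Shared parents are stored once, so a commit touching many siblings stays compact, at the cost of the listener having to re-read the actual node states to build full events.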
B - How it is persisted
1 - Use a secondary segment NodeStore
OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. mreutegg suggested that for the persisted local journal we can also use a SegmentNodeStore instance. Care needs to be taken with compaction, either via a generation-based approach or by relying on online compaction
2 - Make use of write-ahead log implementations
ianeboston suggested that we can make use of an existing write-ahead log implementation such as [1], [2] or [3]
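For orientation, the core of a write-ahead log can be sketched in a few lines. This is a deliberately naive illustration using only the JDK, not the API of any of the libraries in [1], [2] or [3]; `SimpleWal` is a hypothetical name, and there are no checksums, no segment rollover and no truncation of consumed entries.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.Closeable;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical append-only log of length-prefixed records. */
class SimpleWal implements Closeable {
    private final FileOutputStream file;
    private final DataOutputStream out;

    SimpleWal(Path path) throws IOException {
        file = new FileOutputStream(path.toFile(), true); // open in append mode
        out = new DataOutputStream(new BufferedOutputStream(file));
    }

    /** Appends one record and forces it to disk before returning. */
    void append(byte[] record) throws IOException {
        out.writeInt(record.length);
        out.write(record);
        out.flush();
        file.getFD().sync();
    }

    /** Reads all complete records; an incomplete record at the tail is ignored. */
    static List<byte[]> replay(Path path) throws IOException {
        List<byte[]> records = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(path)))) {
            while (true) {
                try {
                    int length = in.readInt();
                    if (length < 0) {
                        break; // corrupt length prefix, stop replaying
                    }
                    byte[] buffer = new byte[length];
                    in.readFully(buffer);
                    records.add(buffer);
                } catch (EOFException e) {
                    break; // end of log, or a record cut short by a crash
                }
            }
        }
        return records;
    }

    @Override
    public void close() throws IOException {
        out.close();
    }
}
```

Syncing on every append keeps the sketch simple but would add latency to each commit, since the observer runs synchronously; a real implementation would likely batch syncs.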
C - How changes get pulled
Some points to consider for the event generation logic:
- We would need a way to keep a pointer to a journal entry on a per-listener basis. This would allow each listener to "pull" content changes and generate the diff at its own speed, while keeping the in-memory overhead low
- The journal should survive restarts
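One way to picture the per-listener pointers surviving a restart is a small persisted map from listener id to journal position. The sketch below is an assumption-laden illustration: `ListenerCursors` is a hypothetical class, positions are plain longs, and the map is kept in a `java.util.Properties` file that is rewritten on every update.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Properties;

/** Hypothetical store of per-listener journal positions, reloaded on startup. */
class ListenerCursors {
    private final Path file;
    private final Properties positions = new Properties();

    ListenerCursors(Path file) throws IOException {
        this.file = file;
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                positions.load(in); // restore positions saved before the restart
            }
        }
    }

    /** Position the listener should resume from; 0 for a listener seen for the first time. */
    long get(String listenerId) {
        return Long.parseLong(positions.getProperty(listenerId, "0"));
    }

    /** Records progress; writes a temp file first, then replaces the previous one. */
    void advance(String listenerId, long newPosition) throws IOException {
        positions.setProperty(listenerId, Long.toString(newPosition));
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        try (OutputStream out = Files.newOutputStream(tmp)) {
            positions.store(out, null);
        }
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

The smallest position across all listeners would also tell the journal which entries are safe to discard, though that cleanup is not shown here.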
[1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html
[2] https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal
[3] https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog
Attachments
Issue Links
- is blocked by
  - OAK-4655 Enable configuring multiple segment nodestore instances in same setup (Closed)
- is related to
  - OAK-4853 User defined property values in event info (Resolved)
  - OAK-4655 Enable configuring multiple segment nodestore instances in same setup (Closed)
  - OAK-2683 the "hitting the observation queue limit" problem (Open)
  - SLING-6070 Reduce temporary storage required in JcrResourceListener, call reportChanges earlier (Closed)
- relates to
  - SLING-6056 achieve 1:1 mapping between observation and resource change listener (Closed)
- supercedes
  - OAK-1368 Only one Observer per session (Resolved)