Details
- Task
- Status: Closed
- Major
- Resolution: Fixed
- 1.2
Description
Currently, in an oak-cluster, when (e.g.) one oak-client stops renewing its lease (ClusterNodeInfo.renewLease()), the others in the same oak-cluster will eventually notice. They then mark this client as inactive, start recovering it, and subsequently exclude that node from any further operations such as merges.
Whatever the reason the client stopped renewing its lease (an exception, a deadlock, or anything else), the client still considers itself active and continues to take part in cluster activity.
This results in an unbalanced situation where that one client 'sees' everybody as active while the others see it as inactive.
If the ClusterNodeInfo state is to be something that can be built upon, and to avoid inconsistencies caused by this unbalanced handling, the inactive node should probably retire gracefully - or some other appropriate action should be taken, rather than just continuing as it does today.
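One way to picture the "retire gracefully" idea is a local guard that the client consults before taking part in any cluster operation. The following is only a minimal sketch under stated assumptions: the class `LeaseGuard`, its methods `checkLease()` and `isRetired()`, and the injected clock are hypothetical names for illustration and are not Oak's actual API. Only the notion of renewing a lease (`renewLease()`) comes from the description above.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical illustration (not Oak code): tracks this node's own lease
 * and refuses further work once the lease has lapsed.
 */
final class LeaseGuard {
    private final long leaseDurationMillis;
    private final LongSupplier clock; // injected so the behaviour can be tested
    private long leaseEndMillis;
    private boolean retired;

    LeaseGuard(long leaseDurationMillis, LongSupplier clock) {
        this.leaseDurationMillis = leaseDurationMillis;
        this.clock = clock;
        this.leaseEndMillis = clock.getAsLong() + leaseDurationMillis;
    }

    /** Extends the lease; returns false if the lease had already lapsed. */
    synchronized boolean renewLease() {
        long now = clock.getAsLong();
        if (retired || now >= leaseEndMillis) {
            retired = true; // too late: other nodes may already be recovering us
            return false;
        }
        leaseEndMillis = now + leaseDurationMillis;
        return true;
    }

    /** Call before any cluster operation; throws once the lease has lapsed. */
    synchronized void checkLease() {
        if (retired || clock.getAsLong() >= leaseEndMillis) {
            retired = true;
            throw new IllegalStateException(
                "lease expired; this node must stop participating in the cluster");
        }
    }

    synchronized boolean isRetired() {
        return retired;
    }
}
```

With such a guard, a client whose renewal thread has died would fail its next `checkLease()` call instead of silently continuing, so its view of itself matches the view the other nodes have of it. What the node does after that (stopping, restarting, or something else) is left open here.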
This ticket is to keep track of ideas and actions taken wrt this.
Attachments
Issue Links
- blocks
  - OAK-2844 Introducing a simple document-based discovery-light service (to circumvent documentMk's eventual consistency delays) (Closed)
  - SLING-4603 discovery.oak: oak-based discovery implementation (Closed)
- is blocked by
  - OAK-2681 Update lease without holding lock (Closed)
  - OAK-2682 Introduce time difference detection for DocumentNodeStore (Closed)
- is related to
  - OAK-3390 Avoid instanceof check in DocumentNodeStore (Closed)
  - OAK-3236 integration test that simulates influence of clock drift (Open)
- relates to
  - OAK-3238 fine tune clock-sync check vs lease-check settings (Closed)
  - OAK-3250 Restart DocumentNodeStore on lease timeout instead of bundle.stop (was: System.exit) (Open)