[OAK-5446] leaseUpdateThread might be blocked by leaseUpdateCheck Created: 12/Jan/17 Updated: 26/Jan/17 |
|
| Status: | Open |
| Project: | Jackrabbit Oak |
| Component/s: | core |
| Affects Version/s: | 1.4, 1.5.14 |
| Fix Version/s: | 1.6 |
| Type: | Bug | Priority: | Major |
| Reporter: | Stefan Eissing | Assignee: | Julian Reschke |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | candidate_oak_1_4 | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
| Issue Links: |
| Description |
|
We are fighting with cluster nodes losing their lease and shutting down oak-core in a cloud environment. For reasons unknown at this point, the whole process seems to skip about two minutes of real time. This is a situation from which Oak currently does not recover. Code analysis shows that ClusterNodeInfo is handed the LeaseCheckDocumentStoreWrapper instance to use as its store. This is fatal, since any action renewLease() tries to perform will first invoke performLeaseCheck(). Once the FailureMargin is reached, the lease check stalls the renewLease() thread for 5 retry attempts and then declares the lease lost. ClusterNodeInfo should instead use the "real" DocumentStore, not the wrapped one, IMO. |
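For illustration, a minimal sketch of the failure mode described above (simplified; these are not Oak's actual classes, and all names below are made up): gating every store call behind the lease check stalls the very thread that is supposed to renew the lease.

    // Simplified sketch, not Oak's actual implementation: every store call goes
    // through a lease-check gate, and the renewal path goes through the same gate.
    public class LeaseCheckSketch {

        private static final int MAX_RETRIES = 5;
        private volatile long leaseEndTime = System.currentTimeMillis() + 60_000;

        // every store operation is funnelled through this gate first
        void performLeaseCheck() {
            for (int retry = 0; retry < MAX_RETRIES; retry++) {
                if (System.currentTimeMillis() < leaseEndTime) {
                    return;                 // lease still valid, proceed
                }
                sleepQuietly(1000);         // wait, hoping the lease gets renewed
            }
            // after the retries the lease is declared lost and oak-core shuts down
            throw new IllegalStateException("lease lost");
        }

        // if renewal goes through the wrapped store, it is gated by the same check
        void renewLease() {
            performLeaseCheck();            // blocks here once the margin is reached
            leaseEndTime = System.currentTimeMillis() + 60_000;
        }

        private static void sleepQuietly(long millis) {
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

Once the failure margin is reached, renewLease() never gets past the check, so the lease can only expire.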
| Comments |
| Comment by Julian Reschke [ 12/Jan/17 ] |
|
Would be good to have a unit test (looking now). FWIW, another way to avoid this would be to special case the CLUSTER collection in the LeaseCheckDocumentStoreWrapper. |
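As a rough sketch of that alternative (simplified; the delegate field and the wrapper-internal performLeaseCheck() helper are assumed names, and this is not the actual patch), each delegating method of the wrapper could skip the check for the clusterNodes collection:

    @Override
    public <T extends Document> T find(Collection<T> collection, String key) {
        // assumption: reads of the clusterNodes collection bypass the lease
        // check, so lease renewal can never be blocked by its own check
        if (collection != Collection.CLUSTER_NODES) {
            performLeaseCheck();
        }
        return delegate.find(collection, key);
    }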
| Comment by Marcel Reutegger [ 16/Jan/17 ] |
|
I agree, there should be a test. Looks like |
| Comment by Julian Reschke [ 16/Jan/17 ] |
|
I'll try to come up with a test case. |
| Comment by Julian Reschke [ 16/Jan/17 ] |
|
Hmmm.

public DocumentNodeStore(DocumentMK.Builder builder) {
    this.blobStore = builder.getBlobStore();
    this.statisticsProvider = builder.getStatisticsProvider();
    this.nodeStoreStatsCollector = builder.getNodeStoreStatsCollector();
    if (builder.isUseSimpleRevision()) {
        this.simpleRevisionCounter = new AtomicInteger(0);
    }
    DocumentStore s = builder.getDocumentStore();
    if (builder.getTiming()) {
        s = new TimingDocumentStoreWrapper(s);
    }
    if (builder.getLogging()) {
        s = new LoggingDocumentStoreWrapper(s);
    }
    if (builder.getReadOnlyMode()) {
        s = ReadOnlyDocumentStoreWrapperFactory.getInstance(s);
        readOnlyMode = true;
    } else {
        readOnlyMode = false;
    }
    this.executor = builder.getExecutor();
    this.clock = builder.getClock();
    int cid = builder.getClusterId();
    cid = Integer.getInteger("oak.documentMK.clusterId", cid);
    if (readOnlyMode) {
        clusterNodeInfo = ClusterNodeInfo.getReadOnlyInstance(s);
    } else {
        clusterNodeInfo = ClusterNodeInfo.getInstance(s, cid);
    }
    // TODO we should ensure revisions generated from now on
    // are never "older" than revisions already in the repository for
    // this cluster id
    cid = clusterNodeInfo.getId();
    if (builder.getLeaseCheck()) {
        s = new LeaseCheckDocumentStoreWrapper(s, clusterNodeInfo);
        clusterNodeInfo.setLeaseFailureHandler(builder.getLeaseFailureHandler());
    } else {
        clusterNodeInfo.setLeaseCheckDisabled(true);
    }
So the DocumentStore is passed into ClusterNodeInfo before it gets wrapped into a LeaseCheckDocumentStoreWrapper, no? |
| Comment by Julian Reschke [ 16/Jan/17 ] |
|
But then, the lease update thread also does:

    // then, independently if the lease had to be updated or not, check
    // the status:
    if (nodeStore.updateClusterState()) {
        // then inform the discovery lite listener - if it is registered
        nodeStore.signalClusterStateChange();
    }
and, AFAICT, that indeed would go through the lease checking wrapper... It's not hard to fix this, but it's not entirely clear what the best way is...
Also: is it really intentional that updateClusterState reads all cluster node infos once per second??? Stefan Egli, Marcel Reutegger - feedback appreciated. |
| Comment by Julian Reschke [ 16/Jan/17 ] |
|
Proposed patch. |
| Comment by Stefan Egli [ 25/Jan/17 ] |
I second Julian Reschke's finding that ClusterNodeInfo gets the 'real' DocumentStore, not the lease wrapper. Stefan Eissing, do you disagree?
Good point: that one goes via the LeaseCheckDocumentStoreWrapper, while it probably should not. Agreed.
I would try to come up with the least intrusive change, which would mean leaving the lease wrapper as is (i.e. no whitelist or special-case bypassing). Ideally updateClusterState would be aware of the lease wrapper and use the underlying (real) documentStore instead. Perhaps the DocumentNodeStore could keep track of that real documentStore for use in updateClusterState?
Yes, that is intentional. It is part of discovery lite, where each instance checks the state of all the cluster node infos once a second to notice a crashed instance ASAP. We can discuss whether the 1/second resolution is too much or whether it should be done in a separate thread, but the discovery lite mechanism is based on this. |
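A minimal sketch of that idea (the field name nonLeaseCheckingStore is purely illustrative, and this is not the actual change): the constructor could keep a reference to the store as obtained from the builder, before any wrapping, and updateClusterState() could read the clusterNodes collection through it.

    DocumentStore nonWrapped = builder.getDocumentStore();   // the real store
    DocumentStore s = nonWrapped;
    // ... Timing/Logging/ReadOnly/LeaseCheck wrapping as in the constructor quoted above ...
    this.nonLeaseCheckingStore = nonWrapped;   // illustrative field name

    // updateClusterState() would then query the clusterNodes collection via
    // nonLeaseCheckingStore instead of the lease-checking wrapper, e.g.:
    List<ClusterNodeInfoDocument> infos =
            ClusterNodeInfoDocument.all(nonLeaseCheckingStore);

That way the background read can never be stalled by the lease check.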
| Comment by Stefan Eissing [ 25/Jan/17 ] |
No, I agree. I saw in the logs that the lease update thread used the LeaseWrapper and was blocked, but it was the call in updateClusterState() that caused this. So we all agree that this should be changed. As to how, I have no strong opinion. Perhaps the whole updateClusterState should move into ClusterNodeInfo? |
| Comment by Julian Reschke [ 25/Jan/17 ] |
|
Reminder: there's a patch available. The most important part, IMHO, is to separate the two operations into separate background threads. |
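As an illustrative sketch only (not the attached patch; the scheduling details here are assumed), the two operations could run as independent background tasks, so a stall in the cluster-state read can no longer delay lease renewal:

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2);

    // lease renewal on its own schedule
    scheduler.scheduleWithFixedDelay(new Runnable() {
        public void run() {
            clusterNodeInfo.renewLease();
        }
    }, 1, 1, TimeUnit.SECONDS);

    // discovery-lite cluster state check, independent of lease renewal
    scheduler.scheduleWithFixedDelay(new Runnable() {
        public void run() {
            if (nodeStore.updateClusterState()) {
                nodeStore.signalClusterStateChange();
            }
        }
    }, 1, 1, TimeUnit.SECONDS);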
| Comment by Stefan Egli [ 25/Jan/17 ] |
|
+1 for the patch. Separating the cluster update from the lease update - even though these two operations are about a related concern - is a good thing. (Sorry, I missed reviewing the patch previously.) |
| Comment by Julian Reschke [ 25/Jan/17 ] |
|
Stefan Egli & Marcel Reutegger - and we are OK with this being done before we branch 1.6, right? |
| Comment by Stefan Egli [ 25/Jan/17 ] |
It is a change in a quite central part, so the question is if this is indeed a blocker for 1.6 or if it can't wait... We'd have to |
| Comment by Stefan Egli [ 25/Jan/17 ] |
|
Added a simplistic unit test that covers the retry loop (OAK-5446.testcase |
| Comment by Julian Reschke [ 25/Jan/17 ] |
|
Verified that the test tests the right thing by setting MAX_RETRY_SLEEPS_BEFORE_LEASE_FAILURE = 0 in ClusterNodeInfo (in which case the test fails). Proposal: add the test as part of
We still don't have a test that simulates the issue described in this ticket, though. |
| Comment by Julian Reschke [ 25/Jan/17 ] |
|
Modified test that delays the read from clusterNodes and indeed reproduces the issue. |
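Roughly, such a delaying store can be sketched like this (the class name DelayingDocumentStore is made up and the attached test may do this differently), extending Oak's MemoryDocumentStore so that reads from the clusterNodes collection are slowed down to mimic the observed multi-minute stall:

    import org.apache.jackrabbit.oak.plugins.document.Collection;
    import org.apache.jackrabbit.oak.plugins.document.Document;
    import org.apache.jackrabbit.oak.plugins.document.memory.MemoryDocumentStore;

    public class DelayingDocumentStore extends MemoryDocumentStore {

        private volatile long delayMillis;

        public void setDelayMillis(long delayMillis) {
            this.delayMillis = delayMillis;
        }

        @Override
        public <T extends Document> T find(Collection<T> collection, String key) {
            if (collection == Collection.CLUSTER_NODES && delayMillis > 0) {
                try {
                    // simulate a slow clusterNodes read / VM freeze
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
            return super.find(collection, key);
        }
    }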
| Comment by Stefan Egli [ 25/Jan/17 ] |
|
ack, I'll add the test case to |
| Comment by Stefan Egli [ 25/Jan/17 ] |
even better |
| Comment by Stefan Egli [ 25/Jan/17 ] |
|
just one comment: maybe we should have two flavours of the test: one with the delay and one without - as both cases seem useful. |
| Comment by Julian Reschke [ 25/Jan/17 ] |
|
While the test currently proves that there is a problem, I'm not totally sure that the Thread.sleep is correct here – shouldn't I use the VirtualClock? |
| Comment by Stefan Egli [ 25/Jan/17 ] |
|
I think actually delaying is correct, as a) the background thread doesn't use the virtual clock anyway and b) you want to simulate a 'vm freeze' for which you need to actually sleep. |
| Comment by Julian Reschke [ 26/Jan/17 ] |
|
...guess what: in the proposed patch I copied (instead of moved) the cluster update check... |
| Comment by Stefan Egli [ 26/Jan/17 ] |
|
Oops, yes, that would not have helped then... |
| Comment by Julian Reschke [ 26/Jan/17 ] |
|
See r1780424 - for some reason I currently cannot resolve this ticket, though. |
| Comment by Julian Reschke [ 26/Jan/17 ] |
|
Note I had to clone the issue to work around a JIRA problem. |