Details
- Improvement
- Status: Resolved
- Major
- Resolution: Fixed
- 0.28.2
- Observed in an OpenStack environment where each master lives on a separate VM.
- Mesosphere Sprint 38
- 5
Description
We observed the following situation in a cluster of five masters:
Time | Master 1 | Master 2 | Master 3 | Master 4 | Master 5 |
---|---|---|---|---|---|
0 | Follower | Follower | Follower | Follower | Leader |
1 | Follower | Follower | Follower | Follower | Partitioned from cluster by downing this VM's network |
2 | Elected Leader by ZK | Voting | Voting | Voting | Suicides due to lost leadership |
3 | Performs consensus | Replies to leader | Replies to leader | Replies to leader | Still down |
4 | Performs writing | Acks to leader | Acks to leader | Acks to leader | Still down |
5 | Leader | Follower | Follower | Follower | Still down |
6 | Leader | Follower | Follower | Follower | Comes back up |
7 | Leader | Follower | Follower | Follower | Follower |
8 | Partitioned in the same way as Master 5 | Follower | Follower | Follower | Follower |
9 | Suicides due to lost leadership | Elected Leader by ZK | Follower | Follower | Follower |
10 | Still down | Performs consensus | Replies to leader | Replies to leader | Doesn't get the message! |
11 | Still down | Performs writing | Acks to leader | Acks to leader | Acks to leader |
12 | Still down | Leader | Follower | Follower | Follower |
Master 2 sends a series of messages to the recently-restarted Master 5. The first message is dropped, but subsequent messages are not dropped.
This appears to be due to a stale link between the masters. Before leader election, the replicated log actors create a network watcher, which adds links to masters that join the ZK group:
https://github.com/apache/mesos/blob/7a23d0da817be4e8f68d96f524cecf802431033c/src/log/network.hpp#L157-L159
The link from Master 2 to Master 5 does not appear to break when Master 5 goes down, perhaps because the network partition was induced in the hypervisor layer rather than inside the VM itself.
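The linking behavior described above can be sketched as a toy model. This is only an illustration under stated assumptions, not the code in `network.hpp`: the `Network` class, its `on_membership_change` method, and the string peer names are all hypothetical. It shows a watcher that adds a link for every member that joins the group and never drops a link when a member leaves, which is how a link to a partitioned master could outlive that master.

```python
class Network:
    """Toy stand-in for the replicated log's view of its peers (hypothetical)."""

    def __init__(self):
        self.members = set()  # peers currently in the ZK group
        self.links = set()    # peers we hold a link (cached socket) to

    def on_membership_change(self, current):
        """Called by the (hypothetical) watcher with the new group membership."""
        current = set(current)
        for peer in current - self.members:
            # Link to every newly joined member.
            self.links.add(peer)
        # Assumption: nothing removes links for departed members.
        self.members = current


net = Network()
net.on_membership_change({"master2", "master5"})  # both masters join
net.on_membership_change({"master2"})             # master5 drops out of the group
stale = net.links - net.members                   # links with no live member behind them
```

In this model `stale` ends up containing `master5`: the membership has moved on, but the link is still cached.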
When Master 2 tries to send a PromiseRequest to Master 5, we do not observe the expected log message.
Instead, we see a log line in Master 2:
process.cpp:2040] Failed to shutdown socket with fd 27: Transport endpoint is not connected
The broken link is removed by the libprocess socket_manager and the following WriteRequest from Master 2 to Master 5 succeeds via a new socket.
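The drop-then-recover sequence can be reproduced in a small simulation. This is a sketch under assumptions, not libprocess: `Peer`, `SocketManager`, and the `generation` counter are hypothetical names, and a "stale" socket is modeled simply as one opened against an earlier incarnation of the peer. The first send on the stale link fails and is lost, the link is discarded, and the next send opens a fresh connection and is delivered.

```python
class Peer:
    """Hypothetical remote master; `generation` changes on every restart."""

    def __init__(self, name):
        self.name = name
        self.generation = 0
        self.inbox = []

    def restart(self):
        self.generation += 1
        self.inbox = []


class SocketManager:
    """Toy per-peer link cache (an illustration, not the libprocess API)."""

    def __init__(self):
        # peer name -> generation of the peer the cached socket was opened against
        self.links = {}

    def link(self, peer):
        self.links.setdefault(peer.name, peer.generation)

    def send(self, peer, message):
        cached = self.links.get(peer.name)
        if cached is not None and cached != peer.generation:
            # Stale socket: the write fails, the message is lost,
            # and the broken link is removed from the cache.
            del self.links[peer.name]
            return False
        # No link or a healthy link: (re)connect and deliver.
        self.links[peer.name] = peer.generation
        peer.inbox.append(message)
        return True


master5 = Peer("master5")
manager = SocketManager()      # plays the role of Master 2's socket cache
manager.link(master5)          # link created before the partition
master5.restart()              # Master 5 goes down and comes back up

results = [
    manager.send(master5, "PromiseRequest"),  # sent over the stale link
    manager.send(master5, "WriteRequest"),    # sent over a new socket
]
```

Here `results` is `[False, True]` and Master 5's inbox holds only the `WriteRequest`, matching rows 10 and 11 of the table. The model makes no attempt to retry the lost message, since the description does not say that one occurs.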
Issue Links
- is related to
  - MESOS-5364 Consider adding `unlink` functionality to libprocess (Resolved)
- relates to
  - MESOS-5832 Mesos replicated log corruption with disconnects from ZK (Resolved)
  - MESOS-5740 Consider adding `relink` functionality to libprocess (Resolved)