Details
- Bug
- Status: Resolved
- Major
- Resolution: Won't Fix
- 0.28.0, 0.28.1, 1.0.1
- None
- None
- OS X 10.11.6
Description
On a Mesos master failover, the reregistered() callback of the Java API is not triggered. Only the registered() callback is triggered, which makes it hard for a framework to distinguish an initial registration from a registration that follows a master failover.
This behaviour has been observed with the ConductR framework using Java API versions 0.28.0, 0.28.1 and 1.0.1. Below are the logs from the master that was re-elected and from the ConductR framework.
Log: Mesos master on a master re-election
I0926 11:44:20.008306 3747840 zookeeper.cpp:259] A new leading master (UPID=master@127.0.0.1:5050) is detected
I0926 11:44:20.008458 3747840 master.cpp:1847] The newly elected leader is master@127.0.0.1:5050 with id ca5b9713-1eec-43e1-9d27-9ebc5c0f95b1
I0926 11:44:20.008484 3747840 master.cpp:1860] Elected as the leading master!
I0926 11:44:20.008498 3747840 master.cpp:1547] Recovering from registrar
I0926 11:44:20.008607 3747840 registrar.cpp:332] Recovering registrar
I0926 11:44:20.016340 4284416 registrar.cpp:365] Successfully fetched the registry (0B) in 7.702016ms
I0926 11:44:20.016393 4284416 registrar.cpp:464] Applied 1 operations in 12us; attempting to update the 'registry'
I0926 11:44:20.021428 4284416 registrar.cpp:509] Successfully updated the 'registry' in 5.019904ms
I0926 11:44:20.021481 4284416 registrar.cpp:395] Successfully recovered registrar
I0926 11:44:20.021611 528384 master.cpp:1655] Recovered 0 agents from the Registry (118B) ; allowing 10mins for agents to re-register
I0926 11:44:20.536859 3747840 master.cpp:2424] Received SUBSCRIBE call for framework 'conductr' at scheduler-3f8b9645-7a17-4e9f-8ad5-077fe8c23b39@192.168.2.106:57164
I0926 11:44:20.536969 3747840 master.cpp:2500] Subscribing framework conductr with checkpointing disabled and capabilities [ ]
I0926 11:44:20.537401 3211264 hierarchical.cpp:271] Added framework conductr
I0926 11:44:20.807895 528384 master.cpp:4787] Re-registering agent b99256c3-6905-44d3-bcc9-0d9e00d20fbe-S0 at slave(1)@127.0.0.1:5051 (127.0.0.1)
I0926 11:44:20.808145 1601536 registrar.cpp:464] Applied 1 operations in 38us; attempting to update the 'registry'
I0926 11:44:20.815757 1601536 registrar.cpp:509] Successfully updated the 'registry' in 7.568896ms
I0926 11:44:20.815992 3747840 master.cpp:7447] Adding task 6abce9bb-895f-4f6f-be5b-25f6bd09f548 with resources mem(*):0 on agent b99256c3-6905-44d3-bcc9-0d9e00d20fbe-S0 (127.0.0.1)
I0926 11:44:20.816339 3747840 master.cpp:4872] Re-registered agent b99256c3-6905-44d3-bcc9-0d9e00d20fbe-S0 at slave(1)@127.0.0.1:5051 (127.0.0.1) with cpus(*):8; mem(*):15360; disk(*):470832; ports(*):[31000-32000]
I0926 11:44:20.816385 1601536 hierarchical.cpp:478] Added agent b99256c3-6905-44d3-bcc9-0d9e00d20fbe-S0 (127.0.0.1) with cpus(*):8; mem(*):15360; disk(*):470832; ports(*):[31000-32000] (allocated: cpus(*):0.9; mem(*):402.653; disk(*):1000; ports(*):[31000-31000, 31001-31500])
I0926 11:44:20.816437 3747840 master.cpp:4940] Sending updated checkpointed resources to agent b99256c3-6905-44d3-bcc9-0d9e00d20fbe-S0 at slave(1)@127.0.0.1:5051 (127.0.0.1)
I0926 11:44:20.816787 4284416 master.cpp:5725] Sending 1 offers to framework conductr (conductr) at scheduler-3f8b9645-7a17-4e9f-8ad5-077fe8c23b39@192.168.2.106:57164
Log: ConductR framework
I0926 11:44:20.007189 66441216 detector.cpp:152] Detected a new leader: (id='87')
I0926 11:44:20.007524 64294912 group.cpp:706] Trying to get '/mesos/json.info_0000000087' in ZooKeeper
I0926 11:44:20.008625 63758336 zookeeper.cpp:259] A new leading master (UPID=master@127.0.0.1:5050) is detected
I0926 11:44:20.008965 63758336 sched.cpp:330] New master detected at master@127.0.0.1:5050
2016-09-26T09:44:20Z MacBook-Pro-6.local INFO MesosSchedulerClient [sourceThread=conductr-akka.actor.default-dispatcher-2, akkaTimestamp=09:44:20.009UTC, akkaSource=akka.tcp://conductr@127.0.0.1:9004/user/reaper/mesos-client-supervisor/singleton/mesos-client, sourceActorSystem=conductr] - Mesos master has been disconnected..
I0926 11:44:20.012472 63758336 sched.cpp:341] No credentials provided. Attempting to register without authentication
I0926 11:44:20.537613 65904640 sched.cpp:743] Framework registered with conductr
2016-09-26T09:44:20Z MacBook-Pro-6.local INFO MesosSchedulerClient [sourceThread=conductr-akka.actor.default-dispatcher-18, akkaTimestamp=09:44:20.538UTC, akkaSource=akka.tcp://conductr@127.0.0.1:9004/user/reaper/mesos-client-supervisor/singleton/mesos-client, sourceActorSystem=conductr] - Mesos master on localhost:5050 has been registered with ConductR framework id: conductr
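The framework log above shows the scheduler receiving a disconnect followed by a plain registration after the failover. One possible way for a framework to tell the two cases apart on its own is to remember whether it has already registered during the lifetime of the scheduler process. The following is only a minimal sketch of that idea, not something taken from the ConductR code base or the Mesos API: the class RegistrationTracker, its Event enum and its method names are hypothetical, and the methods are assumed to be called from the framework's implementations of the Scheduler callbacks registered() and reregistered().

```java
// Hypothetical helper, not part of the Mesos Java API.
// Assumption: onRegistered() is called from Scheduler.registered(...) and
// onReregistered() from Scheduler.reregistered(...).
final class RegistrationTracker {

    enum Event {
        INITIAL_REGISTRATION,        // first registered() seen by this process
        REGISTRATION_AFTER_FAILOVER, // registered() seen again, e.g. after a master failover
        REREGISTRATION               // reregistered() was delivered
    }

    // True once any registration callback has been seen by this process.
    private boolean registeredOnce = false;

    // Classifies a registered() callback based on earlier callbacks.
    synchronized Event onRegistered() {
        Event event = registeredOnce
                ? Event.REGISTRATION_AFTER_FAILOVER
                : Event.INITIAL_REGISTRATION;
        registeredOnce = true;
        return event;
    }

    // Classifies a reregistered() callback; always a re-registration.
    synchronized Event onReregistered() {
        registeredOnce = true;
        return Event.REREGISTRATION;
    }

    public static void main(String[] args) {
        RegistrationTracker tracker = new RegistrationTracker();
        // Simulates the sequence from the logs: an initial registration,
        // then another registered() callback after the master failover.
        System.out.println(tracker.onRegistered());
        System.out.println(tracker.onRegistered());
    }
}
```

Note that this state lives only in memory, so a restart of the scheduler process would again be classified as an initial registration. A framework that needs to survive its own restarts would have to persist the flag (or its framework ID) somewhere durable.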
Issue Links
- relates to MESOS-786 Update semantics of when framework registered()/reregistered() get called (Resolved)