Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 1.4.0
- None
- Important
Description
Proposed fix: https://github.com/apache/mesos/pull/248
I observed this in my environment, where two frameworks used the same ExecutorID and I then triggered a master failover. The master refuses to re-register the slave because it does not take the owning framework into account when checking ExecutorID uniqueness, so it concludes (incorrectly) that there is an erroneous duplicate ExecutorID:
W1103 00:33:42.509891 19638 master.cpp:6008] Dropping re-registration of agent at slave(1)@10.2.0.7:5051 because it sent an invalid re-registration: Executor has a duplicate ExecutorID 'default'
(yes, "default" is probably a terrible name for an ExecutorID - that's a separate discussion!)
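To illustrate the difference between the two uniqueness checks, here is a minimal standalone sketch. It is not the Mesos master code: the `Executor` struct and both function names are hypothetical, and the real validation in master.cpp may be structured quite differently. The only point is that keying on the ExecutorID alone flags two executors from different frameworks as duplicates, while keying on the (FrameworkID, ExecutorID) pair does not.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the executor information an agent sends when it
// re-registers. This is NOT the Mesos ExecutorInfo protobuf.
struct Executor {
  std::string frameworkId;
  std::string executorId;
};

// Check as described in this report: uniqueness is keyed on the ExecutorID
// alone, so two frameworks that both use "default" look like a duplicate.
bool hasDuplicateByExecutorIdOnly(const std::vector<Executor>& executors) {
  std::set<std::string> seen;
  for (const Executor& e : executors) {
    // insert() returns {iterator, inserted}; inserted == false means the
    // key was already present.
    if (!seen.insert(e.executorId).second) {
      return true;
    }
  }
  return false;
}

// Assumed shape of a corrected check: uniqueness is keyed on the
// (FrameworkID, ExecutorID) pair, so the same ExecutorID may appear once
// per framework.
bool hasDuplicateByFrameworkAndExecutorId(
    const std::vector<Executor>& executors) {
  std::set<std::pair<std::string, std::string>> seen;
  for (const Executor& e : executors) {
    if (!seen.emplace(e.frameworkId, e.executorId).second) {
      return true;
    }
  }
  return false;
}
```

With two executors `{"framework-A", "default"}` and `{"framework-B", "default"}`, the first function reports a duplicate and the second does not, which matches the spurious rejection in the log line above.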
/cc neilc