[MESOS-7389] Mesos 1.2.0 crashes with pre-1.0 Mesos agents. - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Affects Version/s: 1.2.0
Fix Version/s: 1.2.1, 1.3.0, 1.4.0
Component/s: None
Labels:
- mesosphere
Environment:

Ubuntu 14.04

Target Version/s:

1.2.1, 1.3.0, 1.4.0

Description

During an upgrade from 1.0.1 to 1.2.0, a single mesos-slave reregistering with the running leader caused the leader to terminate. All 3 masters suffered the same failure in turn, as the same slave node reregistered against each newly elected leader. This continued across the entire cluster until the offending slave node was removed and fixed. The fix for the slave node was to remove its mesos directory and then start the slave back up.

F0412 17:24:42.736600 6317 master.cpp:5701] Check failed: frameworks_.contains(task.framework_id())

*** Check failure stack trace: ***
    @ 0x7f59f944f94d google::LogMessage::Fail()
    @ 0x7f59f945177d google::LogMessage::SendToLog()
    @ 0x7f59f944f53c google::LogMessage::Flush()
    @ 0x7f59f9452079 google::LogMessageFatal::~LogMessageFatal()
    I0412 17:24:42.750300 6316 replica.cpp:693] Replica received learned notice for position 6896 from @0.0.0.0:0
    @ 0x7f59f88f2341 mesos::internal::master::Master::_reregisterSlave()
    @ 0x7f59f88f488f ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchIN5mesos8internal6master6MasterERKNS5_9SlaveInfoERKNS0_4UPIDERKSt6vectorINS5_8ResourceESaISG_EERKSF_INS5_12ExecutorInfoESaISL_EERKSF_INS5_4TaskESaISQ_EERKSF_INS5_13FrameworkInfoESaISV_EERKSF_INS6_17Archive_FrameworkESaIS10_EERKSsRKSF_INS5_20SlaveInfo_CapabilityESaIS17_EERKNS0_6FutureIbEES9_SC_SI_SN_SS_SX_S12_SsS19_S1D_EEvRKNS0_3PIDIT_EEMS1H_FvT0_T1_T2_T3_T4_T5_T6_T7_T8_T9_ET10_T11_T12_T13_T14_T15_T16_T17_T18_T19_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2
    @ 0x7f59f93c3eb1 process::ProcessManager::resume()
    @ 0x7f59f93ccd57 _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEE6_M_runEv
    @ 0x7f59f77cfa60 (unknown)
    @ 0x7f59f6fec184 start_thread
    @ 0x7f59f6d19bed (unknown)
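
Read literally, the failed check requires that a collection named `frameworks_` contain the framework ID of every task handled in `Master::_reregisterSlave()`, and the master aborts when one does not. The sketch below illustrates that kind of invariant next to a tolerant alternative that collects the offending tasks instead of aborting. It is only an illustration under stated assumptions: the `Task` struct and both function names are hypothetical, and this is neither Mesos code nor the fix that was applied for this issue.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a task reported by a reregistering agent.
struct Task {
  std::string name;
  std::string framework_id;
};

// Strict variant (hypothetical): aborts, in builds where assertions are
// enabled, if any task references a framework ID that is not in `known`.
// This mirrors the shape of the failed check in the log above.
void requireKnownFrameworks(
    const std::unordered_set<std::string>& known,
    const std::vector<Task>& tasks) {
  for (const Task& task : tasks) {
    assert(known.count(task.framework_id) == 1);
  }
}

// Tolerant variant (hypothetical): returns the tasks whose framework ID is
// not in `known`, so the caller can log or reject them without terminating.
std::vector<Task> tasksWithUnknownFramework(
    const std::unordered_set<std::string>& known,
    const std::vector<Task>& tasks) {
  std::vector<Task> unknown;
  for (const Task& task : tasks) {
    if (known.count(task.framework_id) == 0) {
      unknown.push_back(task);
    }
  }
  return unknown;
}
```

With a known set containing only `fw-1` and tasks belonging to `fw-1` and `fw-2`, the tolerant variant returns just the `fw-2` task, whereas the strict variant would abort the process on it.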

Issue Links

is related to

MESOS-1987 Add support for SemVer build and prerelease labels to stout.

Resolved

MESOS-6975 Prevent pre-1.0 agents from registering with 1.3+ master.

Resolved

People

Assignee: Neil Conway

Reporter: Nicholas Studt

Votes: 0

Watchers: 7

Dates

Created: 13/Apr/17 16:33

Updated: 28/Jul/17 23:54

Resolved: 05/May/17 23:52