Details
- Bug
- Status: Resolved
- Blocker
- Resolution: Fixed
- 0.27.0
- None
- Mesosphere Sprint 30
Description
This showed up on ASF CI for MasterMaintenanceTest.PendingUnavailabilityTest:
I0229 11:08:57.027559 668 hierarchical.cpp:1437] No resources available to allocate!
I0229 11:08:57.027745 668 hierarchical.cpp:1150] Performed allocation for slave fd39ca89-d7fd-4df8-ad50-dbb493d1cd7b-S0 in 272747ns
I0229 11:08:57.027757 675 master.cpp:5369] Sending 1 offers to framework fd39ca89-d7fd-4df8-ad50-dbb493d1cd7b-0000 (default)
I0229 11:08:57.028586 675 master.cpp:5459] Sending 1 inverse offers to framework fd39ca89-d7fd-4df8-ad50-dbb493d1cd7b-0000 (default)
I0229 11:08:57.029039 675 master.cpp:5459] Sending 1 inverse offers to framework fd39ca89-d7fd-4df8-ad50-dbb493d1cd7b-0000 (default)
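The excerpt contains two "Sending 1 inverse offers" lines from master.cpp. As a rough sketch of how such duplicates could be spotted in a captured log, the helper below counts those lines. The function name `countInverseOfferLines` is hypothetical and not part of Mesos; it only assumes the message text seen in the excerpt.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper (not part of Mesos): counts the log lines in
// which the master reports sending inverse offers to a framework.
inline std::size_t countInverseOfferLines(const std::string& log)
{
  std::istringstream in(log);
  std::string line;
  std::size_t count = 0;

  while (std::getline(in, line)) {
    // Match on the message text from the excerpt. Regular offers are
    // logged as "Sending N offers to framework" and do not match.
    if (line.find("inverse offers to framework") != std::string::npos) {
      ++count;
    }
  }

  return count;
}
```

Fed the five log lines above, a count greater than one for a single maintenance window would point at the duplicate.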
The expected workflow for this test is roughly:
- The framework receives offers from master.
- The framework updates its maintenance schedule.
- The current offer is rescinded.
- A new offer is received from the master with unavailability set.
- After the agent goes for maintenance, an inverse offer is sent.
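The workflow above can be sketched as a strict sequence check, so that any extra event, such as a repeated inverse offer, is reported rather than ignored. This is only an illustration: the `Event` enum and `checkWorkflow` are hypothetical names for this sketch, not Mesos types, and the schedule update is left out because it is an action rather than an event the framework receives.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical event kinds, named for this sketch only.
enum class Event
{
  OFFER,
  RESCIND,
  OFFER_WITH_UNAVAILABILITY,
  INVERSE_OFFER
};

// Returns an empty string if `observed` matches the expected workflow
// exactly, otherwise a description of the first deviation.
inline std::string checkWorkflow(const std::vector<Event>& observed)
{
  static const std::vector<Event> expected = {
    Event::OFFER,
    Event::RESCIND,
    Event::OFFER_WITH_UNAVAILABILITY,
    Event::INVERSE_OFFER};

  for (std::size_t i = 0; i < observed.size(); ++i) {
    if (i >= expected.size()) {
      return "unexpected extra event at position " + std::to_string(i);
    }
    if (observed[i] != expected[i]) {
      return "unexpected event at position " + std::to_string(i);
    }
  }

  if (observed.size() < expected.size()) {
    return "missing event at position " + std::to_string(observed.size());
  }

  return "";
}
```

A check of this shape fails on a second inverse offer, whereas waiting only for the first one lets the duplicate go unnoticed.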
However, the logs show the master sending two inverse offers. The test still passes because it only checks that the first inverse offer is present. The duplicate can also be reproduced with a modified version of the original test:
// Test ensures that an offer will have an `unavailability` set if the
// slave is scheduled to go down for maintenance.
TEST_F(MasterMaintenanceTest, PendingUnavailabilityTest)
{
  Try<PID<Master>> master = StartMaster();
  ASSERT_SOME(master);

  MockExecutor exec(DEFAULT_EXECUTOR_ID);

  Try<PID<Slave>> slave = StartSlave(&exec);
  ASSERT_SOME(slave);

  auto scheduler = std::make_shared<MockV1HTTPScheduler>();

  EXPECT_CALL(*scheduler, heartbeat(_))
    .WillRepeatedly(Return()); // Ignore heartbeats.

  Future<Nothing> connected;
  EXPECT_CALL(*scheduler, connected(_))
    .WillOnce(FutureSatisfy(&connected))
    .WillRepeatedly(Return()); // Ignore future invocations.

  scheduler::TestV1Mesos mesos(master.get(), ContentType::PROTOBUF, scheduler);

  AWAIT_READY(connected);

  Future<Event::Subscribed> subscribed;
  EXPECT_CALL(*scheduler, subscribed(_, _))
    .WillOnce(FutureArg<1>(&subscribed));

  Future<Event::Offers> normalOffers;
  Future<Event::Offers> unavailabilityOffers;
  Future<Event::Offers> inverseOffers;
  EXPECT_CALL(*scheduler, offers(_, _))
    .WillOnce(FutureArg<1>(&normalOffers))
    .WillOnce(FutureArg<1>(&unavailabilityOffers))
    .WillOnce(FutureArg<1>(&inverseOffers));

  // The original offers should be rescinded when the unavailability is changed.
  Future<Nothing> offerRescinded;
  EXPECT_CALL(*scheduler, rescind(_, _))
    .WillOnce(FutureSatisfy(&offerRescinded));

  {
    Call call;
    call.set_type(Call::SUBSCRIBE);

    Call::Subscribe* subscribe = call.mutable_subscribe();
    subscribe->mutable_framework_info()->CopyFrom(DEFAULT_V1_FRAMEWORK_INFO);

    mesos.send(call);
  }

  AWAIT_READY(subscribed);

  v1::FrameworkID frameworkId(subscribed->framework_id());

  AWAIT_READY(normalOffers);
  EXPECT_NE(0, normalOffers->offers().size());

  // Regular offers shouldn't have unavailability.
  foreach (const v1::Offer& offer, normalOffers->offers()) {
    EXPECT_FALSE(offer.has_unavailability());
  }

  // Schedule this slave for maintenance.
  MachineID machine;
  machine.set_hostname(maintenanceHostname);
  machine.set_ip(stringify(slave.get().address.ip));

  const Time start = Clock::now() + Seconds(60);
  const Duration duration = Seconds(120);
  const Unavailability unavailability = createUnavailability(start, duration);

  // Post a valid schedule with one machine.
  maintenance::Schedule schedule = createSchedule(
      {createWindow({machine}, unavailability)});

  // We have a few seconds between the first set of offers and the
  // next allocation of offers. This should be enough time to perform
  // a maintenance schedule update. This update will also trigger the
  // rescinding of offers from the scheduled slave.
  Future<Response> response = process::http::post(
      master.get(),
      "maintenance/schedule",
      headers,
      stringify(JSON::protobuf(schedule)));

  AWAIT_EXPECT_RESPONSE_STATUS_EQ(OK().status, response);

  // The original offers should be rescinded when the unavailability
  // is changed.
  AWAIT_READY(offerRescinded);

  AWAIT_READY(unavailabilityOffers);
  EXPECT_NE(0, unavailabilityOffers->offers().size());

  // Make sure the new offers have the unavailability set.
  foreach (const v1::Offer& offer, unavailabilityOffers->offers()) {
    EXPECT_TRUE(offer.has_unavailability());
    EXPECT_EQ(
        unavailability.start().nanoseconds(),
        offer.unavailability().start().nanoseconds());

    EXPECT_EQ(
        unavailability.duration().nanoseconds(),
        offer.unavailability().duration().nanoseconds());
  }

  // We also expect an inverse offer for the slave to go under
  // maintenance.
  AWAIT_READY(inverseOffers);
  EXPECT_NE(0, inverseOffers->inverse_offers().size());

  EXPECT_CALL(exec, shutdown(_))
    .Times(AtMost(1));

  EXPECT_CALL(*scheduler, disconnected(_))
    .Times(AtMost(1));

  Shutdown(); // Must shutdown before 'containerizer' gets deallocated.
}
Separately, we need to clean up this test so that it does not expect multiple offers, i.e. remove the numberOfOffers constant.
Issue Links
- blocks MESOS-4915 Mesos 0.28.0-rc2 cherry-picks (Resolved)