Uploaded image for project: 'Mesos'
  1. Mesos
  2. MESOS-9496

Test `MasterTest.IgnoreOldAgentRegistration` is flaky.

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.5.0, 1.5.1, 1.6.0, 1.6.1, 1.7.0
    • Fix Version/s: None
    • Component/s: test

      Description

      I1219 22:46:50.577191 12211 slave.cpp:261] Mesos agent started on (297)@172.16.10.209:35131
      ...
      I1219 22:46:50.583889 12211 slave.cpp:1884] Will retry registration in 1.9067ms if necessary
      I1219 22:46:50.584089 12213 slave.cpp:1884] Will retry registration in 17.254638ms if necessary
      I1219 22:46:50.584141 12213 master.cpp:6125] Received register agent message from slave(297)@172.16.10.209:35131 (ip-172-16-10-209.ec2.internal)
      ...
      I1219 22:46:50.584295 12213 master.cpp:6294] Registering agent at slave(297)@172.16.10.209:35131 (ip-172-16-10-209.ec2.internal) with id 651a538c-b971-4845-8118-b02143b6893d-S0
      I1219 22:46:50.584398 12213 registrar.cpp:495] Applied 1 operations in 33155ns; attempting to update the registry
      I1219 22:46:50.584533 12211 registrar.cpp:552] Successfully updated the registry in 0ns
      I1219 22:46:50.584583 12213 master.cpp:6342] Admitted agent 651a538c-b971-4845-8118-b02143b6893d-S0 at slave(297)@172.16.10.209:35131 (ip-172-16-10-209.ec2.internal)
      I1219 22:46:50.584719 12213 master.cpp:6391] Registered agent 651a538c-b971-4845-8118-b02143b6893d-S0 at slave(297)@172.16.10.209:35131 (ip-172-16-10-209.ec2.internal) with cpus:2; mem:1024; disk:1024; ports:[31000-32000]
      I1219 22:46:50.584772 12213 slave.cpp:1486] Registered with master master@172.16.10.209:35131; given agent ID 651a538c-b971-4845-8118-b02143b6893d-S0
      ...
      I1219 22:46:50.586488 12211 master.cpp:6125] Received register agent message from slave(297)@172.16.10.209:35131 (ip-172-16-10-209.ec2.internal)
      ...
      W1219 22:46:50.586627 12211 master.cpp:6220] Ignoring registration attempt from old agent at slave(297)@172.16.10.209:35131: agent version is 0.28.1-rc1, minimum supported agent version is 1.0.0
      ...
      I1219 22:46:50.593817 12188 slave.cpp:921] Agent terminating
      I1219 22:46:50.593991 12215 slave.cpp:261] Mesos agent started on (298)@172.16.10.209:35131
      ...
      I1219 22:46:50.604478 12213 slave.cpp:1884] Will retry registration in 8.287701ms if necessary
      I1219 22:46:50.604524 12210 master.cpp:6467] Received re-register agent message from agent 651a538c-b971-4845-8118-b02143b6893d-S0 at slave(298)@172.16.10.209:35131 (ip-172-16-10-209.ec2.internal)
      ...
      I1219 22:46:50.604694 12210 master.cpp:6634] Agent is already marked as registered: 651a538c-b971-4845-8118-b02143b6893d-S0 at slave(298)@172.16.10.209:35131 (ip-172-16-10-209.ec2.internal)
      I1219 22:46:50.604763 12210 master.cpp:6989] Registry updated for slave 651a538c-b971-4845-8118-b02143b6893d-S0 at slave(298)@172.16.10.209:35131(ip-172-16-10-209.ec2.internal)
      ...
      I1219 22:46:50.605032 12211 slave.cpp:1589] Re-registered with master master@172.16.10.209:35131
      ...
      ../../src/tests/master_tests.cpp:8427: Failure
      Failed to wait 15secs for slaveRegisteredMessage
      ...
      ../../src/tests/master_tests.cpp:8418: Failure
      Actual function call count doesn't match EXPECT_CALL(*master.get()->registrar, apply(_))...
      Expected: to be called once
      Actual: never called - unsatisfied and active
      ../../3rdparty/libprocess/include/process/gmock.hpp:467: Failure
      Actual function call count doesn't match EXPECT_CALL(filter->mock, filter(testing::A<const MessageEvent&>()))...
      Expected args: message matcher (8-byte object <88-BE 07-88 17-7F 00-00>, 1-byte object <BE>, 1-byte object <88>)
      Expected: to be called once
      Actual: never called - unsatisfied and active

      This is because the retried registration arrives the master before we pass the altered registration message to the master, making the restarted agent trigger a reregistration instead of a registration.

      Full log: mesos-ec2-centos-6-SSL.Mesos.MasterTest.IgnoreOldAgentRegistration-badrun.txt

        Attachments

          Activity

            People

            • Assignee:
              Unassigned
              Reporter:
              chhsia0 Chun-Hung Hsiao
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated: