Uploaded image for project: 'Mesos'
  1. Mesos
  2. MESOS-521

AllocatorTest/0.SchedulerFailover is flaky on Jenkins.

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: 0.16.0
    • Component/s: None
    • Labels:
    • Sprint:
      Q2 Sprint 5

      Description

      [ RUN ] AllocatorTest/0.SchedulerFailover
      I0624 02:55:47.214609 361 master.cpp:228] Master started on 67.195.138.8:38688
      I0624 02:55:47.215337 360 slave.cpp:219] Slave started on 91)@67.195.138.8:38688
      I0624 02:55:47.419751 360 slave.cpp:220] Slave resources: cpus=3; mem=1024; ports=[31000-32000]; disk=21413
      I0624 02:55:47.313890 361 master.cpp:243] Master ID: 201306240255-143311683-38688-32750
      I0624 02:55:47.215630 366 sched.cpp:177] New master at master@67.195.138.8:38688
      I0624 02:55:47.606534 360 slave.cpp:540] New master detected at master@67.195.138.8:38688
      W0624 02:55:47.710382 359 master.cpp:83] No whitelist given. Advertising offers for all slaves
      I0624 02:55:47.915772 362 status_update_manager.cpp:155] New master detected at master@67.195.138.8:38688
      I0624 02:55:47.710644 361 master.cpp:526] Elected as master!
      I0624 02:55:47.915735 360 slave.cpp:555] Postponing registration until recovery is complete
      I0624 02:55:47.710510 365 hierarchical_allocator_process.hpp:295] Initializing hierarchical allocator process with master : master@67.195.138.8:38688
      I0624 02:55:48.021198 361 master.cpp:569] Registering framework 201306240255-143311683-38688-32750-0000 at scheduler(82)@67.195.138.8:38688
      I0624 02:55:48.121747 360 slave.cpp:401] Finished recovery
      I0624 02:55:48.366871 364 sched.cpp:222] Framework registered with 201306240255-143311683-38688-32750-0000
      I0624 02:55:48.366956 361 hierarchical_allocator_process.hpp:327] Added framework 201306240255-143311683-38688-32750-0000
      I0624 02:55:48.472203 366 master.cpp:891] Attempting to register slave on minerva.apache.org at slave(91)@67.195.138.8:38688
      I0624 02:55:48.571797 361 hierarchical_allocator_process.hpp:705] No resources available to allocate!
      I0624 02:55:48.674368 366 master.cpp:1851] Adding slave 201306240255-143311683-38688-32750-0 at minerva.apache.org with cpus=3; mem=1024; ports=[31000-32000]; disk=21413
      I0624 02:55:48.779265 361 hierarchical_allocator_process.hpp:667] Performed allocation for 0 slaves in 207.45138ms
      I0624 02:55:48.974411 363 slave.cpp:600] Registered with master master@67.195.138.8:38688; given slave ID 201306240255-143311683-38688-32750-0
      I0624 02:55:49.080335 361 hierarchical_allocator_process.hpp:449] Added slave 201306240255-143311683-38688-32750-0 (minerva.apache.org) with cpus=3; mem=1024; ports=[31000-32000]; disk=21413 (and cpus=3; mem=1024; ports=[31000-32000]; disk=21413 available)
      I0624 02:55:49.184074 361 hierarchical_allocator_process.hpp:727] Offering cpus=3; mem=1024; ports=[31000-32000]; disk=21413 on slave 201306240255-143311683-38688-32750-0 to framework 201306240255-143311683-38688-32750-0000
      I0624 02:55:49.292635 361 hierarchical_allocator_process.hpp:687] Performed allocation for slave 201306240255-143311683-38688-32750-0 in 108.576789ms
      I0624 02:55:49.292698 363 master.hpp:313] Adding offer 201306240255-143311683-38688-32750-0 with resources cpus=3; mem=1024; ports=[31000-32000]; disk=21413 on slave 201306240255-143311683-38688-32750-0
      I0624 02:55:49.397990 361 hierarchical_allocator_process.hpp:667] Performed allocation for 1 slaves in 35.152us
      I0624 02:55:49.503993 363 master.cpp:1239] Sending 1 offers to framework 201306240255-143311683-38688-32750-0000
      I0624 02:55:49.713198 363 master.cpp:1472] Processing reply for offer 201306240255-143311683-38688-32750-0 on slave 201306240255-143311683-38688-32750-0 (minerva.apache.org) for framework 201306240255-143311683-38688-32750-0000
      I0624 02:55:49.817643 363 master.hpp:291] Adding task 0 with resources cpus=1; mem=256 on slave 201306240255-143311683-38688-32750-0
      I0624 02:55:49.920374 363 master.cpp:1591] Launching task 0 of framework 201306240255-143311683-38688-32750-0000 with resources cpus=1; mem=256 on slave 201306240255-143311683-38688-32750-0 (minerva.apache.org)
      I0624 02:55:49.920603 363 master.hpp:323] Removing offer 201306240255-143311683-38688-32750-0 with resources cpus=3; mem=1024; ports=[31000-32000]; disk=21413 on slave 201306240255-143311683-38688-32750-0
      I0624 02:55:49.920593 366 slave.cpp:740] Got assigned task 0 for framework 201306240255-143311683-38688-32750-0000
      I0624 02:55:49.920693 359 hierarchical_allocator_process.hpp:526] Framework 201306240255-143311683-38688-32750-0000 left cpus=2; mem=768; ports=[31000-32000]; disk=21413 unused on slave 201306240255-143311683-38688-32750-0
      I0624 02:55:50.127504 366 slave.cpp:838] Launching task 0 for framework 201306240255-143311683-38688-32750-0000
      I0624 02:55:50.332365 366 paths.hpp:303] Created executor directory '/tmp/AllocatorTest_0_SchedulerFailover_plfc9z/slaves/201306240255-143311683-38688-32750-0/frameworks/201306240255-143311683-38688-32750-0000/executors/default/runs/61d60eb1-04f6-4ea6-97d9-02a38c9974d1'
      I0624 02:55:50.332571 366 slave.cpp:949] Queuing task '0' for executor default of framework '201306240255-143311683-38688-32750-0000
      I0624 02:55:50.332744 359 exec.cpp:170] Executor started at: executor(31)@67.195.138.8:38688 with pid 32750
      I0624 02:55:50.435502 366 slave.cpp:522] Successfully attached file '/tmp/AllocatorTest_0_SchedulerFailover_plfc9z/slaves/201306240255-143311683-38688-32750-0/frameworks/201306240255-143311683-38688-32750-0000/executors/default/runs/61d60eb1-04f6-4ea6-97d9-02a38c9974d1'
      I0624 02:55:50.606742 364 hierarchical_allocator_process.hpp:727] Offering cpus=2; mem=768; ports=[31000-32000]; disk=21413 on slave 201306240255-143311683-38688-32750-0 to framework 201306240255-143311683-38688-32750-0000
      I0624 02:55:50.643476 366 slave.cpp:1396] Got registration for executor 'default' of framework 201306240255-143311683-38688-32750-0000
      I0624 02:55:50.752272 366 slave.cpp:1511] Flushing queued task 0 for executor 'default' of framework 201306240255-143311683-38688-32750-0000
      I0624 02:55:50.752284 364 hierarchical_allocator_process.hpp:667] Performed allocation for 1 slaves in 145.598165ms
      I0624 02:55:50.752336 359 exec.cpp:194] Executor registered on slave 201306240255-143311683-38688-32750-0
      I0624 02:55:50.752429 360 master.hpp:313] Adding offer 201306240255-143311683-38688-32750-1 with resources cpus=2; mem=768; ports=[31000-32000]; disk=21413 on slave 201306240255-143311683-38688-32750-0
      I0624 02:55:50.790041 359 exec.cpp:258] Executor asked to run task '0'
      I0624 02:55:50.790209 360 master.cpp:1239] Sending 1 offers to framework 201306240255-143311683-38688-32750-0000
      I0624 02:55:50.790595 362 sched.cpp:427] Stopping framework '201306240255-143311683-38688-32750-0000'
      I0624 02:55:50.790699 364 master.cpp:1472] Processing reply for offer 201306240255-143311683-38688-32750-1 on slave 201306240255-143311683-38688-32750-0 (minerva.apache.org) for framework 201306240255-143311683-38688-32750-0000
      ../../src/tests/allocator_tests.cpp:605: Failure
      Mock function called more times than expected - taking default action specified at:
      ../../src/tests/mesos.hpp:361:
      Function call: resourcesUnused(@0x2af7a8009158 201306240255-143311683-38688-32750-0000, @0x2af7a8009138 201306240255-143311683-38688-32750-0, @0x2af7a8009100

      { cpus=2, mem=768, ports=[31000-32000], disk=21413 }

      , @0x2af7a80090f0 16-byte object <00-00 00-00 F7-2A 00-00 50-CD 00-A8 F7-2A 00-00>)
      Expected: to be called once
      Actual: called twice - over-saturated and active
      I0624 02:55:50.895730 364 master.hpp:323] Removing offer 201306240255-143311683-38688-32750-1 with resources cpus=2; mem=768; ports=[31000-32000]; disk=21413 on slave 201306240255-143311683-38688-32750-0
      I0624 02:55:50.895859 365 hierarchical_allocator_process.hpp:526] Framework 201306240255-143311683-38688-32750-0000 left cpus=2; mem=768; ports=[31000-32000]; disk=21413 unused on slave 201306240255-143311683-38688-32750-0
      I0624 02:55:50.996470 364 master.cpp:396] Framework 201306240255-143311683-38688-32750-0000 disconnected
      I0624 02:55:51.098886 365 hierarchical_allocator_process.hpp:569] Framework 201306240255-143311683-38688-32750-0000 filtered slave 201306240255-143311683-38688-32750-0 for 5secs
      I0624 02:55:51.203402 364 master.cpp:423] Giving framework 201306240255-143311683-38688-32750-0000 100ms to failover
      I0624 02:55:51.204043 359 sched.cpp:177] New master at master@67.195.138.8:38688
      I0624 02:55:51.307276 365 hierarchical_allocator_process.hpp:412] Deactivated framework 201306240255-143311683-38688-32750-0000
      I0624 02:55:51.410742 359 master.cpp:604] Re-registering framework 201306240255-143311683-38688-32750-0000 at scheduler(83)@67.195.138.8:38688
      I0624 02:55:51.510154 359 master.cpp:623] Framework 201306240255-143311683-38688-32750-0000 failed over
      I0624 02:55:51.610339 366 sched.cpp:222] Framework registered with 201306240255-143311683-38688-32750-0000
      I0624 02:55:51.610365 362 hierarchical_allocator_process.hpp:382] Activated framework 201306240255-143311683-38688-32750-0000
      I0624 02:55:51.610366 363 slave.cpp:1278] Updating framework 201306240255-143311683-38688-32750-0000 pid to scheduler(83)@67.195.138.8:38688
      I0624 02:55:51.898102 362 hierarchical_allocator_process.hpp:727] Offering cpus=2; mem=768; ports=[31000-32000]; disk=21413 on slave 201306240255-143311683-38688-32750-0 to framework 201306240255-143311683-38688-32750-0000
      I0624 02:55:51.933804 362 hierarchical_allocator_process.hpp:667] Performed allocation for 1 slaves in 35.745082ms
      I0624 02:55:51.933874 364 master.hpp:313] Adding offer 201306240255-143311683-38688-32750-2 with resources cpus=2; mem=768; ports=[31000-32000]; disk=21413 on slave 201306240255-143311683-38688-32750-0
      I0624 02:55:52.039295 362 hierarchical_allocator_process.hpp:667] Performed allocation for 1 slaves in 13.376us
      I0624 02:55:52.139583 364 master.cpp:1239] Sending 1 offers to framework 201306240255-143311683-38688-32750-0000
      I0624 02:55:52.139994 366 sched.cpp:427] Stopping framework '201306240255-143311683-38688-32750-0000'
      I0624 02:55:52.140022 32750 master.cpp:385] Master terminating
      I0624 02:55:52.339004 32750 master.cpp:207] Shutting down master
      I0624 02:55:52.339058 364 slave.cpp:496] Slave asked to shut down by master@67.195.138.8:38688
      I0624 02:55:52.540321 364 slave.cpp:1113] Asked to shut down framework 201306240255-143311683-38688-32750-0000 by master@67.195.138.8:38688
      I0624 02:55:52.440243 363 hierarchical_allocator_process.hpp:412] Deactivated framework 201306240255-143311683-38688-32750-0000
      I0624 02:55:52.440027 32750 master.hpp:303] Removing task 0 with resources cpus=1; mem=256 on slave 201306240255-143311683-38688-32750-0
      I0624 02:55:52.682425 364 slave.cpp:1138] Shutting down framework 201306240255-143311683-38688-32750-0000
      I0624 02:55:52.996059 364 slave.cpp:2329] Shutting down executor 'default' of framework 201306240255-143311683-38688-32750-0000
      I0624 02:55:52.894083 362 hierarchical_allocator_process.hpp:616] Recovered cpus=1; mem=256 (total allocatable: cpus=1; mem=256; ports=[]; disk=0) on slave 201306240255-143311683-38688-32750-0 from framework 201306240255-143311683-38688-32750-0000
      W0624 02:55:52.916483 359 master.cpp:83] No whitelist given. Advertising offers for all slaves
      I0624 02:55:52.893936 32750 master.hpp:323] Removing offer 201306240255-143311683-38688-32750-2 with resources cpus=2; mem=768; ports=[31000-32000]; disk=21413 on slave 201306240255-143311683-38688-32750-0
      I0624 02:55:52.996163 364 slave.cpp:1885] master@67.195.138.8:38688 exited
      I0624 02:55:52.996206 366 exec.cpp:323] Executor asked to shutdown
      I0624 02:55:53.098610 362 hierarchical_allocator_process.hpp:616] Recovered cpus=2; mem=768; ports=[31000-32000]; disk=21413 (total allocatable: cpus=3; mem=1024; ports=[31000-32000]; disk=21413) on slave 201306240255-143311683-38688-32750-0 from framework 201306240255-143311683-38688-32750-0000
      W0624 02:55:53.408087 364 slave.cpp:1888] Master disconnected! Waiting for a new master to be elected
      I0624 02:55:53.611656 362 hierarchical_allocator_process.hpp:667] Performed allocation for 1 slaves in 15.485us
      I0624 02:55:53.611745 364 slave.cpp:451] Slave terminating
      I0624 02:55:53.743916 362 hierarchical_allocator_process.hpp:367] Removed framework 201306240255-143311683-38688-32750-0000
      I0624 02:55:53.846712 364 slave.cpp:1113] Asked to shut down framework 201306240255-143311683-38688-32750-0000 by @0.0.0.0:0
      I0624 02:55:53.946197 362 hierarchical_allocator_process.hpp:477] Removed slave 201306240255-143311683-38688-32750-0
      W0624 02:55:54.046665 364 slave.cpp:1134] Ignoring shutdown framework 201306240255-143311683-38688-32750-0000 because it is terminating
      [ FAILED ] AllocatorTest/0.SchedulerFailover, where TypeParam = mesos::internal::master::allocator::HierarchicalAllocatorProcess<mesos::internal::master::allocator::DRFSorter, mesos::internal::master::allocator::DRFSorter> (7036 ms)

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                vinodkone Vinod Kone
                Reporter:
                bmahler Benjamin Mahler
              • Votes:
                0 Vote for this issue
                Watchers:
                1 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: