Uploaded image for project: 'Mesos'
  1. Mesos
  2. MESOS-395

FaultToleranceTest.SchedulerFailoverFrameworkMessage test is flaky.

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 0.13.0
    • None
    • None

    Description

      This failed on Jenkins, see output below.

      [ RUN ] FaultToleranceTest.SchedulerFailoverFrameworkMessage
      I0314 04:28:38.647588 15531 master.cpp:309] Master started on 67.195.138.60:53734
      I0314 04:28:38.647660 15531 master.cpp:324] Master ID: 201303140428-1015726915-53734-15499
      I0314 04:28:38.648110 15529 slave.cpp:202] Slave started on 53)@67.195.138.60:53734
      I0314 04:28:38.648202 15530 hierarchical_allocator_process.hpp:234] Initializing hierarchical allocator process with master : master@67.195.138.60:53734
      W0314 04:28:38.648231 15525 master.cpp:79] No whitelist given. Advertising offers for all slaves
      I0314 04:28:38.648370 15531 master.cpp:553] Elected as master!
      I0314 04:28:38.648437 15524 sched.cpp:182] New master at master@67.195.138.60:53734
      I0314 04:28:38.648470 15529 slave.cpp:203] Slave resources: cpus=2; mem=1024; ports=[31000-32000]; disk=1024
      I0314 04:28:38.651875 15527 master.cpp:596] Registering framework 201303140428-1015726915-53734-15499-0000 at scheduler(42)@67.195.138.60:53734
      I0314 04:28:38.652915 15524 sched.cpp:217] Framework registered with 201303140428-1015726915-53734-15499-0000
      I0314 04:28:38.652926 15527 hierarchical_allocator_process.hpp:266] Added framework 201303140428-1015726915-53734-15499-0000
      I0314 04:28:38.653851 15527 hierarchical_allocator_process.hpp:666] No resources available to allocate!
      I0314 04:28:38.655067 15527 hierarchical_allocator_process.hpp:597] Performed allocation for 0 slaves in 1.22ms
      I0314 04:28:38.652744 15529 slave.cpp:468] New master detected at master@67.195.138.60:53734
      I0314 04:28:38.656069 15529 slave.cpp:380] Finished recovery
      I0314 04:28:38.656072 15530 status_update_manager.cpp:131] New master detected at master@67.195.138.60:53734
      I0314 04:28:38.656733 15527 master.cpp:918] Attempting to register slave on janus.apache.org at slave(53)@67.195.138.60:53734
      I0314 04:28:38.657591 15527 master.cpp:1153] Master now considering a slave at janus.apache.org:53734 as active
      I0314 04:28:38.658236 15527 master.cpp:1729] Adding slave 201303140428-1015726915-53734-15499-0 at janus.apache.org with cpus=2; mem=1024; ports=[31000-32000]; disk=1024
      I0314 04:28:38.658807 15529 slave.cpp:502] Registered with master; given slave ID 201303140428-1015726915-53734-15499-0
      I0314 04:28:38.658917 15526 hierarchical_allocator_process.hpp:393] Added slave 201303140428-1015726915-53734-15499-0 (janus.apache.org) with cpus=2; mem=1024; ports=[31000-32000]; disk=1024 (and cpus=2; mem=1024; ports=[31000-32000]; disk=1024 available)
      I0314 04:28:38.660892 15526 hierarchical_allocator_process.hpp:658] Found available resources: cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on slave 201303140428-1015726915-53734-15499-0
      I0314 04:28:38.662082 15526 hierarchical_allocator_process.hpp:684] Offering cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on slave 201303140428-1015726915-53734-15499-0 to framework 201303140428-1015726915-53734-15499-0000
      I0314 04:28:38.662931 15526 hierarchical_allocator_process.hpp:617] Performed allocation for slave 201303140428-1015726915-53734-15499-0 in 2.06ms
      I0314 04:28:38.662966 15527 master.hpp:305] Adding offer with resources cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on slave 201303140428-1015726915-53734-15499-0
      I0314 04:28:38.663732 15527 master.cpp:1256] Sending 1 offers to framework 201303140428-1015726915-53734-15499-0000
      I0314 04:28:38.664222 15531 sched.cpp:282] Received 1 offers
      I0314 04:28:38.665107 15528 master.cpp:1463] Processing reply for offer 201303140428-1015726915-53734-15499-0 on slave 201303140428-1015726915-53734-15499-0 (janus.apache.org) for framework 201303140428-1015726915-53734-15499-0000
      I0314 04:28:38.665501 15528 master.hpp:285] Adding task with resources cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on slave 201303140428-1015726915-53734-15499-0
      I0314 04:28:38.665874 15528 master.cpp:1580] Launching task 1 of framework 201303140428-1015726915-53734-15499-0000 with resources cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on slave 201303140428-1015726915-53734-15499-0 (janus.apache.org)
      I0314 04:28:38.666404 15525 slave.cpp:614] Got assigned task 1 for framework 201303140428-1015726915-53734-15499-0000
      I0314 04:28:38.666474 15528 master.hpp:314] Removing offer with resources cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on slave 201303140428-1015726915-53734-15499-0
      I0314 04:28:38.668900 15525 paths.hpp:302] Created executor directory '/tmp/FaultToleranceTest_SchedulerFailoverFrameworkMessage_A2apii/slaves/201303140428-1015726915-53734-15499-0/frameworks/201303140428-1015726915-53734-15499-0000/executors/default/runs/565e63a1-a4dd-4d93-8c13-6e85a20ec2de'
      I0314 04:28:38.669179 15527 slave.cpp:451] Successfully attached file '/tmp/FaultToleranceTest_SchedulerFailoverFrameworkMessage_A2apii/slaves/201303140428-1015726915-53734-15499-0/frameworks/201303140428-1015726915-53734-15499-0000/executors/default/runs/565e63a1-a4dd-4d93-8c13-6e85a20ec2de'
      I0314 04:28:38.669302 15525 exec.cpp:170] Executor started at: executor(17)@67.195.138.60:53734 with pid 15499
      I0314 04:28:38.670753 15525 slave.cpp:1008] Got registration for executor 'default' of framework 201303140428-1015726915-53734-15499-0000
      I0314 04:28:38.671393 15525 slave.cpp:1083] Flushing queued tasks for framework 201303140428-1015726915-53734-15499-0000
      I0314 04:28:38.671438 15528 exec.cpp:194] Executor registered on slave 201303140428-1015726915-53734-15499-0
      I0314 04:28:38.672396 15528 exec.cpp:258] Executor asked to run task '1'
      I0314 04:28:38.672799 15528 exec.cpp:382] Executor sending status update for task 1 in state TASK_RUNNING
      I0314 04:28:38.674502 15524 slave.cpp:1191] Handling status update TASK_RUNNING from task 1 of framework 201303140428-1015726915-53734-15499-0000
      I0314 04:28:38.674540 15524 slave.cpp:1238] Forwarding status update TASK_RUNNING from task 1 of framework 201303140428-1015726915-53734-15499-0000 to the status update manager
      I0314 04:28:38.675204 15528 status_update_manager.cpp:253] Received status update TASK_RUNNING from task 1 of framework 201303140428-1015726915-53734-15499-0000
      I0314 04:28:38.675647 15528 status_update_manager.cpp:402] Creating StatusUpdate stream for task 1 of framework 201303140428-1015726915-53734-15499-0000
      I0314 04:28:38.676182 15528 status_update_manager.hpp:314] Handling UPDATE for status update TASK_RUNNING from task 1 of framework 201303140428-1015726915-53734-15499-0000
      I0314 04:28:38.676656 15528 status_update_manager.cpp:288] Forwarding status update TASK_RUNNING from task 1 of framework 201303140428-1015726915-53734-15499-0000 to the master at master@67.195.138.60:53734
      I0314 04:28:38.677193 15531 master.cpp:1016] Status update from (266)@67.195.138.60:53734: task 1 of framework 201303140428-1015726915-53734-15499-0000 is now in state TASK_RUNNING
      I0314 04:28:38.677217 15529 slave.cpp:1300] Sending ACK for status update TASK_RUNNING from task 1 of framework 201303140428-1015726915-53734-15499-0000 to executor executor(17)@67.195.138.60:53734
      I0314 04:28:38.677693 15527 sched.cpp:327] Status update: task 1 of framework 201303140428-1015726915-53734-15499-0000 is now in state TASK_RUNNING
      I0314 04:28:38.678215 15531 exec.cpp:289] Executor received ACK for status update of task 1 of framework 201303140428-1015726915-53734-15499-0000
      I0314 04:28:38.679352 15527 slave.cpp:967] Got acknowledgement of status update for task 1 of framework 201303140428-1015726915-53734-15499-0000
      I0314 04:28:38.679553 15529 sched.cpp:182] New master at master@67.195.138.60:53734
      I0314 04:28:38.680408 15527 status_update_manager.cpp:313] Received status update acknowledgement for task 1 of framework 201303140428-1015726915-53734-15499-0000
      I0314 04:28:38.681324 15527 status_update_manager.hpp:314] Handling ACK for status update TASK_RUNNING from task 1 of framework 201303140428-1015726915-53734-15499-0000
      I0314 04:28:38.680902 15528 master.cpp:631] Re-registering framework 201303140428-1015726915-53734-15499-0000 at scheduler(43)@67.195.138.60:53734
      I0314 04:28:38.682379 15528 master.cpp:650] Framework 201303140428-1015726915-53734-15499-0000 failed over
      I0314 04:28:38.683537 15527 sched.cpp:413] Got error 'Framework failed over'
      I0314 04:28:38.684315 15527 sched.cpp:446] Aborting framework '201303140428-1015726915-53734-15499-0000'
      W0314 04:28:38.684795 15527 master.cpp:748] scheduler(42)@67.195.138.60:53734 tried to deactivate framework; expecting scheduler(43)@67.195.138.60:53734
      I0314 04:28:38.683559 15529 sched.cpp:217] Framework registered with 201303140428-1015726915-53734-15499-0000
      I0314 04:28:38.683568 15528 slave.cpp:954] Updating framework 201303140428-1015726915-53734-15499-0000 pid to scheduler(43)@67.195.138.60:53734
      I0314 04:28:38.686815 15528 slave.cpp:1328] Sending message for framework 201303140428-1015726915-53734-15499-0000 to scheduler(43)@67.195.138.60:53734
      I0314 04:28:38.688077 15528 sched.cpp:401] Received framework message
      I0314 04:28:38.688607 15499 slave.cpp:389] Slave terminating
      I0314 04:28:38.688612 15526 sched.cpp:422] Stopping framework '201303140428-1015726915-53734-15499-0000'
      I0314 04:28:38.688647 15524 sched.cpp:422] Stopping framework '201303140428-1015726915-53734-15499-0000'
      I0314 04:28:38.689035 15499 slave.cpp:889] Asked to shut down framework 201303140428-1015726915-53734-15499-0000 by @0.0.0.0:0
      I0314 04:28:38.689618 15530 master.cpp:724] Asked to unregister framework 201303140428-1015726915-53734-15499-0000
      W0314 04:28:38.691184 15530 master.cpp:731] scheduler(42)@67.195.138.60:53734 tried to unregister framework; expecting scheduler(43)@67.195.138.60:53734
      I0314 04:28:38.690610 15499 slave.cpp:894] Shutting down framework 201303140428-1015726915-53734-15499-0000
      I0314 04:28:38.693922 15499 slave.cpp:1612] Shutting down executor 'default' of framework 201303140428-1015726915-53734-15499-0000
      I0314 04:28:38.692955 15530 master.cpp:724] Asked to unregister framework 201303140428-1015726915-53734-15499-0000
      I0314 04:28:38.695543 15527 hierarchical_allocator_process.hpp:357] Deactivated framework 201303140428-1015726915-53734-15499-0000
      I0314 04:28:38.695054 15531 status_update_manager.cpp:232] Closing status update streams for framework 201303140428-1015726915-53734-15499-0000
      I0314 04:28:38.695585 15530 master.hpp:296] Removing task with resources cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on slave 201303140428-1015726915-53734-15499-0
      I0314 04:28:38.697087 15530 master.cpp:477] Master terminating
      I0314 04:28:38.696569 15531 status_update_manager.cpp:433] Cleaning up status update stream for task 1 of framework 201303140428-1015726915-53734-15499-0000
      I0314 04:28:38.695039 15525 exec.cpp:321] Executor asked to shutdown
      pure virtual method called
      terminate called without an active exception
      I0314 04:28:38.697149 15526 hierarchical_allocator_process.hpp:542] Recovered cpus=2; mem=1024; ports=[31000-32000]; disk=1024 (total allocatable: cpus=2; mem=1024; ports=[31000-32000]; disk=1024) on slave 201303140428-1015726915-53734-15499-0 from framework 201303140428-1015726915-53734-15499-0000
      /bin/bash: line 5: 15499 Aborted ${dir}$tst
      FAIL: mesos-tests

      Attachments

        Activity

          People

            benjaminhindman Benjamin Hindman
            bmahler Benjamin Mahler
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: