Uploaded image for project: 'Mesos'
  1. Mesos
  2. MESOS-985

FaultToleranceTest.IgnoreKillTaskFromUnregisteredFramework is flaky

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 0.18.0
    • 0.18.0
    • None
    • None

    Description

      [ RUN ] FaultToleranceTest.IgnoreKillTaskFromUnregisteredFramework
      I0210 16:47:28.314275 27200 master.cpp:285] Master started on 127.0.0.1:33802
      I0210 16:47:28.314314 27200 master.cpp:300] Master ID: 201402101647-16777343-33802-27175
      I0210 16:47:28.314322 27200 master.cpp:303] Master only allowing authenticated frameworks to register!
      I0210 16:47:28.314806 27200 hierarchical_allocator_process.hpp:302] Initializing hierarchical allocator process with master : master@127.0.0.1:33802
      I0210 16:47:28.314843 27200 master.cpp:86] No whitelist given. Advertising offers for all slaves
      I0210 16:47:28.314975 27200 master.cpp:738] The newly elected leader is master@127.0.0.1:33802
      I0210 16:47:28.314986 27200 master.cpp:746] Elected as the leading master!
      I0210 16:47:28.316339 27199 slave.cpp:112] Slave started on 110)@127.0.0.1:33802
      I0210 16:47:28.316421 27199 slave.cpp:212] Slave resources: cpus:2; mem:1024; disk:1024; ports:[31000-32000]
      I0210 16:47:28.316964 27199 state.cpp:33] Recovering state from '/tmp/FaultToleranceTest_IgnoreKillTaskFromUnregisteredFramework_oWtPdM/meta'
      I0210 16:47:28.317072 27199 sched.cpp:210] New master detected at master@127.0.0.1:33802
      I0210 16:47:28.317091 27199 sched.cpp:261] Authenticating with master master@127.0.0.1:33802
      I0210 16:47:28.317131 27199 sched.cpp:230] Detecting new master
      I0210 16:47:28.317159 27199 status_update_manager.cpp:188] Recovering status update manager
      I0210 16:47:28.317206 27199 authenticatee.hpp:124] Creating new client SASL connection
      I0210 16:47:28.317301 27199 master.cpp:2299] Authenticating framework at scheduler(91)@127.0.0.1:33802
      I0210 16:47:28.317376 27199 slave.cpp:2753] Finished recovery
      I0210 16:47:28.317675 27197 authenticator.hpp:140] Creating new server SASL connection
      I0210 16:47:28.317738 27197 authenticatee.hpp:212] Received SASL authentication mechanisms: CRAM-MD5
      I0210 16:47:28.317751 27197 authenticatee.hpp:238] Attempting to authenticate with mechanism 'CRAM-MD5'
      I0210 16:47:28.317775 27197 authenticator.hpp:243] Received SASL authentication start
      I0210 16:47:28.317814 27197 authenticator.hpp:325] Authentication requires more steps
      I0210 16:47:28.317839 27197 authenticatee.hpp:258] Received SASL authentication step
      I0210 16:47:28.317872 27197 authenticator.hpp:271] Received SASL authentication step
      I0210 16:47:28.317888 27197 auxprop.cpp:81] Request to lookup properties for user: 'test-principal' realm: 'localhost.localdomain' server FQDN: 'localhost.localdomain' SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false
      I0210 16:47:28.317895 27197 auxprop.cpp:153] Looking up auxiliary property '*userPassword'
      I0210 16:47:28.317905 27197 auxprop.cpp:153] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
      I0210 16:47:28.317914 27197 auxprop.cpp:81] Request to lookup properties for user: 'test-principal' realm: 'localhost.localdomain' server FQDN: 'localhost.localdomain' SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true
      I0210 16:47:28.317920 27197 auxprop.cpp:103] Skipping auxiliary property '*userPassword' since SASL_AUXPROP_AUTHZID == true
      I0210 16:47:28.317926 27197 auxprop.cpp:103] Skipping auxiliary property '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
      I0210 16:47:28.317937 27197 authenticator.hpp:317] Authentication success
      I0210 16:47:28.317961 27197 authenticatee.hpp:298] Authentication success
      I0210 16:47:28.317982 27197 master.cpp:2339] Successfully authenticated framework at scheduler(91)@127.0.0.1:33802
      I0210 16:47:28.318042 27197 sched.cpp:335] Successfully authenticated with master master@127.0.0.1:33802
      I0210 16:47:28.318088 27197 master.cpp:794] Received registration request from scheduler(91)@127.0.0.1:33802
      I0210 16:47:28.318109 27197 master.cpp:812] Registering framework 201402101647-16777343-33802-27175-0000 at scheduler(91)@127.0.0.1:33802
      I0210 16:47:28.318158 27197 sched.cpp:384] Framework registered with 201402101647-16777343-33802-27175-0000
      I0210 16:47:28.318179 27197 sched.cpp:398] Scheduler::registered took 8273ns
      I0210 16:47:28.318207 27197 hierarchical_allocator_process.hpp:332] Added framework 201402101647-16777343-33802-27175-0000
      I0210 16:47:28.318215 27197 hierarchical_allocator_process.hpp:726] No resources available to allocate!
      I0210 16:47:28.318222 27197 hierarchical_allocator_process.hpp:688] Performed allocation for 0 slaves in 6478ns
      I0210 16:47:28.318552 27199 slave.cpp:501] New master detected at master@127.0.0.1:33802
      I0210 16:47:28.318971 27199 slave.cpp:526] Detecting new master
      I0210 16:47:28.319080 27199 status_update_manager.cpp:162] New master detected at master@127.0.0.1:33802
      I0210 16:47:28.319322 27199 master.cpp:1816] Attempting to register slave on localhost.localdomain at slave(110)@127.0.0.1:33802
      I0210 16:47:28.319632 27199 master.cpp:2786] Adding slave 201402101647-16777343-33802-27175-0 at localhost.localdomain with cpus:2; mem:1024; disk:1024; ports:[31000-32000]
      I0210 16:47:28.319936 27197 slave.cpp:544] Registered with master master@127.0.0.1:33802; given slave ID 201402101647-16777343-33802-27175-0
      I0210 16:47:28.320317 27197 hierarchical_allocator_process.hpp:445] Added slave 201402101647-16777343-33802-27175-0 (localhost.localdomain) with cpus:2; mem:1024; disk:1024; ports:[31000-32000] (and cpus:2; mem:1024; disk:1024; ports:[31000-32000] available)
      I0210 16:47:28.320374 27197 hierarchical_allocator_process.hpp:752] Offering cpus:2; mem:1024; disk:1024; ports:[31000-32000] on slave 201402101647-16777343-33802-27175-0 to framework 201402101647-16777343-33802-27175-0000
      I0210 16:47:28.320463 27197 hierarchical_allocator_process.hpp:708] Performed allocation for slave 201402101647-16777343-33802-27175-0 in 109544ns
      I0210 16:47:28.320691 27199 master.hpp:431] Adding offer 201402101647-16777343-33802-27175-0 with resources cpus:2; mem:1024; disk:1024; ports:[31000-32000] on slave 201402101647-16777343-33802-27175-0 (localhost.localdomain)
      I0210 16:47:28.320945 27199 master.cpp:2254] Sending 1 offers to framework 201402101647-16777343-33802-27175-0000
      I0210 16:47:28.321365 27198 sched.cpp:518] Scheduler::resourceOffers took 127195ns
      I0210 16:47:28.321665 27199 master.hpp:441] Removing offer 201402101647-16777343-33802-27175-0 with resources cpus:2; mem:1024; disk:1024; ports:[31000-32000] on slave 201402101647-16777343-33802-27175-0 (localhost.localdomain)
      I0210 16:47:28.321925 27199 master.cpp:1550] Processing reply for offers: [ 201402101647-16777343-33802-27175-0 ] on slave 201402101647-16777343-33802-27175-0 (localhost.localdomain) for framework 201402101647-16777343-33802-27175-0000
      I0210 16:47:28.322175 27199 master.hpp:403] Adding task 0 with resources cpus:1; mem:512 on slave 201402101647-16777343-33802-27175-0 (localhost.localdomain)
      I0210 16:47:28.322388 27199 master.cpp:2423] Launching task 0 of framework 201402101647-16777343-33802-27175-0000 with resources cpus:1; mem:512 on slave 201402101647-16777343-33802-27175-0 (localhost.localdomain)
      I0210 16:47:28.322751 27196 slave.cpp:729] Got assigned task 0 for framework 201402101647-16777343-33802-27175-0000
      I0210 16:47:28.322907 27196 slave.cpp:838] Launching task 0 for framework 201402101647-16777343-33802-27175-0000
      I0210 16:47:28.325883 27196 slave.cpp:948] Queuing task '0' for executor default of framework '201402101647-16777343-33802-27175-0000
      I0210 16:47:28.326372 27196 slave.cpp:466] Successfully attached file '/tmp/FaultToleranceTest_IgnoreKillTaskFromUnregisteredFramework_oWtPdM/slaves/201402101647-16777343-33802-27175-0/frameworks/201402101647-16777343-33802-27175-0000/executors/default/runs/645369af-2374-45e2-baea-ef722717f461'
      I0210 16:47:28.327816 27198 exec.cpp:178] Executor started at: executor(34)@127.0.0.1:33802 with pid 27175
      I0210 16:47:28.327880 27197 slave.cpp:1424] Got registration for executor 'default' of framework 201402101647-16777343-33802-27175-0000
      I0210 16:47:28.327960 27197 slave.cpp:1545] Flushing queued task 0 for executor 'default' of framework 201402101647-16777343-33802-27175-0000
      I0210 16:47:28.328025 27197 exec.cpp:202] Executor registered on slave 201402101647-16777343-33802-27175-0
      I0210 16:47:28.329026 27197 exec.cpp:214] Executor::registered took 16316ns
      I0210 16:47:28.329082 27197 exec.cpp:289] Executor asked to run task '0'
      I0210 16:47:28.329283 27198 hierarchical_allocator_process.hpp:547] Framework 201402101647-16777343-33802-27175-0000 left cpus:1; mem:512; disk:1024; ports:[31000-32000] unused on slave 201402101647-16777343-33802-27175-0
      I0210 16:47:28.329397 27198 hierarchical_allocator_process.hpp:590] Framework 201402101647-16777343-33802-27175-0000 filtered slave 201402101647-16777343-33802-27175-0 for 5secs
      I0210 16:47:28.329782 27197 exec.cpp:298] Executor::launchTask took 685284ns
      I0210 16:47:28.330807 27197 exec.cpp:521] Executor sending status update TASK_RUNNING (UUID: 774ad391-5824-436b-9996-55ccada6af81) for task 0 of framework 201402101647-16777343-33802-27175-0000
      I0210 16:47:28.330996 27197 slave.cpp:1758] Handling status update TASK_RUNNING (UUID: 774ad391-5824-436b-9996-55ccada6af81) for task 0 of framework 201402101647-16777343-33802-27175-0000 from executor(34)@127.0.0.1:33802
      I0210 16:47:28.331171 27197 status_update_manager.cpp:314] Received status update TASK_RUNNING (UUID: 774ad391-5824-436b-9996-55ccada6af81) for task 0 of framework 201402101647-16777343-33802-27175-0000
      I0210 16:47:28.331326 27197 status_update_manager.cpp:493] Creating StatusUpdate stream for task 0 of framework 201402101647-16777343-33802-27175-0000
      I0210 16:47:28.331468 27197 status_update_manager.cpp:367] Forwarding status update TASK_RUNNING (UUID: 774ad391-5824-436b-9996-55ccada6af81) for task 0 of framework 201402101647-16777343-33802-27175-0000 to master@127.0.0.1:33802
      I0210 16:47:28.331754 27197 master.cpp:2002] Status update TASK_RUNNING (UUID: 774ad391-5824-436b-9996-55ccada6af81) for task 0 of framework 201402101647-16777343-33802-27175-0000 from slave(110)@127.0.0.1:33802
      I0210 16:47:28.331867 27197 slave.cpp:1877] Status update manager successfully handled status update TASK_RUNNING (UUID: 774ad391-5824-436b-9996-55ccada6af81) for task 0 of framework 201402101647-16777343-33802-27175-0000
      I0210 16:47:28.332947 27197 slave.cpp:1883] Sending acknowledgement for status update TASK_RUNNING (UUID: 774ad391-5824-436b-9996-55ccada6af81) for task 0 of framework 201402101647-16777343-33802-27175-0000 to executor(34)@127.0.0.1:33802
      I0210 16:47:28.332767 27195 sched.cpp:210] New master detected at master@127.0.0.1:33802
      I0210 16:47:28.333220 27195 sched.cpp:261] Authenticating with master master@127.0.0.1:33802
      I0210 16:47:28.333292 27195 sched.cpp:230] Detecting new master
      I0210 16:47:28.333341 27195 authenticatee.hpp:124] Creating new client SASL connection
      I0210 16:47:28.333413 27195 master.cpp:2299] Authenticating framework at scheduler(92)@127.0.0.1:33802
      I0210 16:47:28.333487 27195 authenticator.hpp:140] Creating new server SASL connection
      I0210 16:47:28.333542 27195 authenticatee.hpp:212] Received SASL authentication mechanisms: CRAM-MD5
      I0210 16:47:28.333561 27195 authenticatee.hpp:238] Attempting to authenticate with mechanism 'CRAM-MD5'
      I0210 16:47:28.333644 27195 authenticator.hpp:243] Received SASL authentication start
      I0210 16:47:28.333689 27195 authenticator.hpp:325] Authentication requires more steps
      I0210 16:47:28.333717 27195 authenticatee.hpp:258] Received SASL authentication step
      I0210 16:47:28.333751 27195 authenticator.hpp:271] Received SASL authentication step
      I0210 16:47:28.333767 27195 auxprop.cpp:81] Request to lookup properties for user: 'test-principal' realm: 'localhost.localdomain' server FQDN: 'localhost.localdomain' SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false
      I0210 16:47:28.333776 27195 auxprop.cpp:153] Looking up auxiliary property '*userPassword'
      I0210 16:47:28.333787 27195 auxprop.cpp:153] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
      I0210 16:47:28.333797 27195 auxprop.cpp:81] Request to lookup properties for user: 'test-principal' realm: 'localhost.localdomain' server FQDN: 'localhost.localdomain' SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true
      I0210 16:47:28.333804 27195 auxprop.cpp:103] Skipping auxiliary property '*userPassword' since SASL_AUXPROP_AUTHZID == true
      I0210 16:47:28.333809 27195 auxprop.cpp:103] Skipping auxiliary property '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
      I0210 16:47:28.333822 27195 authenticator.hpp:317] Authentication success
      I0210 16:47:28.333847 27195 authenticatee.hpp:298] Authentication success
      I0210 16:47:28.333894 27195 master.cpp:2339] Successfully authenticated framework at scheduler(92)@127.0.0.1:33802
      I0210 16:47:28.333958 27195 sched.cpp:335] Successfully authenticated with master master@127.0.0.1:33802
      I0210 16:47:28.334008 27195 master.cpp:882] Re-registering framework 201402101647-16777343-33802-27175-0000 at scheduler(92)@127.0.0.1:33802
      I0210 16:47:28.334022 27195 master.cpp:908] Framework 201402101647-16777343-33802-27175-0000 failed over
      I0210 16:47:28.334070 27195 sched.cpp:384] Framework registered with 201402101647-16777343-33802-27175-0000
      I0210 16:47:28.334090 27195 sched.cpp:398] Scheduler::registered took 8071ns
      I0210 16:47:28.332135 27196 sched.cpp:609] Scheduler::statusUpdate took 17741ns
      I0210 16:47:28.334801 27196 master.cpp:1655] Asked to kill task 0 of framework 201402101647-16777343-33802-27175-0000
      W0210 16:47:28.334915 27196 master.cpp:1668] Ignoring kill task message for task 0 of framework 201402101647-16777343-33802-27175-0000 from 'scheduler(91)@127.0.0.1:33802' because it is not from the registered framework 'scheduler(92)@127.0.0.1:33802'
      I0210 16:47:28.336783 27200 exec.cpp:521] Executor sending status update TASK_FINISHED (UUID: 071c1ff5-e724-44f0-9098-d15c886ca333) for task 0 of framework 201402101647-16777343-33802-27175-0000
      I0210 16:47:28.337124 27197 slave.cpp:2091] Monitoring executor default of framework 201402101647-16777343-33802-27175-0000 forked at pid 27175
      I0210 16:47:28.337548 27197 slave.cpp:1305] Updating framework 201402101647-16777343-33802-27175-0000 pid to scheduler(92)@127.0.0.1:33802
      I0210 16:47:28.337986 27197 slave.cpp:1758] Handling status update TASK_FINISHED (UUID: 071c1ff5-e724-44f0-9098-d15c886ca333) for task 0 of framework 201402101647-16777343-33802-27175-0000 from executor(34)@127.0.0.1:33802
      I0210 16:47:28.338248 27197 slave.cpp:3207] Terminating task 0
      W0210 16:47:28.338119 27196 status_update_manager.cpp:176] Resending status update TASK_RUNNING (UUID: 774ad391-5824-436b-9996-55ccada6af81) for task 0 of framework 201402101647-16777343-33802-27175-0000
      I0210 16:47:28.338500 27196 status_update_manager.cpp:367] Forwarding status update TASK_RUNNING (UUID: 774ad391-5824-436b-9996-55ccada6af81) for task 0 of framework 201402101647-16777343-33802-27175-0000 to master@127.0.0.1:33802
      I0210 16:47:28.337339 27199 exec.cpp:335] Executor received status update acknowledgement 774ad391-5824-436b-9996-55ccada6af81 for task 0 of framework 201402101647-16777343-33802-27175-0000
      I0210 16:47:28.338857 27198 master.cpp:2002] Status update TASK_RUNNING (UUID: 774ad391-5824-436b-9996-55ccada6af81) for task 0 of framework 201402101647-16777343-33802-27175-0000 from slave(110)@127.0.0.1:33802
      I0210 16:47:28.338925 27198 sched.cpp:609] Scheduler::statusUpdate took 13726ns
      I0210 16:47:28.339025 27196 status_update_manager.cpp:392] Received status update acknowledgement (UUID: 774ad391-5824-436b-9996-55ccada6af81) for task 0 of framework 201402101647-16777343-33802-27175-0000
      tests/fault_tolerance_tests.cpp:1499: Failure
      Value of: status2.get().state()
      Actual: TASK_RUNNING
      Expected: TASK_FINISHED
      I0210 16:47:28.339249 27175 master.cpp:565] Master terminating
      I0210 16:47:28.339272 27175 master.cpp:210] Shutting down master
      I0210 16:47:28.339306 27175 master.hpp:421] Removing task 0 with resources cpus:1; mem:512 on slave 201402101647-16777343-33802-27175-0 (localhost.localdomain)
      I0210 16:47:28.339367 27175 master.cpp:253] Removing slave 201402101647-16777343-33802-27175-0 (localhost.localdomain)
      I0210 16:47:28.339643 27196 sched.cpp:720] Stopping framework '201402101647-16777343-33802-27175-0000'
      I0210 16:47:28.339690 27196 sched.cpp:720] Stopping framework '201402101647-16777343-33802-27175-0000'
      I0210 16:47:28.340121 27197 slave.cpp:1364] Status update manager successfully handled status update acknowledgement (UUID: 774ad391-5824-436b-9996-55ccada6af81) for task 0 of framework 201402101647-16777343-33802-27175-0000
      I0210 16:47:28.340422 27197 slave.cpp:1958] master@127.0.0.1:33802 exited
      W0210 16:47:28.340677 27197 slave.cpp:1961] Master disconnected! Waiting for a new master to be elected
      I0210 16:47:28.340220 27196 status_update_manager.cpp:314] Received status update TASK_FINISHED (UUID: 071c1ff5-e724-44f0-9098-d15c886ca333) for task 0 of framework 201402101647-16777343-33802-27175-0000
      I0210 16:47:28.340899 27196 status_update_manager.cpp:367] Forwarding status update TASK_FINISHED (UUID: 071c1ff5-e724-44f0-9098-d15c886ca333) for task 0 of framework 201402101647-16777343-33802-27175-0000 to master@127.0.0.1:33802
      I0210 16:47:28.340978 27196 status_update_manager.cpp:392] Received status update acknowledgement (UUID: 774ad391-5824-436b-9996-55ccada6af81) for task 0 of framework 201402101647-16777343-33802-27175-0000
      W0210 16:47:28.340996 27196 status_update_manager.hpp:259] Duplicate status update acknowledgment (UUID: 774ad391-5824-436b-9996-55ccada6af81) for update TASK_FINISHED (UUID: 071c1ff5-e724-44f0-9098-d15c886ca333) for task 0 of framework 201402101647-16777343-33802-27175-0000
      I0210 16:47:28.341030 27196 slave.cpp:1877] Status update manager successfully handled status update TASK_FINISHED (UUID: 071c1ff5-e724-44f0-9098-d15c886ca333) for task 0 of framework 201402101647-16777343-33802-27175-0000
      I0210 16:47:28.341042 27196 slave.cpp:1883] Sending acknowledgement for status update TASK_FINISHED (UUID: 071c1ff5-e724-44f0-9098-d15c886ca333) for task 0 of framework 201402101647-16777343-33802-27175-0000 to executor(34)@127.0.0.1:33802
      E0210 16:47:28.341071 27196 slave.cpp:1357] Failed to handle status update acknowledgement (UUID: 774ad391-5824-436b-9996-55ccada6af81) for task 0 of framework 201402101647-16777343-33802-27175-0000: Duplicate acknowledgement
      I0210 16:47:28.341099 27196 exec.cpp:335] Executor received status update acknowledgement 071c1ff5-e724-44f0-9098-d15c886ca333 for task 0 of framework 201402101647-16777343-33802-27175-0000
      I0210 16:47:28.355067 27175 slave.cpp:391] Slave terminating
      I0210 16:47:28.355098 27175 slave.cpp:1135] Asked to shut down framework 201402101647-16777343-33802-27175-0000 by @0.0.0.0:0
      I0210 16:47:28.355110 27175 slave.cpp:1160] Shutting down framework 201402101647-16777343-33802-27175-0000
      I0210 16:47:28.355124 27175 slave.cpp:2424] Shutting down executor 'default' of framework 201402101647-16777343-33802-27175-0000
      I0210 16:47:28.355348 27175 exec.cpp:375] Executor asked to shutdown
      I0210 16:47:28.355371 27175 exec.cpp:390] Executor::shutdown took 8016ns

      Attachments

        Activity

          People

            vinodkone Vinod Kone
            vinodkone Vinod Kone
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: