Mesos / MESOS-502

Slave crashes when handling duplicate terminal updates

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.13.0
    • Fix Version/s: 0.13.0
    • Component/s: None
    • Labels: None

    Description

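      Saw this in production at Twitter, where we allow duplicate terminal status updates. The slave handles and acknowledges the first TASK_FINISHED update, then hits a fatal CHECK when a second TASK_FINISHED (with a different UUID) arrives for the same task: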

      I0611 04:45:00.304193 11094 slave.cpp:1740] Handling status update TASK_FINISHED (UUID: f5a9b568-2a4e-4c73-bd2d-24a3e209fc46) for task 1370925835073-mesos-meta_slave_32-30-447c967b-5202-4ad2-9e68-3dcc5c45f38a of framework 201103282247-0000000019-0000
      I0611 04:45:00.304843 11094 status_update_manager.cpp:290] Received status update TASK_FINISHED (UUID: f5a9b568-2a4e-4c73-bd2d-24a3e209fc46) for task 1370925835073-mesos-meta_slave_32-30-447c967b-5202-4ad2-9e68-3dcc5c45f38a of framework 201103282247-0000000019-0000 with checkpoint=false
      I0611 04:45:00.304852 11099 cgroups_isolator.cpp:656] Changing cgroup controls for executor thermos-1370925835073-mesos-meta_slave_32-30-447c967b-5202-4ad2-9e68-3dcc5c45f38a of framework 201103282247-0000000019-0000 with resources cpus=0.25; mem=128; disk=0
      I0611 04:45:00.305250 11094 status_update_manager.cpp:336] Forwarding status update TASK_FINISHED (UUID: f5a9b568-2a4e-4c73-bd2d-24a3e209fc46) for task 1370925835073-mesos-meta_slave_32-30-447c967b-5202-4ad2-9e68-3dcc5c45f38a of framework 201103282247-0000000019-0000 to master@10.34.135.131:5050
      I0611 04:45:00.306172 11099 cgroups_isolator.cpp:853] Updated 'cpu.shares' to 255 for executor thermos-1370925835073-mesos-meta_slave_32-30-447c967b-5202-4ad2-9e68-3dcc5c45f38a of framework 201103282247-0000000019-0000
      I0611 04:45:00.307164 11099 cgroups_isolator.cpp:991] Updated 'memory.soft_limit_in_bytes' to 134217728 for executor thermos-1370925835073-mesos-meta_slave_32-30-447c967b-5202-4ad2-9e68-3dcc5c45f38a of framework 201103282247-0000000019-0000
      I0611 04:45:00.307320 11087 slave.cpp:1796] Status update manager successfully handled status update TASK_FINISHED (UUID: f5a9b568-2a4e-4c73-bd2d-24a3e209fc46) for task 1370925835073-mesos-meta_slave_32-30-447c967b-5202-4ad2-9e68-3dcc5c45f38a of framework 201103282247-0000000019-0000
      I0611 04:45:00.307601 11087 slave.cpp:1802] Sending acknowledgement for status update TASK_FINISHED (UUID: f5a9b568-2a4e-4c73-bd2d-24a3e209fc46) for task 1370925835073-mesos-meta_slave_32-30-447c967b-5202-4ad2-9e68-3dcc5c45f38a of framework 201103282247-0000000019-0000 to executor(1)@10.34.20.131:38573
      I0611 04:45:00.366597 11088 slave.cpp:1740] Handling status update TASK_FINISHED (UUID: 3bd6cbd7-b39c-4b83-81b5-b83c50fa4327) for task 1370925835073-mesos-meta_slave_32-30-447c967b-5202-4ad2-9e68-3dcc5c45f38a of framework 201103282247-0000000019-0000
      F0611 04:45:00.367133 11088 slave.cpp:2964] Check failed: 'task' Must be non NULL

      *** Check failure stack trace: ***
            @ 0x7f2f3ae09ddd google::LogMessage::Fail()
            @ 0x7f2f3ae0fa47 google::LogMessage::SendToLog()
            @ 0x7f2f3ae0b68c google::LogMessage::Flush()
            @ 0x7f2f3ae0b8f6 google::LogMessageFatal::~LogMessageFatal()
            @ 0x7f2f3aa5641d google::CheckNotNull<>()
            @ 0x7f2f3aad38b3 mesos::internal::slave::Executor::terminateTask()
            @ 0x7f2f3aaf425b mesos::internal::slave::Slave::statusUpdate()
            @ 0x7f2f3ab206fd ProtobufProcess<>::handler1<>()
            @ 0x7f2f3aafad8a std::tr1::_Function_handler<>::_M_invoke()
            @ 0x7f2f3ab2246b ProtobufProcess<>::visit()
            @ 0x7f2f3ad02ae5 process::ProcessManager::resume()
            @ 0x7f2f3ad0349f process::schedule()
            @ 0x7f2f3a4b773d start_thread
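
      The trace shows the check failing in CheckNotNull, called from Executor::terminateTask(), which is reached from Slave::statusUpdate(). That suggests the second terminal update looked up a task the first update had already retired. Below is a minimal sketch of that failure mode and one way to tolerate it. The Executor, Task, and TaskState types here are simplified, hypothetical stand-ins and not the real Mesos classes, and this is not necessarily how the issue was actually fixed.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified stand-ins; NOT the real Mesos types.
enum class TaskState { RUNNING, FINISHED, FAILED, KILLED, LOST };

inline bool isTerminal(TaskState state) {
  return state != TaskState::RUNNING;
}

struct Task {
  std::string id;
  TaskState state;
};

class Executor {
public:
  void launchTask(const std::string& taskId) {
    launched_[taskId] = Task{taskId, TaskState::RUNNING};
  }

  // Moves a task from the launched set to the terminated set.
  // Returns false (instead of aborting the process) when the task is
  // not in the launched set, e.g. on a duplicate terminal update.
  bool terminateTask(const std::string& taskId, TaskState state) {
    assert(isTerminal(state));  // Callers must pass a terminal state.

    auto it = launched_.find(taskId);
    if (it == launched_.end()) {
      // A CHECK_NOTNULL-style assertion here would crash on duplicates;
      // reporting "nothing to do" lets the caller carry on.
      return false;
    }

    Task task = it->second;
    task.state = state;
    launched_.erase(it);
    terminated_[taskId] = task;
    return true;
  }

  bool isLaunched(const std::string& taskId) const {
    return launched_.count(taskId) > 0;
  }

  bool isTerminated(const std::string& taskId) const {
    return terminated_.count(taskId) > 0;
  }

private:
  std::map<std::string, Task> launched_;
  std::map<std::string, Task> terminated_;
};
```

      With this sketch, the first terminateTask("task-1", TaskState::FINISHED) call returns true and the duplicate returns false, so the caller can log and ignore the duplicate (or still forward it) rather than bring down the whole process.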

          People

            Assignee: Vinod Kone (vinodkone)
            Reporter: Vinod Kone (vinodkone)