Mesos / MESOS-434

Process isolator libprocess throws exception

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.14.0
    • Component/s: None
    • Labels: None

    Description

      This occurred during one of the slave recovery tests that calls slave shutdown.

      The process isolator terminated with the following error:

      libprocess: process-isolator(379)@10.37.184.103:37325 terminating due to basic_filebuf::underflow error reading the file
      

      Relevant test output, for posterity:

      I0413 20:35:29.195838 19301 exec.cpp:321] Executor asked to shutdown
      I0413 20:35:29.195864 54312 status_update_manager.cpp:359] Received status update acknowledgement for task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 201304132035-1740121354-37325-54287-0000
      I0413 20:35:29.195948 54312 status_update_manager.hpp:298] Checkpointing ACK for status update TASK_RUNNING from task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 201304132035-1740121354-37325-54287-0000
      I0413 20:35:29.195983 19311 exec.cpp:75] Scheduling shutdown of the executor
      Waited on process 19319, returned status 15
      I0413 20:35:29.196202 19312 exec.cpp:382] Executor sending status update for task 055a6671-8348-4b72-9fde-1c7ff667fa5c in state TASK_FAILED
      I0413 20:35:29.196606 54312 status_update_manager.hpp:329] Handling ACK for status update TASK_RUNNING from task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 201304132035-1740121354-37325-54287-0000
      I0413 20:35:29.196769 54313 slave.cpp:1093] Status update manager successfully handled status update acknowledgement for task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 201304132035-1740121354-37325-54287-0000
      I0413 20:35:29.198374 54291 slave.cpp:1433] Handling status update TASK_FAILED from task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 201304132035-1740121354-37325-54287-0000
      I0413 20:35:29.198786 54312 status_update_manager.cpp:288] Received status update TASK_FAILED from task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 201304132035-1740121354-37325-54287-0000 with checkpoint=true
      I0413 20:35:29.198886 54312 status_update_manager.hpp:298] Checkpointing UPDATE for status update TASK_FAILED from task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 201304132035-1740121354-37325-54287-0000
      I0413 20:35:29.199542 54312 status_update_manager.hpp:329] Handling UPDATE for status update TASK_FAILED from task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 201304132035-1740121354-37325-54287-0000
      I0413 20:35:29.199611 54312 status_update_manager.cpp:334] Forwarding status update TASK_FAILED from task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 201304132035-1740121354-37325-54287-0000 to the master at master@10.37.184.103:37325
      I0413 20:35:29.199827 54298 master.cpp:1086] Status update from (4588)@10.37.184.103:37325: task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 201304132035-1740121354-37325-54287-0000 is now in state TASK_FAILED
      I0413 20:35:29.199836 54291 slave.cpp:1494] Sending ACK for status update TASK_FAILED from task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 201304132035-1740121354-37325-54287-0000 to executor executor(1)@10.37.184.103:42359
      I0413 20:35:29.200048 54306 sched.cpp:327] Received status update TASK_FAILED from task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 201304132035-1740121354-37325-54287-0000 from slave(740)@10.37.184.103:37325
      I0413 20:35:29.200098 54298 master.hpp:300] Removing task with resources cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on slave 201304132035-1740121354-37325-54287-0
      I0413 20:35:29.200218 54306 sched.cpp:360] Sending ACK for status update TASK_FAILED from task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 201304132035-1740121354-37325-54287-0000 to slave(740)@10.37.184.103:37325
      I0413 20:35:29.200290 19318 process.cpp:878] Socket closed while receiving
      I0413 20:35:29.200345 19302 exec.cpp:283] Ignoring ACK for status update of task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 201304132035-1740121354-37325-54287-0000 because the driver is aborted!
      I0413 20:35:29.200369 54293 slave.cpp:1056] Got acknowledgement of status update for task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 201304132035-1740121354-37325-54287-0000
      I0413 20:35:29.200515 54292 hierarchical_allocator_process.hpp:544] Recovered cpus=2; mem=1024; ports=[31000-32000]; disk=1024 (total allocatable: cpus=2; mem=1024; ports=[31000-32000]; disk=1024) on slave 201304132035-1740121354-37325-54287-0 from framework 201304132035-1740121354-37325-54287-0000
      I0413 20:35:29.200598 54310 status_update_manager.cpp:359] Received status update acknowledgement for task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 201304132035-1740121354-37325-54287-0000
      I0413 20:35:29.200690 54310 status_update_manager.hpp:298] Checkpointing ACK for status update TASK_FAILED from task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 201304132035-1740121354-37325-54287-0000
      I0413 20:35:29.201251 54310 status_update_manager.hpp:329] Handling ACK for status update TASK_FAILED from task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 201304132035-1740121354-37325-54287-0000
      I0413 20:35:29.201344 54310 status_update_manager.cpp:481] Cleaning up status update stream for task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 201304132035-1740121354-37325-54287-0000
      I0413 20:35:29.201529 54310 slave.cpp:1093] Status update manager successfully handled status update acknowledgement for task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 201304132035-1740121354-37325-54287-0000
      I0413 20:35:29.205672 54292 hierarchical_allocator_process.hpp:660] Found available resources: cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on slave 201304132035-1740121354-37325-54287-0
      I0413 20:35:29.205793 54292 hierarchical_allocator_process.hpp:686] Offering cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on slave 201304132035-1740121354-37325-54287-0 to framework 201304132035-1740121354-37325-54287-0000
      I0413 20:35:29.206107 54292 hierarchical_allocator_process.hpp:599] Performed allocation for 1 slaves in 478.92us
      I0413 20:35:29.206265 54293 master.hpp:309] Adding offer with resources cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on slave 201304132035-1740121354-37325-54287-0
      I0413 20:35:29.206428 54293 master.cpp:1327] Sending 1 offers to framework 201304132035-1740121354-37325-54287-0000
      I0413 20:35:29.206624 54293 sched.cpp:282] Received 1 offers
      I0413 20:35:29.215816 54292 hierarchical_allocator_process.hpp:668] No resources available to allocate!
      I0413 20:35:29.215885 54292 hierarchical_allocator_process.hpp:599] Performed allocation for 1 slaves in 106.44us
      I0413 20:35:29.226014 54301 hierarchical_allocator_process.hpp:668] No resources available to allocate!
      I0413 20:35:29.226085 54301 hierarchical_allocator_process.hpp:599] Performed allocation for 1 slaves in 103.05us
      W0413 20:35:29.236067 54298 master.cpp:81] No whitelist given. Advertising offers for all slaves
      I0413 20:35:29.236207 54293 hierarchical_allocator_process.hpp:668] No resources available to allocate!
      I0413 20:35:29.236299 54293 hierarchical_allocator_process.hpp:599] Performed allocation for 1 slaves in 146.62us
      I0413 20:35:29.236767 54312 monitor.cpp:206] Publishing resource usage for executor '055a6671-8348-4b72-9fde-1c7ff667fa5c' of framework '201304132035-1740121354-37325-54287-0000'
      I0413 20:35:29.246213 54309 hierarchical_allocator_process.hpp:668] No resources available to allocate!
      I0413 20:35:29.246271 54309 hierarchical_allocator_process.hpp:599] Performed allocation for 1 slaves in 87.00us
      I0413 20:35:29.256467 54297 hierarchical_allocator_process.hpp:668] No resources available to allocate!
      I0413 20:35:29.256556 54297 hierarchical_allocator_process.hpp:599] Performed allocation for 1 slaves in 129.68us
      ..................
      ..................
      W0413 20:35:30.149065 54296 master.cpp:81] No whitelist given. Advertising offers for all slaves
      I0413 20:35:30.149341 54310 hierarchical_allocator_process.hpp:668] No resources available to allocate!
      I0413 20:35:30.149401 54310 hierarchical_allocator_process.hpp:599] Performed allocation for 1 slaves in 102.93us
      I0413 20:35:30.149641 54296 monitor.cpp:206] Publishing resource usage for executor '055a6671-8348-4b72-9fde-1c7ff667fa5c' of framework '201304132035-1740121354-37325-54287-0000'
      I0413 20:35:30.159216 54311 hierarchical_allocator_process.hpp:668] No resources available to allocate!
      I0413 20:35:30.159286 54311 hierarchical_allocator_process.hpp:599] Performed allocation for 1 slaves in 120.69us
      I0413 20:35:30.169424 54304 hierarchical_allocator_process.hpp:668] No resources available to allocate!
      I0413 20:35:30.169503 54304 hierarchical_allocator_process.hpp:599] Performed allocation for 1 slaves in 119.82us
      I0413 20:35:30.179558 54301 hierarchical_allocator_process.hpp:668] No resources available to allocate!
      I0413 20:35:30.179635 54301 hierarchical_allocator_process.hpp:599] Performed allocation for 1 slaves in 106.45us
      I0413 20:35:30.189718 54297 hierarchical_allocator_process.hpp:668] No resources available to allocate!
      I0413 20:35:30.189789 54297 hierarchical_allocator_process.hpp:599] Performed allocation for 1 slaves in 106.73us
      I0413 20:35:30.197923 54316 process.cpp:878] Socket closed while receiving
      W0413 20:35:30.199760 54299 master.cpp:81] No whitelist given. Advertising offers for all slaves
      I0413 20:35:30.199870 54311 hierarchical_allocator_process.hpp:668] No resources available to allocate!
      I0413 20:35:30.199952 54311 hierarchical_allocator_process.hpp:599] Performed allocation for 1 slaves in 121.93us
      libprocess: process-isolator(379)@10.37.184.103:37325 terminating due to basic_filebuf::underflow error reading the file
      W0413 20:35:30.200603 54310 monitor.cpp:212] Failed to collect resource usage for executor '055a6671-8348-4b72-9fde-1c7ff667fa5c' of framework '201304132035-1740121354-37325-54287-0000': 0
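
The text "basic_filebuf::underflow error reading the file" in the libprocess message is the description libstdc++ attaches to the `std::ios_base::failure` it raises when a low-level read on a file buffer fails. The ticket does not record which file was being read or how the bug was fixed, so the following is only a minimal sketch of one defensive pattern, not the actual Mesos change. It assumes the stream keeps its default exception mask, in which case the stream sets `badbit` on a failed read instead of letting the exception escape. `readFile` is a hypothetical helper name, not a Mesos or libprocess API.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper (not part of Mesos or libprocess): reads a whole file
// and reports failures through the return value, so that a failed read does
// not propagate as an exception into the calling process.
bool readFile(
    const std::string& path,
    std::string* contents,
    std::string* error)
{
  // The default exception mask is kept, so read errors surface as badbit.
  std::ifstream file(path.c_str(), std::ios::in | std::ios::binary);
  if (!file.is_open()) {
    *error = "Failed to open '" + path + "'";
    return false;
  }

  contents->clear();

  // Read in fixed-size chunks; the final, partial chunk is picked up through
  // gcount() even though read() itself reports failure at end of file.
  char chunk[4096];
  while (file.read(chunk, sizeof(chunk)) || file.gcount() > 0) {
    contents->append(chunk, static_cast<std::size_t>(file.gcount()));
  }

  // badbit indicates a low-level read error (as opposed to a plain EOF).
  if (file.bad()) {
    *error = "Failed to read '" + path + "'";
    return false;
  }

  return true;
}
```

A caller that polls files which may disappear or become unreadable at any time (for example, while an executor is shutting down) could then log the returned error and carry on rather than terminate.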
      

            People

              xujyan Yan Xu
              vinodkone Vinod Kone