[MESOS-1845] CommandInfo tasks may fail when scheduled after another task with the same id has finished. - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: None
Labels: None

Description

I created a small test framework to experiment with scheduling tasks where one task relies on the results of another, previously run task. In my test framework I first schedule a task that appends the string "foo" to a file, and after that task finishes I schedule a task that appends "bar" to the same file.

This worked well when using ExecutorInfo, but when I switched to CommandInfo (specifying commands like 'echo foo >> /share/foobar.txt' in set_value()), the second step, appending "bar", failed most of the time. Very rarely it would succeed.

I couldn't find any meaningful log messages indicating what exactly went wrong. The slave log indicated that the task's status changed to TASK_FAILED and that the status update was sent correctly. The stdout log in the sandbox indicated that the command 'exited with status 0'.

I could work around the issue by specifying task ids that were always unique. Previously, the follow-up task that appends "bar" reused the id of the already finished task that appended "foo".
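One way to get ids that are always unique, as in the workaround above, is to append a counter to a base name. The sketch below is only an illustration: `makeUniqueTaskId` is a hypothetical helper, not part of the Mesos API and not taken from the linked framework source, and it only guarantees uniqueness within a single scheduler process.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of the Mesos API): returns "<base>-<n>",
// where n increases on every call within this process, so no two calls
// return the same id.
std::string makeUniqueTaskId(const std::string& base)
{
  assert(!base.empty());  // an empty base would give ids like "-0"

  // Shared across all calls; atomic so concurrent callers get distinct values.
  static std::atomic<std::uint64_t> counter(0);
  return base + "-" + std::to_string(counter.fetch_add(1));
}
```

Assuming a `mesos::TaskInfo` named `task`, the result would be passed where the framework currently sets the id, for example `task.mutable_task_id()->set_value(makeUniqueTaskId("foobar"))`. Ids generated this way repeat after a scheduler restart, so a longer-lived framework would need a different source of uniqueness.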

It seems to me that something might go wrong when very short-running tasks with the same id are scheduled in quick succession.

Source code for my foobar framework:
http://paste.ubuntu.com/8459083

Build with:
g++ -std=c++0x -g -Wall foobar_framework.cpp -I. -L/usr/local/lib -lmesos -o foobar-framework

People

Assignee: Unassigned

Reporter: Andreas Raster

Votes: 0

Watchers: 3

Dates

Created: 30/Sep/14 12:09

Updated: 15/Sep/15 08:49