Mesos / MESOS-1845

CommandInfo tasks may fail when scheduled after another task with the same id has finished.

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved

    Description

      I created a small test framework to experiment with scheduling tasks where one task relies on the results of another, previously run task. The framework first schedules a task that appends the string "foo" to a file; after that task finishes, it schedules a second task that appends "bar" to the same file.

      This worked well when using ExecutorInfo. When I switched to CommandInfo instead (specifying commands like 'echo foo >> /share/foobar.txt' in set_value()), the second step, appending "bar", failed most of the time. Occasionally, but very rarely, it worked.

      I couldn't find any meaningful log messages indicating what exactly went wrong. The slave log indicated that the task's status changed to TASK_FAILED and that the status update was sent correctly. The stdout log in the sandbox indicated that the command 'exited with status 0'.

      I could work around the issue by always specifying unique task ids. Previously, the follow-up task that appended "bar" reused the id of the task that appended "foo", after that first task had finished.
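
      The workaround above could be sketched roughly as follows. This is not taken from the linked framework source; TaskIdGenerator is a hypothetical helper, and the ids it produces are only unique within a single scheduler process (the counter restarts at 0 when the framework restarts).

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical helper, not part of the original foobar framework.
// Produces ids of the form "<prefix>-<n>", where n increases by one on
// every call, so no id is handed out twice by the same generator.
class TaskIdGenerator {
public:
  explicit TaskIdGenerator(std::string prefix)
    : prefix_(std::move(prefix)), next_(0) {}

  // Returns a fresh id and advances the counter.
  std::string next() {
    return prefix_ + "-" + std::to_string(next_++);
  }

private:
  std::string prefix_;
  std::uint64_t next_;
};
```

      In a scheduler, each new TaskInfo would then get its id from the generator, for example task.mutable_task_id()->set_value(generator.next()), instead of reusing the id of the finished "foo" task for the "bar" task.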

      It seems to me that something goes wrong when very short-running tasks with the same id are scheduled in quick succession.

      Source code for my foobar framework:
      http://paste.ubuntu.com/8459083

      Build with:
      g++ -std=c++0x -g -Wall foobar_framework.cpp -I. -L/usr/local/lib -lmesos -o foobar-framework

          People

            Assignee: Unassigned
            Reporter: Andreas Raster (lazor)
            Votes: 0
            Watchers: 3
