Mesos / MESOS-9022

Race condition in task updates could cause missing event in streaming


Details

    • Important

    Description

      The master sends a TASK_STARTING update event when the task's latest state is already TASK_FAILED. As a result, when it then handles the TASK_FAILED update, sendSubscribersUpdate is set to false, and the subscriber never receives a TASK_FAILED update event.

      This happened when a task failed very quickly. Is there a race condition in how task updates are handled?

      master log:

      I0622 13:08:29.189771 84079 master.cpp:8345] Status update TASK_STARTING (Status UUID: eb091093-d303-4e82-b69f-e2ba1011ba76) for task f839055c-7a40-4e6c-9f53-22030f388c8c of framework 4591ea8b-4adb-4acf-bb29-b70817663c4e-0000 from agent d2f1c7c2-668d-46e5-829b-ce614cca79ae-S1587
       I0622 13:08:29.189801 84079 master.cpp:8402] Forwarding status update TASK_STARTING (Status UUID: eb091093-d303-4e82-b69f-e2ba1011ba76) for task f839055c-7a40-4e6c-9f53-22030f388c8c of framework 4591ea8b-4adb-4acf-bb29-b70817663c4e-0000
       I0622 13:08:29.190004 84079 master.cpp:10843] Updating the state of task f839055c-7a40-4e6c-9f53-22030f388c8c of framework 4591ea8b-4adb-4acf-bb29-b70817663c4e-0000 (latest state: TASK_STARTING, status update state: TASK_STARTING)
       I0622 13:08:29.603857 84079 master.cpp:6195] Processing ACKNOWLEDGE call for status eb091093-d303-4e82-b69f-e2ba1011ba76 for task f839055c-7a40-4e6c-9f53-22030f388c8c of framework 4591ea8b-4adb-4acf-bb29-b70817663c4e-0000 (Aurora) on agent d2f1c7c2-668d-46e5-829b-ce614cca79ae-S1587
       I0622 13:08:29.615643 84079 master.cpp:8345] Status update TASK_STARTING (Status UUID: eb091093-d303-4e82-b69f-e2ba1011ba76) for task f839055c-7a40-4e6c-9f53-22030f388c8c of framework 4591ea8b-4adb-4acf-bb29-b70817663c4e-0000 from agent d2f1c7c2-668d-46e5-829b-ce614cca79ae-S1587
       I0622 13:08:29.615669 84079 master.cpp:8402] Forwarding status update TASK_STARTING (Status UUID: eb091093-d303-4e82-b69f-e2ba1011ba76) for task f839055c-7a40-4e6c-9f53-22030f388c8c of framework 4591ea8b-4adb-4acf-bb29-b70817663c4e-0000
       I0622 13:08:29.615783 84079 master.cpp:10843] Updating the state of task f839055c-7a40-4e6c-9f53-22030f388c8c of framework 4591ea8b-4adb-4acf-bb29-b70817663c4e-0000 (latest state: TASK_FAILED, status update state: TASK_STARTING)
       I0622 13:08:29.620837 84079 master.cpp:8345] Status update TASK_FAILED (Status UUID: ac34f1e9-eaa4-4765-82ac-7398c2e6c835) for task f839055c-7a40-4e6c-9f53-22030f388c8c of framework 4591ea8b-4adb-4acf-bb29-b70817663c4e-0000 from agent d2f1c7c2-668d-46e5-829b-ce614cca79ae-S1587
       I0622 13:08:29.620853 84079 master.cpp:8402] Forwarding status update TASK_FAILED (Status UUID: ac34f1e9-eaa4-4765-82ac-7398c2e6c835) for task f839055c-7a40-4e6c-9f53-22030f388c8c of framework 4591ea8b-4adb-4acf-bb29-b70817663c4e-0000
       I0622 13:08:29.620923 84079 master.cpp:10843] Updating the state of task f839055c-7a40-4e6c-9f53-22030f388c8c of framework 4591ea8b-4adb-4acf-bb29-b70817663c4e-0000 (latest state: TASK_FAILED, status update state: TASK_FAILED)
       I0622 13:08:29.630455 84079 master.cpp:6195] Processing ACKNOWLEDGE call for status eb091093-d303-4e82-b69f-e2ba1011ba76 for task f839055c-7a40-4e6c-9f53-22030f388c8c of framework 4591ea8b-4adb-4acf-bb29-b70817663c4e-0000 (Aurora) on agent d2f1c7c2-668d-46e5-829b-ce614cca79ae-S1587
       I0622 13:08:29.673051 84095 master.cpp:6195] Processing ACKNOWLEDGE call for status ac34f1e9-eaa4-4765-82ac-7398c2e6c835 for task f839055c-7a40-4e6c-9f53-22030f388c8c of framework 4591ea8b-4adb-4acf-bb29-b70817663c4e-0000 (Aurora) on agent d2f1c7c2-668d-46e5-829b-ce614cca79ae-S1587
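
The sequence in the log above can be replayed with a small model. This is only a sketch of the behavior as described in this report, not the actual Mesos master code: the class `MasterModel`, its method `update_task`, the initial `TASK_STAGING` state, and the rule "notify only when the stored state changes" are all assumptions made for illustration. The three (status update state, latest state) pairs are taken from the "Updating the state of task" log lines.

```python
# Hypothetical, simplified model of the behavior described in this report.
# It is NOT the Mesos implementation; names and rules are assumptions.

# (status update state, latest state) pairs from the "Updating the state
# of task" lines in the master log.
LOG_SEQUENCE = [
    ("TASK_STARTING", "TASK_STARTING"),
    ("TASK_STARTING", "TASK_FAILED"),  # retried update; task already failed
    ("TASK_FAILED", "TASK_FAILED"),
]


class MasterModel:
    def __init__(self):
        # Assumed initial state before the first update arrives.
        self.task_state = "TASK_STAGING"
        # Events that a streaming subscriber would receive.
        self.events = []

    def update_task(self, status_state, latest_state):
        # Assumption: the stored state follows the latest state, and
        # subscribers are notified only when the stored state changes.
        send_subscribers_update = latest_state != self.task_state
        self.task_state = latest_state
        if send_subscribers_update:
            # The event carries the state of the status update being
            # handled, not the latest state.
            self.events.append(status_state)


def replay(sequence):
    master = MasterModel()
    for status_state, latest_state in sequence:
        master.update_task(status_state, latest_state)
    return master


if __name__ == "__main__":
    master = replay(LOG_SEQUENCE)
    print(master.task_state)  # TASK_FAILED
    print(master.events)      # ['TASK_STARTING', 'TASK_STARTING']
```

In this model the second TASK_STARTING update moves the stored state to TASK_FAILED and emits a TASK_STARTING event, so the later TASK_FAILED update sees no state change and emits nothing. Whether the real master follows exactly this rule would need to be confirmed against master.cpp.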

       

            People

              bennoe Benno Evers
              evelynl Evelyn Liu
              Zhitao Li Zhitao Li