Hadoop Map/Reduce
MAPREDUCE-5863

Killing task attempts while speculation is enabled can cause the job to fail

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.4.0
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      A race condition can occur when the speculator fires a T_ADD_SPEC_ATTEMPT event and, before that event is delivered, the task succeeds and is then killed by the client. The task state changes from SUCCEEDED back to SCHEDULED, and the task then receives the T_ADD_SPEC_ATTEMPT event, which is invalid in the SCHEDULED state.

      1. The task is running.
      2. The speculator fires a T_ADD_SPEC_ATTEMPT event.
      3. Before the task receives T_ADD_SPEC_ATTEMPT, it succeeds.
      4. The succeeded task attempt (TA) receives TA_KILL from the client. The task is now in the SCHEDULED state.
      5. The task receives T_ADD_SPEC_ATTEMPT. Because this event is unexpected in the SCHEDULED state, the job fails.
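
The ordering problem in the steps above can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: it is not the Hadoop task state machine. The class `SpecRaceSketch`, the methods `next` and `replay`, and the three-state transition table are hypothetical and heavily simplified. The task-level event names other than T_ADD_SPEC_ATTEMPT are assumed for illustration. Any transition not listed is treated as invalid, which stands in for the job failure described in step 5.

```java
// Hypothetical, simplified model of the race; not Hadoop's actual implementation.
final class SpecRaceSketch {
    enum State { RUNNING, SUCCEEDED, SCHEDULED }

    // T_ADD_SPEC_ATTEMPT comes from the report; the other names are assumed here.
    enum Event { T_ADD_SPEC_ATTEMPT, T_ATTEMPT_SUCCEEDED, T_ATTEMPT_KILLED, T_ATTEMPT_FAILED }

    // Returns the next state, or throws if the event is not valid in this state.
    static State next(State state, Event event) {
        switch (state) {
            case RUNNING:
                if (event == Event.T_ADD_SPEC_ATTEMPT) return State.RUNNING;
                if (event == Event.T_ATTEMPT_SUCCEEDED) return State.SUCCEEDED;
                break;
            case SUCCEEDED:
                // Killing (or failing) the succeeded attempt sends the task back to SCHEDULED.
                if (event == Event.T_ATTEMPT_KILLED || event == Event.T_ATTEMPT_FAILED) {
                    return State.SCHEDULED;
                }
                break;
            default:
                // In this model SCHEDULED accepts none of the events above.
                break;
        }
        throw new IllegalStateException("Invalid event " + event + " in state " + state);
    }

    // Applies the events in delivery order, starting from a running task.
    static State replay(Event... events) {
        State state = State.RUNNING;
        for (Event event : events) {
            state = next(state, event);
        }
        return state;
    }

    public static void main(String[] args) {
        // Speculation event delivered while the task is still running: accepted.
        System.out.println(replay(Event.T_ADD_SPEC_ATTEMPT,
                                  Event.T_ATTEMPT_SUCCEEDED,
                                  Event.T_ATTEMPT_KILLED));
        // Speculation event delayed until after success and kill: rejected.
        try {
            replay(Event.T_ATTEMPT_SUCCEEDED,
                   Event.T_ATTEMPT_KILLED,
                   Event.T_ADD_SPEC_ATTEMPT);
        } catch (IllegalStateException e) {
            System.out.println("Job would fail: " + e.getMessage());
        }
    }
}
```

In this model the same three events are harmless in one delivery order and fatal in the other, which is the essence of the reported race.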

        Activity

        Mingzhe Hao added a comment -

        Instead of being killed, the succeeded TA can also fail, which likewise moves the task into the SCHEDULED state.


          People

          • Assignee:
            Unassigned
            Reporter:
            Mingzhe Hao
          • Votes:
            0
            Watchers:
            3

            Dates

            • Created:
              Updated:
