  1. Hadoop YARN
  2. YARN-3464

Race condition in LocalizerRunner kills localizer before localizing all resources

    Details

    • Hadoop Flags:
      Reviewed

      Description

      A race condition in LocalizerRunner causes container localization to time out.
      Currently, LocalizerRunner kills the ContainerLocalizer when the pending list of LocalizerResourceRequestEvent is empty.

            } else if (pending.isEmpty()) {
              action = LocalizerAction.DIE;
            }
      

      If a LocalizerResourceRequestEvent is added after LocalizerRunner has killed the ContainerLocalizer because the pending list was empty, that LocalizerResourceRequestEvent will never be handled.
      Without a ContainerLocalizer, LocalizerRunner#update will never be called.
      The container stays in the LOCALIZING state until the AM kills it due to TASK_TIMEOUT.
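
The sequence described above can be reproduced in a small, single-threaded model. This is only a sketch under stated assumptions, not the NodeManager code: the class LocalizerRaceSketch, its nested Runner class, the addResource and containerState methods, and the resource names are all hypothetical stand-ins. Only the "empty pending list means DIE" decision is taken from the snippet in the description.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical, simplified model of the race; not the actual NodeManager classes. */
public class LocalizerRaceSketch {
    enum LocalizerAction { LIVE, DIE }
    enum ContainerState { LOCALIZING, LOCALIZED }

    /** Stand-in for LocalizerRunner: owns the queue of pending resource requests. */
    static class Runner {
        private final Queue<String> pending = new ArrayDeque<>();
        private boolean localizerAlive = true;
        private int requested = 0;
        private int localized = 0;

        /** Models a LocalizerResourceRequestEvent arriving for this container. */
        void addResource(String resource) {
            requested++;
            pending.add(resource);
        }

        /** Models one heartbeat from the ContainerLocalizer process. */
        LocalizerAction update() {
            if (!localizerAlive) {
                // A dead localizer sends no heartbeats, so update is unreachable.
                throw new IllegalStateException("no localizer left to heartbeat");
            }
            String next = pending.poll();
            if (next != null) {
                localized++; // assume the download succeeds immediately
                return LocalizerAction.LIVE;
            }
            // Same decision as the snippet in the description: empty pending list => DIE.
            localizerAlive = false;
            return LocalizerAction.DIE;
        }

        boolean localizerAlive() { return localizerAlive; }

        int pendingCount() { return pending.size(); }

        ContainerState containerState() {
            return localized == requested
                ? ContainerState.LOCALIZED
                : ContainerState.LOCALIZING;
        }
    }

    public static void main(String[] args) {
        Runner runner = new Runner();
        runner.addResource("job.jar");        // first request arrives in time
        System.out.println(runner.update());  // localizer picks it up
        System.out.println(runner.update());  // pending list is empty, localizer is told to die
        runner.addResource("job.xml");        // late request, added after the DIE decision
        System.out.println("localizer alive: " + runner.localizerAlive());
        System.out.println("pending: " + runner.pendingCount());
        System.out.println("container state: " + runner.containerState());
    }
}
```

In this model the late request stays in the pending queue with no localizer left to pick it up, so the container never leaves LOCALIZING. The real code is multi-threaded; the sketch fixes one unlucky interleaving so the outcome is deterministic.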

        Attachments

        1. YARN-3464-branch-2.6.1.txt
          15 kB
          Vinod Kumar Vavilapalli
        2. YARN-3464.001.patch
          13 kB
          zhihai xu
        3. YARN-3464.000.patch
          11 kB
          zhihai xu

            People

            • Assignee:
              zhihai xu
              Reporter:
              zhihai xu
            • Votes:
              0
              Watchers:
              12
