Hadoop YARN / YARN-6846

Nodemanager can fail to fully delete application local directories when applications are killed

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.8.1
    • Fix Version/s: 2.9.0, 3.0.0-beta1, 2.8.2
    • Component/s: nodemanager
    • Labels: None

    Description

      When an application is killed, all of its running containers are killed, and the app waits for those containers to complete before cleaning up. As each container completes, its container directory is deleted via the DeletionService. After all containers have completed, the app completes and the app directory is deleted. If the app completes quickly enough, the deletions of the container and app directories can race against each other. If the container deletion executor deletes a file just before the application deletion executor tries to delete the same file, the application deletion executor can fail, leaving the remaining entries in the application directory behind.
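
To illustrate the failure mode described above, here is a minimal Java sketch of a recursive delete that treats "already gone" as success, so two deleters working on overlapping trees do not abort each other. It is an illustration only and is not taken from the attached patches; the class name `RaceTolerantDelete` and the method `deleteRecursively` are hypothetical, and the sketch assumes the only concurrent activity on the tree is other deletions.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

/** Hypothetical helper (not part of Hadoop): recursive delete that tolerates concurrent deleters. */
final class RaceTolerantDelete {

    private RaceTolerantDelete() {}

    /** Deletes root and everything under it; entries that vanish mid-walk are skipped. */
    static void deleteRecursively(Path root) throws IOException {
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                // deleteIfExists returns false instead of throwing if another
                // deleter removed the file first.
                Files.deleteIfExists(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc)
                    throws IOException {
                // The entry disappeared between being listed and being visited:
                // someone else deleted it, which is the outcome we wanted anyway.
                if (exc instanceof NoSuchFileException) {
                    return FileVisitResult.CONTINUE;
                }
                throw exc; // any other failure (e.g. permissions) is still an error
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null && !(exc instanceof NoSuchFileException)) {
                    throw exc;
                }
                // The directory itself may already have been removed by the other deleter.
                Files.deleteIfExists(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }
}
```

With a helper like this, calling `RaceTolerantDelete.deleteRecursively(appDir)` after (or while) a container directory under `appDir` is being removed would continue past the missing entries rather than failing and leaving the rest of the application directory in place.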

      Attachments

        1. YARN-6846.003.patch (8 kB, Jason Darrell Lowe)
        2. YARN-6846.002.patch (8 kB, Jason Darrell Lowe)
        3. YARN-6846.001.patch (8 kB, Jason Darrell Lowe)

          People

            Assignee: Jason Darrell Lowe (jlowe)
            Reporter: Jason Darrell Lowe (jlowe)
            Votes: 0
            Watchers: 13

            Dates

              Created:
              Updated:
              Resolved: