Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • 3.1.0
    • 3.1.0
    • yarn

    Description

      There are several paths that need to be improved with regard to the Docker container lifecycle when running Docker containers on YARN.

      1) Provide the ability to keep a container on the NodeManager for a set period of time for debugging purposes.
      2) Support sending signals to the process in the container to allow for triggering stack traces, heap dumps, etc.
      3) Support Docker's live restore, which means moving away from the use of docker wait. (YARN-5818)
      4) Improve the resiliency of liveness checks (kill -0) by adding retries.
      5) Improve the resiliency of container removal by adding retries.
      6) Only attempt to stop, kill, and remove containers if the current container state allows for it.
      7) Handle short-lived containers better when the container is stopped before the PID can be retrieved. (YARN-6305)
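
As an illustration of items 4 through 6 only, here is a minimal sketch of a retry wrapper and a state-gated action check. It is not taken from the attached patches: the names `ALLOWED_ACTIONS`, `is_action_allowed`, and `with_retries` are hypothetical, and the mapping from Docker container states to permitted actions is an assumption made for the example.

```python
import time
from typing import Callable

# Assumed mapping from Docker container state to the lifecycle actions
# worth attempting in that state (hypothetical, not from the patch).
ALLOWED_ACTIONS = {
    "created": {"remove"},
    "running": {"stop", "kill"},
    "paused": {"stop", "kill"},
    "restarting": {"stop", "kill"},
    "exited": {"remove"},
    "dead": {"remove"},
}


def is_action_allowed(state: str, action: str) -> bool:
    """Return True only if the action makes sense for the given state."""
    return action in ALLOWED_ACTIONS.get(state, set())


def with_retries(operation: Callable[[], bool],
                 attempts: int = 3,
                 delay_seconds: float = 0.0) -> bool:
    """Run operation until it returns True or the attempts are used up."""
    for attempt in range(1, attempts + 1):
        if operation():
            return True
        if attempt < attempts:
            # Wait before the next try; no sleep after the final failure.
            time.sleep(delay_seconds)
    return False
```

A caller would first look up the container's current state, skip the stop, kill, or remove call when `is_action_allowed` returns False, and otherwise wrap the call (or the kill -0 liveness probe) in `with_retries` so that a single transient failure is not treated as final.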

      Attachments

        1. YARN-5366.010.patch
          134 kB
          Shane Kumpf
        2. YARN-5366.009.patch
          130 kB
          Shane Kumpf
        3. YARN-5366.008.patch
          130 kB
          Shane Kumpf
        4. YARN-5366.007.patch
          130 kB
          Shane Kumpf
        5. YARN-5366.006.patch
          73 kB
          Shane Kumpf
        6. YARN-5366.005.patch
          71 kB
          Shane Kumpf
        7. YARN-5366.004.patch
          68 kB
          Shane Kumpf
        8. YARN-5366.003.patch
          67 kB
          Shane Kumpf
        9. YARN-5366.002.patch
          62 kB
          Shane Kumpf
        10. YARN-5366.001.patch
          61 kB
          Shane Kumpf

            People

              Assignee: Shane Kumpf (shanekumpf@gmail.com)
              Reporter: Shane Kumpf (shanekumpf@gmail.com)
              Votes: 0
              Watchers: 15