Hadoop YARN / YARN-8472 YARN Container Phase 2 / YARN-6495

check docker container's exit code when writing to cgroup task files


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: nodemanager

    Description

      If I run a simple command such as date in a Docker container, the application fails to complete successfully.

      For example:

      $ yarn  jar $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker -shell_env YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=hadoop-docker -shell_command "date" -jar $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar -num_containers 1 -timeout 3600000
      
      …
      17/04/12 00:16:40 INFO distributedshell.Client: Application did finished unsuccessfully. YarnState=FINISHED, DSFinalStatus=FAILED. Breaking monitoring loop
      17/04/12 00:16:40 ERROR distributedshell.Client: Application failed to complete successfully
      

      The error log looks like this:

      ...
      Failed to write pid to file /cgroup_parent/cpu/hadoop-yarn/container_xxxx/tasks - No such process
      ...
      

      When writing the pid to the cgroup tasks file, container-executor doesn’t check the Docker container’s status.
      If the container finishes very quickly, its process is already gone, so the pid can’t be written to the cgroup tasks file. That by itself is not a problem.
      So container-executor needs to check the Docker container’s exit code when writing the pid to the cgroup tasks file fails.
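
The proposed check can be sketched roughly as follows. This is only an illustration of the logic in Python, not the attached patch and not the actual container-executor code (which is written in C). The names attach_pid, default_writer, get_exit_code and the returned status strings are hypothetical; the sketch assumes the failed write surfaces as ESRCH ("No such process"), as in the log above.

```python
import errno
from typing import Callable, Optional


def default_writer(tasks_path: str, pid: int) -> None:
    # Writing a pid into a cgroup "tasks" file asks the kernel to move
    # that process into the cgroup.
    with open(tasks_path, "w") as f:
        f.write(str(pid))


def attach_pid(
    tasks_path: str,
    pid: int,
    get_exit_code: Callable[[], Optional[int]],
    writer: Callable[[str, int], None] = default_writer,
) -> str:
    """Try to attach pid to the cgroup; tolerate a container that already exited with 0.

    get_exit_code is a hypothetical callback returning the container's exit
    code, or None if it is unknown. It could, for instance, wrap
    `docker inspect --format '{{.State.ExitCode}}' <container>`.
    """
    try:
        writer(tasks_path, pid)
        return "attached"
    except OSError as e:
        if e.errno != errno.ESRCH:
            # Any other failure (permissions, missing cgroup, ...) is a real error.
            raise
        # The process no longer exists: decide based on how the container ended.
        exit_code = get_exit_code()
        if exit_code == 0:
            return "exited-cleanly"
        raise RuntimeError(
            f"pid {pid} is gone and container exit code is {exit_code}"
        ) from e
```

With this shape, a short-lived command such as date that exits with 0 before the pid is written would be treated as a success, while a container that died with a non-zero exit code would still be reported as a failure.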

      Attachments

        1. YARN-6495.003.patch
          6 kB
          Jaeboo Jeong
        2. YARN-6495.002.patch
          3 kB
          Jaeboo Jeong
        3. YARN-6495.001.patch
          2 kB
          Jaeboo Jeong


          People

            Assignee: Jim Brennan (jbrennan)
            Reporter: Jaeboo Jeong (Jaeboo)
            Votes: 0
            Watchers: 7
