Hadoop YARN / YARN-9667

Container-executor.c duplicates messages to stdout

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.0
    • Fix Version/s: 3.3.0, 3.2.2, 2.10.2
    • Component/s: nodemanager, yarn
    • Labels: None

    Description

      When a container is killed by its AM, we get an error message similar to this:

      2019-06-30 12:09:04,412 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: Shell execution returned exit code: 143. Privileged Execution Operation Stderr:
      
      Stdout: main : command provided 1
      main : run as user is systest
      main : requested yarn user is systest
      Getting exit code file...
      Creating script paths...
      Writing pid file...
      Writing to tmp file /yarn/nm/nmPrivate/application_1561921629886_0001/container_e84_1561921629886_0001_01_000019/container_e84_1561921629886_0001_01_000019.pid.tmp
      Writing to cgroup task files...
      Creating local dirs...
      Launching container...
      Getting exit code file...
      Creating script paths...
      

      In container-executor.c the fork point is right after the "Creating script paths..." message, yet the Stdout log clearly shows that message (and "Getting exit code file...") written twice. After consulting with pbacsko, it seems that a flush is missing in container-executor.c before the fork: the messages are still sitting in the stdio buffer when fork() is called, so the child inherits a copy and both processes eventually write it out.

      I suggest adding a flush there so that the output is not duplicated. It is misleading that the child process appears to write "Getting exit code file..." and "Creating script paths..." even though it never performs those steps.
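
      The effect of the missing flush can be reproduced outside of YARN. The following is a minimal sketch, not code from container-executor.c or from the attached patches: count_messages_after_fork is a hypothetical helper that writes one message to a buffered stream, forks, and counts how many copies reach the file. It assumes a POSIX system and a stream backed by a regular file, which stdio buffers fully by default.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo helper, not part of container-executor.c.
 * Writes one message to a buffered stream, forks, and returns how many
 * copies of the message end up in the file at `path` (-1 on error). */
int count_messages_after_fork(const char *path, int flush_before_fork) {
    static const char msg[] = "Creating script paths...\n";
    FILE *out = fopen(path, "w");
    if (out == NULL) {
        return -1;
    }
    fputs(msg, out);            /* regular file: stays in the stdio buffer */
    if (flush_before_fork) {
        fflush(out);            /* empty the buffer before it can be copied */
    }

    pid_t pid = fork();
    if (pid < 0) {
        fclose(out);
        return -1;
    }
    if (pid == 0) {
        fflush(out);            /* child writes out whatever it inherited */
        _exit(0);               /* skip atexit handlers and other streams */
    }
    waitpid(pid, NULL, 0);
    fclose(out);                /* parent writes out its own copy, if any */

    /* Count how many times the message was actually written. */
    FILE *in = fopen(path, "r");
    if (in == NULL) {
        return -1;
    }
    char line[128];
    int count = 0;
    while (fgets(line, sizeof line, in) != NULL) {
        if (strcmp(line, msg) == 0) {
            count++;
        }
    }
    fclose(in);
    return count;
}
```

      On a typical Linux system, calling the helper with flush_before_fork set to 0 should report two copies of the message, and calling it with 1 should report one. The same reasoning applies to stdout when the NodeManager captures it through a pipe, since stdout is only line buffered when attached to a terminal.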

      A more appealing solution could be to revisit the fprintf-fflush pairs in the code and replace each pair with a single call, so that the fflush cannot be forgotten accidentally. (A missing fflush can cause this problem at every place where the pair is used.)
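
      One way to fold each fprintf-fflush pair into a single call is a small variadic wrapper. This is only a sketch of the idea: the name print_and_flush is hypothetical and is not taken from the attached patches, which may solve the problem differently.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper (the name is an assumption, not from the patch).
 * Formats like fprintf and flushes the stream right away, so nothing is
 * left in the stdio buffer for a later fork() to duplicate.
 * Returns the number of characters written, or a negative value on error. */
int print_and_flush(FILE *stream, const char *fmt, ...) {
    va_list args;
    va_start(args, fmt);
    int written = vfprintf(stream, fmt, args);
    va_end(args);

    if (written < 0) {
        return written;         /* formatting or write error */
    }
    if (fflush(stream) != 0) {
        return -1;              /* flush failed */
    }
    return written;
}
```

      A call site such as fprintf(LOGFILE, "Creating script paths...\n"); fflush(LOGFILE); would then become a single print_and_flush(LOGFILE, "Creating script paths...\n"); call. The trade-off is one extra flush per message, which should be negligible for the handful of log lines written before a fork.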

      Note: this issue probably affects every call to fork(), not just the one from launch_container_as_user in main.c.

      Attachments

        1. YARN-9667-001.patch
          59 kB
          Peter Bacsko
        2. YARN-9667-branch-3.2.001.patch
          27 kB
          Eric Badger
        3. YARN-9667-branch-2.10.001.patch
          23 kB
          Eric Badger

            People

              Assignee: pbacsko Peter Bacsko
              Reporter: adam.antal Adam Antal