Hadoop YARN / YARN-9667

Container-executor.c duplicates messages to stdout

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.0
    • Fix Version/s: 3.3.0
    • Component/s: nodemanager, yarn
    • Labels: None

      Description

      When a container is killed by its AM, we get an error message similar to this:

      2019-06-30 12:09:04,412 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: Shell execution returned exit code: 143. Privileged Execution Operation Stderr:
      
      Stdout: main : command provided 1
      main : run as user is systest
      main : requested yarn user is systest
      Getting exit code file...
      Creating script paths...
      Writing pid file...
      Writing to tmp file /yarn/nm/nmPrivate/application_1561921629886_0001/container_e84_1561921629886_0001_01_000019/container_e84_1561921629886_0001_01_000019.pid.tmp
      Writing to cgroup task files...
      Creating local dirs...
      Launching container...
      Getting exit code file...
      Creating script paths...
      

      In container-executor.c, the fork point is right after the "Creating script paths..." part, yet the Stdout log clearly shows that message written twice. After consulting with Peter Bacsko, it seems that a flush is missing in container-executor.c before the fork: the unflushed stdio buffer is copied into the child process, and both processes eventually write it out, which causes the duplication.

      I suggest adding a flush there so that the output is not duplicated. It is misleading that the child process appears to write "Getting exit code file..." and "Creating script paths..." even though it is clearly not performing those steps.
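The mechanism can be reproduced outside YARN. The following is a minimal sketch, not code from container-executor.c: the function name count_copies_after_fork is hypothetical, and a temporary file stands in for the redirected stdout on the assumption that the stream is fully buffered in both cases. It writes one message, forks, and counts how many copies of the message reach the file, with and without a flush before the fork.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo (not from container-executor.c): write one message to a
 * fully buffered stream, fork, and count how often the message ends up in
 * the backing file. Returns -1 on error.
 */
static int count_copies_after_fork(int flush_before_fork) {
    const char *msg = "Creating script paths...\n";
    char line[128];
    int copies = 0;

    /* Assumption: a temp file stands in for stdout redirected to a pipe. */
    FILE *out = tmpfile();
    if (out == NULL) {
        return -1;
    }
    setvbuf(out, NULL, _IOFBF, BUFSIZ);

    fprintf(out, "%s", msg);      /* stays in the stdio buffer for now */
    if (flush_before_fork) {
        fflush(out);              /* the suggested fix: empty the buffer first */
    }

    pid_t pid = fork();
    if (pid < 0) {
        fclose(out);
        return -1;
    }
    if (pid == 0) {
        /* Child: it inherited a copy of the buffer, so its first flush
         * also writes whatever the parent had left pending. */
        fflush(out);
        _exit(0);
    }

    waitpid(pid, NULL, 0);
    fflush(out);                  /* parent writes its own copy, if any */

    /* Read the file back and count the lines that match the message. */
    rewind(out);
    while (fgets(line, sizeof line, out) != NULL) {
        if (strcmp(line, msg) == 0) {
            copies++;
        }
    }
    fclose(out);
    return copies;
}
```

Under these assumptions, count_copies_after_fork(0) finds the message twice and count_copies_after_fork(1) finds it once, which matches the duplicated lines in the Stdout log above.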

      A more appealing solution could be to revisit the fprintf-fflush pairs in the code and replace each pair with a single call, so that an fflush call cannot be forgotten accidentally. (A missing flush can cause problems in every place where the pattern is used.)
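One way to picture the single-call idea is a small variadic wrapper. This is only a sketch under assumptions: the name fprintf_flush is hypothetical, and nothing here is taken from the attached patch, which may solve the problem differently.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a message and flush the stream in one call,
 * so no caller can forget the flush. Returns the number of characters
 * written, or a negative value if formatting or flushing fails.
 */
static int fprintf_flush(FILE *stream, const char *fmt, ...) {
    va_list args;
    int written;

    va_start(args, fmt);
    written = vfprintf(stream, fmt, args);
    va_end(args);

    if (written < 0) {
        return written;           /* formatting or write error */
    }
    if (fflush(stream) != 0) {
        return -1;                /* flush error */
    }
    return written;
}
```

A call such as fprintf_flush(stdout, "main : run as user is %s\n", user) would then replace an fprintf call followed by a separate fflush. The trade-off is one flush per message, which costs extra write calls but guarantees that nothing is pending in the buffer when fork() is reached.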

      Note: this issue probably affects every call to fork(), not just the one from launch_container_as_user in main.c.

        Attachments

        1. YARN-9667-001.patch
          59 kB
          Peter Bacsko

              People

              • Assignee: Peter Bacsko (pbacsko)
              • Reporter: Adam Antal (adam.antal)
              • Votes: 1
              • Watchers: 7
