
NPE in LinuxContainerExecutor due to null PrivilegedOperationException exit code

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.8.1
    • Fix Version/s: 2.9.0, 3.0.0-beta1, 2.8.2
    • Component/s: nodemanager
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      The LinuxContainerExecutor contains a number of code snippets like this:

          } catch (PrivilegedOperationException e) {
            int exitCode = e.getExitCode();
      

      PrivilegedOperationException#getExitCode returns an Integer, which can be null if the operation was interrupted. Assigning it to an int on that last line makes the JVM auto-unbox the value, which throws a NullPointerException when there is no exit code.
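
      The failure mode can be reproduced in isolation. The following is a minimal sketch, not code from Hadoop: OperationFailedException and handle are hypothetical stand-ins for PrivilegedOperationException and the catch blocks in LinuxContainerExecutor, and the inner catch of NullPointerException exists only to make the result visible.

```java
public class UnboxingNpeDemo {
  /** Hypothetical stand-in for an exception whose exit code may be absent. */
  static class OperationFailedException extends Exception {
    // Boxed type: null models "the operation was interrupted, no exit code".
    private final Integer exitCode;

    OperationFailedException(Integer exitCode) {
      super("operation failed");
      this.exitCode = exitCode;
    }

    Integer getExitCode() {
      return exitCode;
    }
  }

  /** Mirrors the pattern from the description: unbox inside the catch block. */
  static String handle(Integer exitCodeToThrow) {
    try {
      throw new OperationFailedException(exitCodeToThrow);
    } catch (OperationFailedException e) {
      try {
        // Auto-unboxing calls Integer.intValue(), which fails on null.
        int exitCode = e.getExitCode();
        return "exit code " + exitCode;
      } catch (NullPointerException npe) {
        // Caught here only for demonstration purposes.
        return "NPE while unboxing";
      }
    }
  }

  public static void main(String[] args) {
    System.out.println(handle(1));
    System.out.println(handle(null));
  }
}
```

      The compiler accepts the assignment without complaint because Integer-to-int conversion is implicit, which is why this pattern is easy to miss in review.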

        Activity

        jlowe Jason Lowe added a comment -

        Sample stack trace from a 2.8-based release:

        2017-07-10 20:39:12,810 [LocalizerRunner for container_e03_1496686551678_8189060_01_005998] WARN privileged.PrivilegedOperationExecutor: IOException executing command: 
        java.io.InterruptedIOException: java.lang.InterruptedException
                at org.apache.hadoop.util.Shell.runCommand(Shell.java:1007)
                at org.apache.hadoop.util.Shell.run(Shell.java:898)
                at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1213)
                at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:151)
                at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.startLocalizer(LinuxContainerExecutor.java:263)
                at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1155)
        Caused by: java.lang.InterruptedException
                at java.lang.Object.wait(Native Method)
                at java.lang.Object.wait(Object.java:502)
                at java.lang.UNIXProcess.waitFor(UNIXProcess.java:395)
                at org.apache.hadoop.util.Shell.runCommand(Shell.java:997)
                ... 5 more
        2017-07-10 20:39:12,811 [LocalizerRunner for container_e03_1496686551678_8189060_01_005998] INFO localizer.ResourceLocalizationService: Localizer failed
        java.lang.NullPointerException
                at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.startLocalizer(LinuxContainerExecutor.java:267)
                at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1155)
        
        jlowe Jason Lowe added a comment -

        Attaching a patch that removes the Integer support for an exit code and just always returns an int. All the code calling getExitCode for PrivilegedOperationException and ContainerExecutionException doesn't expect a null exit code, and ContainerExecutionException already has the precedent of using -1 to indicate a lack of an exit code.
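
        As a rough illustration of the approach described in this comment (not the attached patch itself), an exception can store a primitive int and fall back to -1 when no exit code is available. The class name, the NO_EXIT_CODE constant, and the exitCodeOf helper below are assumptions made for this sketch.

```java
public class ExitCodeSentinelDemo {
  /** Hypothetical exception that always carries a primitive exit code. */
  static class OperationFailedException extends Exception {
    // Assumed sentinel name; -1 marks "no exit code was produced".
    static final int NO_EXIT_CODE = -1;

    private final int exitCode;

    /** Used when the operation never produced an exit code. */
    OperationFailedException(String message) {
      this(message, NO_EXIT_CODE);
    }

    OperationFailedException(String message, int exitCode) {
      super(message);
      this.exitCode = exitCode;
    }

    /** Returns a primitive int, so callers can never observe null. */
    int getExitCode() {
      return exitCode;
    }
  }

  /** Caller-side pattern: a plain int assignment with no unboxing involved. */
  static int exitCodeOf(OperationFailedException e) {
    int exitCode = e.getExitCode();
    return exitCode;
  }

  public static void main(String[] args) {
    System.out.println(exitCodeOf(new OperationFailedException("interrupted")));
    System.out.println(exitCodeOf(new OperationFailedException("failed", 7)));
  }
}
```

        With a primitive return type, existing callers that write int exitCode = e.getExitCode() keep compiling unchanged and can no longer fail on a missing value; they only need to treat -1 as "unknown" where that distinction matters.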

        hadoopqa Hadoop QA added a comment -
        -1 overall

        Vote  Subsystem   Runtime  Comment
         0    reexec      0m 16s   Docker mode activated.
              Prechecks
        +1    @author     0m 0s    The patch does not contain any @author tags.
        +1    test4tests  0m 0s    The patch appears to include 1 new or modified test files.
              trunk Compile Tests
        +1    mvninstall  14m 25s  trunk passed
        +1    compile     0m 29s   trunk passed
        +1    checkstyle  0m 18s   trunk passed
        +1    mvnsite     0m 28s   trunk passed
        -1    findbugs    0m 43s   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager in trunk has 5 extant Findbugs warnings.
        +1    javadoc     0m 18s   trunk passed
              Patch Compile Tests
        +1    mvninstall  0m 25s   the patch passed
        +1    compile     0m 26s   the patch passed
        +1    javac       0m 26s   the patch passed
        +1    checkstyle  0m 16s   the patch passed
        +1    mvnsite     0m 25s   the patch passed
        +1    whitespace  0m 0s    The patch has no whitespace issues.
        +1    findbugs    0m 50s   the patch passed
        +1    javadoc     0m 18s   the patch passed
              Other Tests
        +1    unit        12m 56s  hadoop-yarn-server-nodemanager in the patch passed.
        +1    asflicense  0m 17s   The patch does not generate ASF License warnings.
                          34m 4s



        Subsystem       Report/Notes
        Docker Image    yetus/hadoop:14b5c93
        JIRA Issue      YARN-6805
        JIRA Patch URL  https://issues.apache.org/jira/secure/attachment/12876942/YARN-6805.001.patch
        Optional Tests  asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname           Linux 6078cf5637c8 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
        Build tool      maven
        Personality     /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision    trunk / b628d0d
        Default Java    1.8.0_131
        findbugs        v3.1.0-RC1
        findbugs        https://builds.apache.org/job/PreCommit-YARN-Build/16396/artifact/patchprocess/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-warnings.html
        Test Results    https://builds.apache.org/job/PreCommit-YARN-Build/16396/testReport/
        modules         C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
        Console output  https://builds.apache.org/job/PreCommit-YARN-Build/16396/console
        Powered by      Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        vvasudev Varun Vasudev added a comment -

        Doesn't look like the findbugs warnings are related to the patch. +1 from me.

        shanekumpf@gmail.com Shane Kumpf added a comment -

        Thanks for addressing this, Jason Lowe.

        LGTM less one very minor nit: Missing a space in the not null check below.

        +        String output = e.getOutput();
        +        if (output!= null && !e.getOutput().isEmpty()) {
        +          builder.append("Shell output: " + output + "\n");
                 }
        
        jlowe Jason Lowe added a comment -

        Thanks for the reviews! I'll fix the whitespace nit on the commit.

        jlowe Jason Lowe added a comment -

        I committed this to trunk, branch-2, branch-2.8, and branch-2.8.2.

        hudson Hudson added a comment -

        SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12003 (See https://builds.apache.org/job/Hadoop-trunk-Commit/12003/)
        YARN-6805. NPE in LinuxContainerExecutor due to null (jlowe: rev f76f5c0919cdb0b032edb309d137093952e77268)

        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/privileged/PrivilegedOperationException.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/runtime/ContainerExecutionException.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutorWithMocks.java
          Revert "YARN-6805. NPE in LinuxContainerExecutor due to null (jlowe: rev 0ffca5d347df0acb1979dff7a07ae88ea834adc7)
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/runtime/ContainerExecutionException.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutorWithMocks.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/privileged/PrivilegedOperationException.java
          YARN-6805. NPE in LinuxContainerExecutor due to null (jlowe: rev ebc048cc055d0f7d1b85bc0b6f56cd15673e837d)
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/privileged/PrivilegedOperationException.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutorWithMocks.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/runtime/ContainerExecutionException.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java

          People

          • Assignee:
            jlowe Jason Lowe
            Reporter:
            jlowe Jason Lowe