  1. Hadoop Common
  2. HADOOP-13770

Shell.checkIsBashSupported swallowed an interrupted exception

    Details

    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      Shell.checkIsBashSupported() creates a bash shell command to verify if the system supports bash. However, its error message is misleading, and the logic should be updated.

      If the shell command throws an IOException, it does not necessarily mean that bash failed to run. If the shell command process is interrupted, the internal logic throws an InterruptedIOException, which is a subclass of IOException, so the catch block below also reports an interrupt as "Bash is not supported by the OS".

      Shell.checkIsBashSupported
          ShellCommandExecutor shexec;
          boolean supported = true;
          try {
            String[] args = {"bash", "-c", "echo 1000"};
            shexec = new ShellCommandExecutor(args);
            shexec.execute();
          } catch (IOException ioe) {
            LOG.warn("Bash is not supported by the OS", ioe);
            supported = false;
          }
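One possible way to separate the two cases is to catch InterruptedIOException before the general IOException and rethrow it. The following is only a minimal sketch, not the committed patch: the Command interface and the checkSupported method are hypothetical stand-ins for ShellCommandExecutor.execute() and Shell.checkIsBashSupported(), so that the example runs without Hadoop on the classpath.

```java
import java.io.IOException;
import java.io.InterruptedIOException;

final class BashCheckSketch {
  /** Hypothetical stand-in for ShellCommandExecutor.execute(). */
  interface Command {
    void execute() throws IOException;
  }

  /**
   * Returns true if the command ran and false if it failed with an
   * ordinary IOException. An interrupt is rethrown instead of being
   * reported as "bash is not supported".
   */
  static boolean checkSupported(Command cmd) throws InterruptedIOException {
    try {
      cmd.execute();
      return true;
    } catch (InterruptedIOException iioe) {
      // The check was interrupted; this says nothing about bash itself.
      throw iioe;
    } catch (IOException ioe) {
      // A genuine failure to run the command.
      return false;
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println(checkSupported(() -> { }));
    System.out.println(checkSupported(() -> {
      throw new IOException("cannot run bash");
    }));
    try {
      checkSupported(() -> {
        throw new InterruptedIOException("interrupted");
      });
    } catch (InterruptedIOException e) {
      System.out.println("interrupt propagated");
    }
  }
}
```

The more specific catch clause has to come first; Java rejects a catch of InterruptedIOException placed after a catch of IOException as unreachable.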
      

      An example appeared in a recent Jenkins job:
      https://builds.apache.org/job/PreCommit-HADOOP-Build/8257/testReport/org.apache.hadoop.ipc/TestRPCWaitForProxy/testInterruptedWaitForProxy/

      The test logic in TestRPCWaitForProxy.testInterruptedWaitForProxy starts a thread, waits for 1 second, and then interrupts the thread, expecting it to terminate. However, Shell.checkIsBashSupported swallowed the interrupt, so the test failed.

      2015-12-16 21:31:53,797 WARN  util.Shell (Shell.java:checkIsBashSupported(718)) - Bash is not supported by the OS
      java.io.InterruptedIOException: java.lang.InterruptedException
      	at org.apache.hadoop.util.Shell.runCommand(Shell.java:930)
      	at org.apache.hadoop.util.Shell.run(Shell.java:838)
      	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117)
      	at org.apache.hadoop.util.Shell.checkIsBashSupported(Shell.java:716)
      	at org.apache.hadoop.util.Shell.<clinit>(Shell.java:705)
      	at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
      	at org.apache.hadoop.security.SecurityUtil.getAuthenticationMethod(SecurityUtil.java:639)
      	at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:273)
      	at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:261)
      	at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:803)
      	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:773)
      	at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:646)
      	at org.apache.hadoop.ipc.RPC.waitForProtocolProxy(RPC.java:397)
      	at org.apache.hadoop.ipc.RPC.waitForProtocolProxy(RPC.java:350)
      	at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:330)
      	at org.apache.hadoop.ipc.TestRPCWaitForProxy$RpcThread.run(TestRPCWaitForProxy.java:115)
      Caused by: java.lang.InterruptedException
      	at java.lang.Object.wait(Native Method)
      	at java.lang.Object.wait(Object.java:503)
      	at java.lang.UNIXProcess.waitFor(UNIXProcess.java:264)
      	at org.apache.hadoop.util.Shell.runCommand(Shell.java:920)
      	... 15 more
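The effect of a swallowed interrupt can be reproduced without Hadoop. The sketch below is an illustration under assumptions: Thread.sleep stands in for the Process.waitFor call seen in the trace, and the class and method names are hypothetical. It compares a helper that ignores the interrupt with one that restores the thread's interrupt status.

```java
final class InterruptSwallowSketch {
  /** Swallows the interrupt: the caller can no longer observe it. */
  static void swallowing() {
    try {
      Thread.sleep(10_000);
    } catch (InterruptedException e) {
      // Ignored, similar in effect to the catch (IOException) branch.
    }
  }

  /** Restores the interrupt status so the caller can still react to it. */
  static void restoring() {
    try {
      Thread.sleep(10_000);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  /**
   * Runs the helper on a worker thread, interrupts the worker, and returns
   * whether the worker still sees the interrupt after the helper returns.
   */
  static boolean interruptVisibleAfter(Runnable helper)
      throws InterruptedException {
    boolean[] seen = new boolean[1];
    Thread worker = new Thread(() -> {
      helper.run();
      seen[0] = Thread.currentThread().isInterrupted();
    });
    worker.start();
    worker.interrupt();
    worker.join();
    return seen[0];
  }

  public static void main(String[] args) throws InterruptedException {
    // prints false: the interrupt was lost
    System.out.println(interruptVisibleAfter(InterruptSwallowSketch::swallowing));
    // prints true: the interrupt is still visible
    System.out.println(interruptVisibleAfter(InterruptSwallowSketch::restoring));
  }
}
```

Code that runs after the swallowing helper has no way to tell that an interrupt was requested, which is the situation the test thread ends up in.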
      

      The original design is undesirable because it swallows a potential interrupt, causing TestRPCWaitForProxy.testInterruptedWaitForProxy to fail. Unfortunately, the method is invoked from the static initializer of Shell (Shell.<clinit> in the stack trace above) to set a static member variable, and Java does not allow a static initializer to throw a checked exception. We should remove the static member variable so that the method can throw the interrupt exception. The node manager should call the static method instead of using the static member variable.

      This fix has an associated benefit: tests could run faster, because they would no longer need to spawn a bash process whenever they use a Shell static method or variable (which happens quite often when checking which operating system Hadoop is running on).
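The proposed change can be sketched as follows. This is an assumption-laden illustration rather than the actual patch: LazyShellSketch, WINDOWS, and probeRuns are hypothetical names, and the probe is simulated with a counter instead of spawning bash. It shows that once the check is an on-demand method rather than a static field, reading other static state no longer runs the probe, and the method is free to declare a checked exception.

```java
import java.io.InterruptedIOException;

final class LazyShellSketch {
  /** Cheap static state, analogous to an OS check; needs no process. */
  static final boolean WINDOWS =
      System.getProperty("os.name", "").startsWith("Windows");

  /** Number of times the simulated bash probe has run. */
  static int probeRuns = 0;

  // With a static field initialized by the probe, the probe would run
  // during class initialization, where a checked exception cannot
  // propagate to the caller.

  /** On-demand check that may report an interrupt to its caller. */
  static boolean checkIsBashSupported() throws InterruptedIOException {
    probeRuns++; // stand-in for spawning "bash -c ..."
    if (Thread.interrupted()) {
      throw new InterruptedIOException("interrupted while probing for bash");
    }
    return true; // the simulated probe always succeeds
  }

  public static void main(String[] args) throws Exception {
    boolean windows = WINDOWS;       // initializes the class, no probe
    System.out.println(probeRuns);   // prints 0
    checkIsBashSupported();          // probe runs only when requested
    System.out.println(probeRuns);   // prints 1
  }
}
```

Callers that need the result, such as the node manager mentioned above, would call the method and handle the exception themselves.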

        Attachments

        1. YARN-4467.001.patch
          2 kB
          Wei-Chiu Chuang
        2. HADOOP-12652.001.patch
          0.8 kB
          Wei-Chiu Chuang


              People

              • Assignee:
                jojochuang Wei-Chiu Chuang
                Reporter:
                jojochuang Wei-Chiu Chuang