Hadoop Common / HADOOP-9818

Remove usage of "bash -c" from oah.fs.DF

    Details

    • Type: Improvement
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0, 2.1.0-beta
    • Fix Version/s: None
    • Component/s: None
    • Labels:
    • Target Version/s:

      Description

      DF uses "bash -c" to shell out to the unix df command. This is potentially unsafe; let's think about removing it.
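The risk can be sketched by comparing the two invocation styles. This is a hypothetical illustration, not DF's actual code; `echo` stands in for `df` so the demo is portable, and the method names are invented:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class ExecDemo {
    // Unsafe pattern: interpolating a path into a shell command line.
    // A path such as "/tmp/x; rm -rf ~" would be parsed by the shell.
    static String[] unsafeCommand(String path) {
        return new String[] {"bash", "-c", "df -k " + path};
    }

    // Safer pattern: pass the path as a discrete argv element so the
    // kernel receives it verbatim and no shell parsing ever happens.
    static String[] safeCommand(String path) {
        return new String[] {"df", "-k", path};
    }

    public static void main(String[] args) throws Exception {
        // Demonstrate with echo: shell metacharacters in the argument
        // are NOT interpreted when no shell is involved.
        Process p = new ProcessBuilder("echo", "a;b $HOME").start();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            System.out.println(r.readLine());  // literal, unexpanded
        }
        p.waitFor();
    }
}
```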

      1. HADOOP-9818.patch
        0.7 kB
        Kousuke Saruta

          Activity

          Kousuke Saruta added a comment -

          I've modified the code not to use "bash -c".

          Kousuke Saruta added a comment -

          I've removed "2>/dev/null" from the patch.
          Is that acceptable?

          Andrew Wang added a comment -

          Hi Kousuke, thanks for the patch!

          I've made you a contributor, so now you should be able to assign Common JIRAs to yourself and mark them as Patch Available via the "Submit Patch" button.

          Part of the trickiness of removing "2>&1" is error handling. I haven't looked into DF at all, but ProcessBuilder separates stdout and stderr, so stderr messages no longer show up. So I added another param to Shell in HADOOP-9652 (hopefully committed soon) that merges them back together.

          It'd be great if you could review the DF error handling behavior, especially adding tests for any gaps you find.

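The stream-separation behavior described above can be sketched with plain ProcessBuilder; `redirectErrorStream(true)` is the standard-library way to merge stderr back into stdout (this is only an illustration, not the actual Shell parameter from HADOOP-9652):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class MergedStreams {
    // Run a command and return its combined stdout+stderr output.
    // ProcessBuilder keeps the two streams separate by default, so
    // error messages are silently lost unless they are merged back
    // (as here) or read from getErrorStream() on a separate thread.
    static String runMerged(String... cmd) throws Exception {
        ProcessBuilder pb = new ProcessBuilder(cmd);
        pb.redirectErrorStream(true);  // fold stderr into stdout
        Process p = pb.start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        p.waitFor();
        return out.toString();
    }
}
```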
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12595486/HADOOP-9818.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/2907//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/2907//console

          This message is automatically generated.

          Kousuke Saruta added a comment -

          Andrew, Thanks!
          I'll try it.

          Colin Patrick McCabe added a comment -

          The "bash -c" and attendant "2>/dev/null" were added by HADOOP-2344. I think the rationale was that a lot of stderr output could potentially slow down execution (see https://issues.apache.org/jira/browse/HADOOP-1553?focusedCommentId=12514794&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12514794).

          I think what we should do in this case is add an option to ShellCommandExecutor that redirects stderr to an instance of com.google.common.io.input.NullInputStream (or similar).

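A hypothetical sketch of that suggestion using only the standard library (not ShellCommandExecutor's real API). Note that `ProcessBuilder.Redirect.DISCARD` requires Java 9+; on older JVMs one would redirect to a `File` pointing at /dev/null, or drain the stream on a separate thread:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class DiscardStderr {
    // Instead of relying on a shell-level "2>/dev/null", have the
    // executor drop stderr itself. Redirect.DISCARD sends the stream
    // to the OS null device, so a chatty stderr can never fill the
    // pipe buffer and stall the child process.
    static String runQuiet(String... cmd) throws Exception {
        ProcessBuilder pb = new ProcessBuilder(cmd);
        pb.redirectError(ProcessBuilder.Redirect.DISCARD);
        Process p = pb.start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        p.waitFor();
        return out.toString();
    }
}
```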
          Kousuke Saruta added a comment -

          I'm writing tests for DF and I've noticed a couple of things that may be defects.

          1. DF#getFilesystem() behaves differently depending on whether it is called before or after DF#getMount().
          I think that is because getMount() calls DF#run() and DF#parseOutput() but getFilesystem() doesn't.

          2. DF#getMount() calls DF#run(), so the df command is executed every time we call DF#getMount().
          I think that's inefficient. Why does only getMount() call run() internally?
          I think run() and parseOutput() should be called in the constructor of DF so that the df command is executed when DF is instantiated.

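Point 2 could also be addressed with a run-once-and-cache pattern rather than running df in the constructor. The sketch below is purely illustrative; the class, fields, and refresh interval are invented and do not reflect DF's actual implementation:

```java
public class CachedDf {
    // Hypothetical caching wrapper: run df once, remember the parsed
    // result, and refresh only after a configurable interval instead
    // of shelling out on every getMount() call.
    private final long refreshIntervalMs;
    private long lastRunMs = 0;
    private String cachedMount;

    // Visible for testing: counts how often df was actually "run".
    int runs = 0;

    CachedDf(long refreshIntervalMs) {
        this.refreshIntervalMs = refreshIntervalMs;
    }

    synchronized String getMount() {
        long now = System.currentTimeMillis();
        if (cachedMount == null || now - lastRunMs > refreshIntervalMs) {
            cachedMount = runDfAndParse();
            lastRunMs = now;
        }
        return cachedMount;
    }

    // Stand-in for DF#run() + DF#parseOutput(); here it just returns
    // a fixed mount point and counts invocations.
    private String runDfAndParse() {
        runs++;
        return "/";
    }
}
```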

            People

            • Assignee:
              Kousuke Saruta
              Reporter:
              Andrew Wang
            • Votes:
              0
              Watchers:
              4
