Hadoop Common
  1. Hadoop Common
  2. HADOOP-3281

bin/hadoop script should check class name before running java

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: scripts
    • Labels:
      None

      Description

      When the first parameter ($1) cannot be matched with one of existing hadoop commnads, the parameter will be considered as a class name and the script will pass it to java. For examples,

      bash-3.2$ ./bin/hadoop -version
      java version "1.5.0_14"
      Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_14-b03)
      Java HotSpot(TM) Client VM (build 1.5.0_14-b03, mixed mode)
      bash-3.2$ ./bin/hadoop -help   
      Usage: java [-options] class [args...]
                 (to execute a class)
         or  java [-options] -jar jarfile [args...]
                 (to execute a jar file)
      ...
      

      The behavior above is confusing. We should check whether the parameter is a valid class name before passing it to java.

      1. 3281_v01.patch
        6 kB
        Edward J. Yoon
      2. 3281.patch
        0.4 kB
        Edward J. Yoon

        Issue Links

          Activity

          Tsz Wo Nicholas Sze created issue -
          Edward J. Yoon made changes -
          Field Original Value New Value
          Assignee Edward J. Yoon [ udanax ]
          Hide
          Edward J. Yoon added a comment -

          I added a "Unknown Command" message instead of direct java execution as describe below.

          $ bin/hadoop -version
          Unknown Command : Type 'hadoop' without any parameters to see usage.
          
          $ bin/hadoop asdf
          Unknown Command : Type 'hadoop' without any parameters to see usage.
          
          
          Show
          Edward J. Yoon added a comment - I added a "Unknown Command" message instead of direct java execution as describe below. $ bin/hadoop -version Unknown Command : Type 'hadoop' without any parameters to see usage. $ bin/hadoop asdf Unknown Command : Type 'hadoop' without any parameters to see usage.
          Edward J. Yoon made changes -
          Attachment 3281.patch [ 12381244 ]
          Hide
          Edward J. Yoon added a comment -

          Submitting to TRUNK.

          Show
          Edward J. Yoon added a comment - Submitting to TRUNK.
          Edward J. Yoon made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Fix Version/s 0.18.0 [ 12312972 ]
          Hide
          Doug Cutting added a comment -

          > We should check whether the parameter is a valid class name before passing it to java.

          I agree, but I don't see a way to do that in the bin/hadoop script.

          In HADOOP-435 we discussed moving all the 'if [$COMMAND = ...]' logic into a Java class, so that bin/hadoop might always invoke something like org.apache.hadoop.util.ToolRunner, which contains the 'if ((command.equals(...)))' logic. This is another good reason to do that, since, once you're in Java, it's easy to detect if a command names a class or not, and generate a nice error message if it does not.

          So I think the best fix for this is to (at long last) move the command dispatch from bin/hadoop into a Java class.

          Show
          Doug Cutting added a comment - > We should check whether the parameter is a valid class name before passing it to java. I agree, but I don't see a way to do that in the bin/hadoop script. In HADOOP-435 we discussed moving all the 'if [$COMMAND = ...] ' logic into a Java class, so that bin/hadoop might always invoke something like org.apache.hadoop.util.ToolRunner, which contains the 'if ((command.equals(...)))' logic. This is another good reason to do that, since, once you're in Java, it's easy to detect if a command names a class or not, and generate a nice error message if it does not. So I think the best fix for this is to (at long last) move the command dispatch from bin/hadoop into a Java class.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          Class name is a QualifiedIdentifier defined in
          http://java.sun.com/docs/books/jls/third_edition/html/syntax.html
          (see also http://java.sun.com/docs/books/jls/third_edition/html/lexical.html#3.8)
          Could we use a regular expression to check it? For simplicity, we could ignore special cases like java keywords.

          > So I think the best fix for this is to (at long last) move the command dispatch from bin/hadoop into a Java class.

          I agree. This is easier.

          Show
          Tsz Wo Nicholas Sze added a comment - Class name is a QualifiedIdentifier defined in http://java.sun.com/docs/books/jls/third_edition/html/syntax.html (see also http://java.sun.com/docs/books/jls/third_edition/html/lexical.html#3.8 ) Could we use a regular expression to check it? For simplicity, we could ignore special cases like java keywords. > So I think the best fix for this is to (at long last) move the command dispatch from bin/hadoop into a Java class. I agree. This is easier.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          Edward, valid class names are disallowed in your patch. Then, something like "./bin/hadoop org.apache.hadoop.dfs.NNBench" won't work anymore. We have to check for class names.

          Show
          Tsz Wo Nicholas Sze added a comment - Edward, valid class names are disallowed in your patch. Then, something like "./bin/hadoop org.apache.hadoop.dfs.NNBench" won't work anymore. We have to check for class names.
          Hide
          Edward J. Yoon added a comment -

          >> So I think the best fix for this is to (at long last) move the command dispatch from bin/hadoop into a Java class.
          >> Then, something like "./bin/hadoop org.apache.hadoop.dfs.NNBench" won't work anymore. We have to check for class names.

          Ooops. Sorry, i was overlooked a detail.

          Show
          Edward J. Yoon added a comment - >> So I think the best fix for this is to (at long last) move the command dispatch from bin/hadoop into a Java class. >> Then, something like "./bin/hadoop org.apache.hadoop.dfs.NNBench" won't work anymore. We have to check for class names. Ooops. Sorry, i was overlooked a detail.
          Edward J. Yoon made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12381244/3281.patch
          against trunk revision 645773.

          @author +1. The patch does not contain any @author tags.

          tests included -1. The patch doesn't appear to include any new or modified tests.
          Please justify why no tests are needed for this patch.

          javadoc +1. The javadoc tool did not generate any warning messages.

          javac +1. The applied patch does not generate any new javac compiler warnings.

          release audit +1. The applied patch does not generate any new release audit warnings.

          findbugs +1. The patch does not introduce any new Findbugs warnings.

          core tests +1. The patch passed core unit tests.

          contrib tests +1. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2359/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2359/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2359/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2359/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12381244/3281.patch against trunk revision 645773. @author +1. The patch does not contain any @author tags. tests included -1. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new javac compiler warnings. release audit +1. The applied patch does not generate any new release audit warnings. findbugs +1. The patch does not introduce any new Findbugs warnings. core tests +1. The patch passed core unit tests. contrib tests +1. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2359/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2359/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2359/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2359/console This message is automatically generated.
          Hide
          Edward J. Yoon added a comment -
          [root@udanax hadoop]# bin/hadoop
          Usage: hadoop [--config confdir] COMMAND
          where COMMAND is one of:
            namenode -format     format the DFS filesystem
          ...
          
          [root@udanax hadoop]# bin/hadoop org.apache.hadoop.Tester arg1 arg2
          org.apache.hadoop.Tester: command/class not found
          [root@udanax hadoop]# bin/hadoop tester
          tester: command/class not found
          [root@udanax hadoop]# bin/hadoop org.apache.hadoop.dfs.NNBenchs
          org.apache.hadoop.dfs.NNBenchs: command/class not found
          [root@udanax hadoop]# bin/hadoop org.apache.hadoop.dfs.NNBench
          NameNode Benchmark 0.4
          08/05/02 11:05:11 INFO dfs.NNBench: Test Inputs: 
          08/05/02 11:05:11 INFO dfs.NNBench:            Test Operation: none
          08/05/02 11:05:11 INFO dfs.NNBench:                Start time: 2008-05-02 11:07:11,219
          ...
          
          [root@udanax hadoop]# bin/hadoop dfs -ls /
          Found 6 items
          1    212752      2008-05-02 10:52  -rw-r--r--  root  supergroup  /CHANGES.txt
          ...
          
          Show
          Edward J. Yoon added a comment - [root@udanax hadoop]# bin/hadoop Usage: hadoop [--config confdir] COMMAND where COMMAND is one of: namenode -format format the DFS filesystem ... [root@udanax hadoop]# bin/hadoop org.apache.hadoop.Tester arg1 arg2 org.apache.hadoop.Tester: command/class not found [root@udanax hadoop]# bin/hadoop tester tester: command/class not found [root@udanax hadoop]# bin/hadoop org.apache.hadoop.dfs.NNBenchs org.apache.hadoop.dfs.NNBenchs: command/class not found [root@udanax hadoop]# bin/hadoop org.apache.hadoop.dfs.NNBench NameNode Benchmark 0.4 08/05/02 11:05:11 INFO dfs.NNBench: Test Inputs: 08/05/02 11:05:11 INFO dfs.NNBench: Test Operation: none 08/05/02 11:05:11 INFO dfs.NNBench: Start time: 2008-05-02 11:07:11,219 ... [root@udanax hadoop]# bin/hadoop dfs -ls / Found 6 items 1 212752 2008-05-02 10:52 -rw-r--r-- root supergroup /CHANGES.txt ...
          Edward J. Yoon made changes -
          Attachment 3281_v01.patch [ 12381295 ]
          Hide
          Edward J. Yoon added a comment -

          Hmm, Should vmArgs moved to java code, too?

          Show
          Edward J. Yoon added a comment - Hmm, Should vmArgs moved to java code, too?
          Hide
          Doug Cutting added a comment -

          Ideally there should be no per-command information in bin/hadoop. When one adds or alters a command, one should only have to edit one file. So the OPTS stuff should move into Java too, as should all the help strings.

          Show
          Doug Cutting added a comment - Ideally there should be no per-command information in bin/hadoop. When one adds or alters a command, one should only have to edit one file. So the OPTS stuff should move into Java too, as should all the help strings.
          Hide
          Edward J. Yoon added a comment -

          Thank you. BTW, How does this "$HADOOP_NAMENODE_OPTS" work?
          I can't find the related source code with it.

          if [ "$COMMAND" = "namenode" ] ; then
            CLASS='org.apache.hadoop.dfs.NameNode'
            HADOOP_OPTS="$HADOOP_OPTS $HADOOP_NAMENODE_OPTS"
          
          Show
          Edward J. Yoon added a comment - Thank you. BTW, How does this "$HADOOP_NAMENODE_OPTS" work? I can't find the related source code with it. if [ "$COMMAND" = "namenode" ] ; then CLASS='org.apache.hadoop.dfs.NameNode' HADOOP_OPTS= "$HADOOP_OPTS $HADOOP_NAMENODE_OPTS"
          Mukund Madhugiri made changes -
          Fix Version/s 0.18.0 [ 12312972 ]
          Hide
          Harsh J added a comment -

          I think a better approach would be to observe java's return code and throw out an appropriate message? Apart from making the 'feature' of running classes from 'hadoop classpath' via 'hadoop clazz' more visible via docs/helpstrings.

          Show
          Harsh J added a comment - I think a better approach would be to observe java's return code and throw out an appropriate message? Apart from making the 'feature' of running classes from 'hadoop classpath' via 'hadoop clazz' more visible via docs/helpstrings.
          Allen Wittenauer made changes -
          Link This issue is related to HADOOP-9902 [ HADOOP-9902 ]
          Allen Wittenauer made changes -
          Link This issue is related to HADOOP-9902 [ HADOOP-9902 ]
          Allen Wittenauer made changes -
          Link This issue is related to HADOOP-11092 [ HADOOP-11092 ]
          Hide
          Allen Wittenauer added a comment -

          I'm going to close this as won't fix. A few reasons:

          1) Users actually really like being able to use arbitrary classes on the CLI.

          2) As the hadoop code base gets more complex, it's going to be an increasingly long time to analyze the classpath to see if a method exists.

          3) A lot of recent changes in shell code in trunk/3.x and the requirement that classes passed need to belong to a package have greatly reduced the amount of times this particular problem is hit.

          Show
          Allen Wittenauer added a comment - I'm going to close this as won't fix. A few reasons: 1) Users actually really like being able to use arbitrary classes on the CLI. 2) As the hadoop code base gets more complex, it's going to be an increasingly long time to analyze the classpath to see if a method exists. 3) A lot of recent changes in shell code in trunk/3.x and the requirement that classes passed need to belong to a package have greatly reduced the amount of times this particular problem is hit.
          Allen Wittenauer made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Won't Fix [ 2 ]

            People

            • Assignee:
              Edward J. Yoon
              Reporter:
              Tsz Wo Nicholas Sze
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development