Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Release Note:
      CmdShell - An interactive command-line shell for Hadoop's "Fs" utilities

      Description

      A shell that allows the user to execute multiple filesystem operations in a single JVM instance at a prompt.

      1. HADOOP-6541.2.patch
        37 kB
        Aaron Kimball
      2. HADOOP-6541.3.patch
        38 kB
        Aaron Kimball
      3. HADOOP-6541.4.patch
        35 kB
        Aaron Kimball
      4. HADOOP-6541.5.patch
        35 kB
        Aaron Kimball
      5. HADOOP-6541.6.patch
        35 kB
        Todd Lipcon
      6. HADOOP-6541.patch
        38 kB
        Aaron Kimball

        Issue Links

          Activity

          Konstantin Shvachko added a comment -

          Should we close this as a duplicate of Impala?
          Eli has been on a cleanup spree lately.

          JLine sounds like a good idea.
          Be aware of licensing issues, though.

          Chris Beavers added a comment -

          Haha, a buddy of mine and I just wrote a shell supporting (for now) just pwd, ls, and cd and using JLine. This was, of course, done in the hours before I found this page. For what it's worth, the project's up at https://github.com/cbeavz/hadoosh

          Eli Collins added a comment -

          There's an interactive Hadoop shell being developed here: https://github.com/SpringSource/impala

          Dheeraj Kapur added a comment -

          Working on an interactive shell with bash-like scripting support. It's in early stages as of now.

          hshell> cd /user/dheerajk;pwd; ls -l
          /user/dheerajk

          /user/dheerajk :
          drwx------ - dheerajk users 0 2012-03-20 18:00 /user/dheerajk/.Trash
          drwx------ - dheerajk users 0 2012-02-23 10:30 /user/dheerajk/data
          -rw------- 3 dheerajk users 0 2012-03-20 11:27 /user/dheerajk/test
          drwx------ - dheerajk users 0 2012-03-21 06:51 /user/dheerajk/test1

          hshell> pwd; cd ../C49/../user/dheerajk;pwd
          /user/
          /user/dheerajk

          hshell> find . -depth 7 -name pow.*
          /user/dheerajk/data/pow

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12475642/HADOOP-6541.6.patch
          against trunk revision 1094750.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The applied patch generated 1071 javac compiler warnings (more than the trunk's current 1070 warnings).

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/366//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/366//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/366//console

          This message is automatically generated.

          Todd Lipcon added a comment -

          Here are some comments from the user perspective (didn't look at code yet)

          • should have some kind of help command - either "help", or "?"
          • I think "hadoop fs -shell" makes more sense than "hadoop shell", since this is specifically for interacting with filesystems
          • The following command threw a RuntimeException and exited:
            hadoop> cd hdfs://monster01.sf.cloudera.com:8020
            org.apache.hadoop.HadoopIllegalArgumentException: Unsupported name: has scheme but relative path-part
                    at org.apache.hadoop.fs.FileContext.checkNotSchemeWithRelative(FileContext.java:264)
                    at org.apache.hadoop.fs.FileContext.setWorkingDirectory(FileContext.java:458)
                    at org.apache.hadoop.shell.CmdShell.cd(CmdShell.java:446)
                    at org.apache.hadoop.shell.CmdShell.interpretCmd(CmdShell.java:693)
            
          • IOExceptions shouldn't include their full backtrace unless we're in debug mode. E.g., "rmr asdfasdf" currently shows the full stack trace instead of just the simple error message.
          • "cd" without a path goes to the home directory, which seems right. But if the home directory doesn't exist, it seems to go to "/", which is unexpected behavior. I would expect a FileNotFoundException.
          • "chmod", "chown", "chgrp", "count", and "du" don't seem to be found.
          • Every command I type seems to be echoed back to me.
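
          The `cd hdfs://host:port` failure in the list above comes from a URI that has a scheme but an empty path. A minimal sketch of one possible guard follows, assuming the shell is free to rewrite the argument before handing it to FileContext. The class and method names (`CdTarget`, `normalize`) are hypothetical and not part of the patch; only `java.net.URI` from the JDK is used.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not taken from the HADOOP-6541 patch.
class CdTarget {
    /**
     * Returns the argument unchanged unless it names a filesystem
     * (scheme plus authority) with an empty path, in which case the
     * root path "/" is appended so the URI is no longer
     * "scheme with relative path".
     */
    static String normalize(String arg) {
        try {
            URI uri = new URI(arg);
            String path = uri.getPath();
            boolean emptyPath = (path == null || path.isEmpty());
            if (uri.getScheme() != null && uri.getAuthority() != null && emptyPath) {
                return new URI(uri.getScheme(), uri.getAuthority(), "/", null, null).toString();
            }
        } catch (URISyntaxException e) {
            // Not a well-formed URI (for example, a path containing spaces);
            // leave it for the filesystem layer to interpret.
        }
        return arg;
    }

    public static void main(String[] args) {
        System.out.println(normalize("hdfs://monster01.sf.cloudera.com:8020"));
        System.out.println(normalize("/user/todd"));
    }
}
```

          Whether the shell should silently go to "/" or reject the argument with a short error message is a design decision this sketch does not settle.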

          Future enhancements to track in a separate JIRA:

          • "lcd", "lpwd" would be useful.
          • a "head" command would be very useful
          • "cd -" like in bash would be cool
          • showing the cwd in the prompt would be handy
          • tab completion would be handy
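
          Two of the suggestions above, "cd -" and showing the cwd in the prompt, need only a small amount of state. The following is a rough sketch under stated assumptions: the class name `WorkingDirTracker` and the prompt format `hadoop:<cwd>> ` are invented for illustration, and targets are assumed to be already-resolved absolute paths that the caller has verified to exist.

```java
// Hypothetical sketch, not taken from the HADOOP-6541 patch.
class WorkingDirTracker {
    private String current;
    private String previous; // null until the first successful cd

    WorkingDirTracker(String start) {
        this.current = start;
    }

    /** Handles "cd <target>"; the target "-" swaps back to the previous directory. */
    String cd(String target) {
        if ("-".equals(target)) {
            if (previous == null) {
                throw new IllegalStateException("cd -: no previous directory");
            }
            target = previous;
        }
        previous = current;
        current = target;
        return current;
    }

    /** Builds a prompt that includes the current working directory. */
    String prompt() {
        return "hadoop:" + current + "> ";
    }

    public static void main(String[] args) {
        WorkingDirTracker dirs = new WorkingDirTracker("/user/todd");
        dirs.cd("/tmp");
        dirs.cd("-");
        System.out.print(dirs.prompt());
    }
}
```

          Repeating "cd -" toggles between the two most recent directories, which matches bash behavior.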
          Todd Lipcon added a comment -

          Looks like HADOOP-6541.5.patch is a -p1 patch instead of -p0. Uploading a p0 copy of the same patch so that QA bot can run.

          Aaron Kimball added a comment -

          Finally had time to get to this. Patch #5 resync'd with trunk.

          Aaron Kimball added a comment -

          I'll take a look when I have some time soon

          Todd Lipcon added a comment -

          Hey Aaron. I've always liked this feature and it would be cool to finally get it in. Would you mind updating the patch for trunk?

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12435198/HADOOP-6541.4.patch
          against trunk revision 1071364.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/279//console

          This message is automatically generated.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12435198/HADOOP-6541.4.patch
          against trunk revision 1031422.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/62//console

          This message is automatically generated.

          Jeff Hammerbacher added a comment -

          [Hmm. I wonder how much duplication there is with grunt.]

          We all asked this question. Many users only have Common + HDFS + MR, and it was a Hackathon, and the feature was deemed useful enough by all involved to move forward.

          It would be nice to have a general shell in Common, or as a separate Apache project; in fact, Carl Steinbach proposed such a thing over at https://issues.apache.org/jira/browse/HIVE-987.

          Allen Wittenauer added a comment -

          OK, Thanks.

          [Hmm. I wonder how much duplication there is with grunt.]

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12435198/HADOOP-6541.4.patch
          against trunk revision 907549.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/343/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/343/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/343/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/343/console

          This message is automatically generated.

          Aaron Kimball added a comment -

          I think the -1 javadoc warning was based on javadoc errors in FsShell. Here is a new patch that corrects these.

          Aaron Kimball added a comment -

          Allen,

          You can indeed shell escape with ! command. No notion of a local working dir, unfortunately. (Next task: lcd and lls commands!)

          Currently you can't launch a Hadoop job within this shell. I think that's another patch / separate issue – dealing with the classloader for a jar has enough subtleties associated with it that I punted for the first version. Eventually this shell should be able to do all the hadoop jar, hadoop job, mradmin/dfsadmin etc commands too, but there's a lot of functionality to implement here. All separate tasks in my mind.

          In order to establish this code base, my first cut of this system is just focused on implementing filesystem access.

          You're right that 'rmr' is a hack. A better argument parsing system is probably necessary moving forward to support things like 'rm -[rf]', 'test', etc. Right now all arguments are treated as filenames in commands like rm; supporting various flags would require using CommandLineParser and other fun. Not impossible, but not first-cut material. Also, jline's tab-completor makes it a bit tricky to handle command-specific -arguments. I'll need to poke around more to figure out if/how that can be done.
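
          To illustrate the 'rm -[rf]' idea, here is a small hand-rolled sketch of separating flags from filenames. It is a stand-in written for illustration, not the CommandLineParser approach mentioned above and not code from the patch. The class name `RmArgs` is hypothetical, and the sketch assumes POSIX-style conventions: flags come before paths, and "--" ends flag parsing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not taken from the HADOOP-6541 patch.
class RmArgs {
    boolean recursive;
    boolean force;
    final List<String> paths = new ArrayList<>();

    /** Splits the arguments of an "rm" command into flags and paths. */
    static RmArgs parse(String... args) {
        RmArgs out = new RmArgs();
        boolean flagsDone = false;
        for (String arg : args) {
            if (!flagsDone && arg.equals("--")) {
                flagsDone = true; // everything after "--" is a path
            } else if (!flagsDone && arg.length() > 1 && arg.startsWith("-")) {
                // Allow combined flags such as "-rf".
                for (char c : arg.substring(1).toCharArray()) {
                    switch (c) {
                        case 'r': out.recursive = true; break;
                        case 'f': out.force = true; break;
                        default:
                            throw new IllegalArgumentException("rm: unknown flag -" + c);
                    }
                }
            } else {
                flagsDone = true; // the first path ends flag parsing
                out.paths.add(arg);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        RmArgs parsed = parse("-rf", "/tmp/scratch");
        System.out.println(parsed.recursive + " " + parsed.force + " " + parsed.paths);
    }
}
```

          With parsing like this, 'rmr path' could become an alias for 'rm -r path', which would keep the existing command working.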

          As for cd taking a URL – technically it already does. But the current design initializes a FileContext and uses its setWorkingDirectory() method to move around, and then uses FsShell to do some operations like deletes (which require globbing). FsShell doesn't allow reinitialization of the FileSystem, so this code checks for changes in the default FS and reinitializes the structures it uses. We should probably refactor FsShell soon to allow it to work on an arbitrary FileContext instead of the one it instantiates at construction time (or instantiate a new FsShell for each operation? That seems unnecessary). But again, refactoring FsShell I think is beyond the scope of this issue.

          All of these are great ideas but I think that they deserve separate tasks. There are enough discussion points as to how to implement each of these that I think a single thread would be confusing. Also getting this committed first would allow different people to attack these problems in parallel if they'd like. I'll file these follow-up tasks after this is resolved. Of course, feel free to file your own in the meantime if you'd like.

          Thanks for the suggestions!

          Allen Wittenauer added a comment -

          Cool. It looks like it has a shell escape.

          How does one launch a job from within the shell?

          Should rm be smart enough to take r and f as parameters? [The fact that rmr is a separate command has always been a mistake in my mind.]

          Should there be a test operator? This would be especially useful for -f functionality.

          set fs.defaultFS=hdfs://nn.example.com/ — feels like shades of VMS. Any reason cd couldn't take a URL?

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12435049/HADOOP-6541.3.patch
          against trunk revision 906388.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 14 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/338/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/338/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/338/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/338/console

          This message is automatically generated.

          Aaron Kimball added a comment -

          New patch to nail down some findbugs warnings.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12435043/HADOOP-6541.2.patch
          against trunk revision 906388.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 14 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 5 new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/336/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/336/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/336/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/336/console

          This message is automatically generated.

          Jakob Homan added a comment -

          Haven't looked at the code, but wanted to amend my previous comment to include the caveat that whatever changes are made cannot in any way change the behavior of the current commands, lest we break backwards compatibility. I don't see why this would happen, but such a restriction would help ease any not-yet-expressed resistance.

          Aaron Kimball added a comment -

          Here's a new patch that puts this code in org.apache.hadoop.shell. Running 'bin/hadoop shell' will get you to the prompt.

          Aaron Kimball added a comment -

          Jakob,

          I'm happy to put this in mainline; I just figured that contrib might be a path of less resistance but if there's momentum for it, I'll put together a patch that integrates it a bit more closely, and post that in this module's place.

          There's relatively little (if any?) code duplication though. This actually calls methods of FsShell to do operations like move, rm, ls, etc, which already handle globbing on their own. The vast majority of code in this module is specific to interacting with the user, parsing arguments, tab completion, maintaining 'exit status' from commands, etc.

          FsShell does deserve a pretty broad refactoring, but I think that's out-of-scope for this issue.

          Allen Wittenauer added a comment -

          +1 on stopping the contrib madness for things that should just be in mainline (I'm looking at you, scheduler people).

          Jakob Homan added a comment -

          I really like this idea and jline would be a good way to do it, but I'm not sure a separate contrib module is the way to go. FsShell is definitely due for a refactoring/improvement, and this might be the opportunity to do it. Aaron's comments re: speed issues are correct. In addition, there's no getting around that our current command line tools are rather clunky and could be improved within a shell context. This is a pain point for our users, both new and experienced. But since this is true, why not go all the way and improve FsShell with these features rather than creating a contrib module that has some code duplication and will have to be maintained separately?

          Todd Lipcon added a comment -

          +1 on the idea, though I have not looked at the code. I saw it in action on Aaron's computer and it seems very useful.

          Aaron Kimball added a comment -

          Alex Loddengaard and I wrote a simple command shell built on jline (a readline-like line-editing library for Java). It includes some handy features like tab completion for HDFS and gives interactive access to the various FsShell commands.

          There's a definite need for something like this - executing several hadoop fs ... commands in a bash script or in a regular interactive shell is very time-consuming due to the overhead of starting Java for each such command. Putting all the commands into a single JVM instance is a major win.

          When using Hadoop from outside of Java, programs that wish to interact with the DFS may need to run several commands through the hadoop fs ... interface, which is very slow.

          This shell:

          • Supports interactive use of HDFS (with a reasonable notion of a "current directory", etc)
          • Supports executing scripts of several commands

          External programs could write a script of several DFS operations and batch them up for execution in a single Java process.
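
          As an illustration of the "current directory" idea in the list above, a shell can keep one absolute path as state and resolve every relative argument against it before handing it to the filesystem. The sketch below is an assumption about how that might look, not the patch's code. CwdResolver and resolve are hypothetical names, and a real implementation would more likely lean on Hadoop's own Path class than on string handling.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper, not taken from the patch.
class CwdResolver {
    // Resolves 'path' against an absolute 'cwd', collapsing "." and "..".
    static String resolve(String cwd, String path) {
        // Absolute paths ignore the current directory entirely.
        String full = path.startsWith("/") ? path : cwd + "/" + path;
        Deque<String> parts = new ArrayDeque<>();
        for (String part : full.split("/")) {
            if (part.isEmpty() || part.equals(".")) {
                continue; // skip duplicate slashes and "."
            } else if (part.equals("..")) {
                parts.pollLast(); // ".." at the root stays at the root
            } else {
                parts.addLast(part);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append('/').append(part);
        }
        return sb.length() == 0 ? "/" : sb.toString();
    }

    public static void main(String[] args) {
        String cwd = "/user/aaron";
        cwd = resolve(cwd, "data/../logs"); // what a 'cd data/../logs' would do
        System.out.println(cwd);
        System.out.println(resolve(cwd, "/tmp/./x"));
    }
}
```

          With this in place, a cd command only has to update the stored path, and commands such as ls or rm can accept relative arguments without the user retyping the full prefix.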

          This changes some of the methods (e.g., ls()) from FsShell to be public instead of package-private so that their code can be reused here.

          We tested CmdShell by running the various commands locally and by running short scripts that combine them. Some basic functionality unit tests are also included in this patch.

          If a committer could please re-open this issue and take a look at the attached code, I'd appreciate it.


            People

            • Assignee:
              Unassigned
              Reporter:
              Aaron Kimball
            • Votes:
              1
              Watchers:
              20
