Flume
  1. Flume
  2. FLUME-1661

ExecSource cannot execute complex *nix commands

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: v1.2.0
    • Fix Version/s: v1.4.0
    • Component/s: Sinks+Sources
    • Labels:
      None
    • Release Note:
      Added new optional config directive 'shell' for Exec source. This enables support for commands that use wildcards, backticks, pipes and other such shell features.

      Description

      • command line parsing
        • conf/flume.conf
          agent.sources.source1.type = exec
          agent.sources.source1.command = tail -f /some/path/logs/exception/error.log.`date +%Y%m%d%H`
          
        • result
          tail: /some/path/logs/exception/error.log.`date: No such file or directory
          tail: +%Y%m%d%H`: No such file or directory
          
        • needs to be improved
          (ExecSouce.java:242) String[] commandArgs = command.split("\\s+") 
          
      • using special character (e.g. *, `, ', ...)
        • conf/flume.conf
          agent.sources.source1.type = exec
          agent.sources.source1.command = tail -f /some/path/logs/exception/error.log.*
          
        • result
          tail: /some/path/logs/exception/error.log.*: No such file or directory
          
        • needs to be improved
          (ExecSouce.java:243) process = new ProcessBuilder(commandArgs).start();
          
      1. FLUME-1661-1.patch
        0.7 kB
        Daisuke Kobayashi
      2. FLUME-1661.patch
        12 kB
        Roshan Naik
      3. FLUME-1661.patch.v2
        12 kB
        Roshan Naik
      4. FLUME-1661.patch.v3
        14 kB
        Roshan Naik
      5. FlumeProcessRunner.tar.gz
        3 kB
        Nitin Verma
      6. FLUME-1661.patch.v4
        14 kB
        Roshan Naik
      7. 1661.patch.v5
        14 kB
        Roshan Naik
      8. 1661.patch.v6
        14 kB
        Roshan Naik
      9. 1661.patch.v7
        15 kB
        Roshan Naik
      10. 1661.patch.v8
        15 kB
        Roshan Naik
      11. 1661.patch.v9
        15 kB
        Roshan Naik

        Issue Links

          Activity

          Hide
          Hari Shreedharan added a comment - - edited

          The problem here is that the parsing of arguments in the exec source is not very smart. The work around I usually use is to create a wrapper shell script around the original command(s) and then call the script from exec source.

          Show
          Hari Shreedharan added a comment - - edited The problem here is that the parsing of arguments in the exec source is not very smart. The work around I usually use is to create a wrapper shell script around the original command(s) and then call the script from exec source.
          Hide
          Brock Noland added a comment -

          Yeah, as opposed to implementing bash word splitting I think users should just create a shell script wrapper.

          Show
          Brock Noland added a comment - Yeah, as opposed to implementing bash word splitting I think users should just create a shell script wrapper.
          Hide
          Daisuke Kobayashi added a comment -

          Uploaded a patch. I think it's able to handle any complicated *nix commands, but can you review if this could be acceptable?

          Show
          Daisuke Kobayashi added a comment - Uploaded a patch. I think it's able to handle any complicated *nix commands, but can you review if this could be acceptable?
          Hide
          Brock Noland added a comment -

          If were are going to do something like this, I think we should add a number of tests to make sure it works as expected.

          Show
          Brock Noland added a comment - If were are going to do something like this, I think we should add a number of tests to make sure it works as expected.
          Hide
          Roshan Naik added a comment -

          +1. It is definitely nice to be able to specify the full command directly without the need for wrappers.

          Show
          Roshan Naik added a comment - +1. It is definitely nice to be able to specify the full command directly without the need for wrappers.
          Hide
          Brock Noland added a comment - - edited

          I think we should do the following:

          1) Write the command to file
          2) Execute the file with bash (without the -c option as there is a race condition with writing to a file and then executing it when you have other possible threads forking and as such having that file descriptor open)
          3) Test to make sure this works, unit tests using echo `date +%Y%m%d%H` should be all that is needed

          Show
          Brock Noland added a comment - - edited I think we should do the following: 1) Write the command to file 2) Execute the file with bash (without the -c option as there is a race condition with writing to a file and then executing it when you have other possible threads forking and as such having that file descriptor open) 3) Test to make sure this works, unit tests using echo `date +%Y%m%d%H` should be all that is needed
          Hide
          Jarek Jarcec Cecho added a comment -

          Quick question:

          Is it really needed to hard code "/bin/sh -c $(content of command property)" into ExecSource (and thus limit it's options at least on theoretical level) when user can actually specified that in the configuration file when needed? E.g. doing something like:

          agent.sources.source1.type = exec
          agent.sources.source1.command = /bin/sh -c tail -f /some/path/logs/exception/error.log.`date +%Y%m%d%H`
          

          (Please note that I did not test it myself, I'm just thinking)

          Jarcec

          Show
          Jarek Jarcec Cecho added a comment - Quick question: Is it really needed to hard code "/bin/sh -c $(content of command property)" into ExecSource (and thus limit it's options at least on theoretical level) when user can actually specified that in the configuration file when needed? E.g. doing something like: agent.sources.source1.type = exec agent.sources.source1.command = /bin/sh -c tail -f /some/path/logs/exception/error.log.`date +%Y%m%d%H` (Please note that I did not test it myself, I'm just thinking) Jarcec
          Hide
          Roshan Naik added a comment -

          Indeed, hardcoding /bin/sh won't be a good idea. That would make it platform specific. (Windows?)

          Show
          Roshan Naik added a comment - Indeed, hardcoding /bin/sh won't be a good idea. That would make it platform specific. (Windows?)
          Hide
          Roshan Naik added a comment -

          perhaps this call for an optional 'shell' config

          agent.sources.source1.type = exec
          agent.sources.source1.shell = /bin/sh -c
          agent.sources.source1.command = tail -f /path/log.`date +%Y%m%d%H` 2>/dev/null

          This could then be translated into :
          Runtime.exec(new String[]

          { "/bin/sh", "-c", "tail -f /path/log.`date +%Y%m%d%H` 2>/dev/null" }

          );

          If the optional 'shell' config is not set, we could default to existing behavior

          Show
          Roshan Naik added a comment - perhaps this call for an optional 'shell' config agent.sources.source1.type = exec agent.sources.source1.shell = /bin/sh -c agent.sources.source1.command = tail -f /path/log.`date +%Y%m%d%H` 2>/dev/null This could then be translated into : Runtime.exec(new String[] { "/bin/sh", "-c", "tail -f /path/log.`date +%Y%m%d%H` 2>/dev/null" } ); If the optional 'shell' config is not set, we could default to existing behavior
          Hide
          Brock Noland added a comment -

          Guys, this stuff is very brittle... For example, the following tests various commands which just by looking at them, you'd think would all output the same thing. They are executed the same way we start the command in exec source:

            public static void main(String[] args) throws Exception {
              test("/bin/sh -c echo `date +%Y%m%d%H`");  
              test("echo `date +%Y%m%d%H`");    
              test("date +%Y%m%d%H");    
              test("/bin/sh -c date +%Y%m%d%H");
            }
            
            static void test(String cmd) throws Exception {
              ProcessBuilder builder = new ProcessBuilder(cmd.split("\\s+"));
              Process process = builder.start();
              BufferedReader out = new BufferedReader(
                  new InputStreamReader(process.getInputStream()));
              BufferedReader err = new BufferedReader(
                  new InputStreamReader(process.getErrorStream()));
              Thread.sleep(10L);
              System.out.println("cmd = " + cmd);
              System.out.println("out = '" + readAll(out) + "'");
              System.out.println("err = '" + readAll(err) + "'");
              System.out.println("exit = " + process.waitFor());
            }
            static String readAll(BufferedReader reader) throws IOException {
              String result = "";
              String line;
              while((line = reader.readLine()) != null) {
                result += line + "\n";
              }
              return result;
            }
          

          Yet all four output very different things:

          cmd = /bin/sh -c echo `date +%Y%m%d%H`
          out = '
          '
          err = ''
          exit = 0
          cmd = echo `date +%Y%m%d%H`
          out = '`date +%Y%m%d%H`
          '
          err = ''
          exit = 0
          cmd = date +%Y%m%d%H
          out = '2012102219
          '
          err = ''
          exit = 0
          cmd = /bin/sh -c date +%Y%m%d%H
          out = 'Mon Oct 22 19:11:59 CDT 2012
          '
          err = ''
          exit = 0
          

          I could go into why but I hope we don't have to go there. Building shell commands in java is a fools game. We can simply take whatever command is passed to us, write it to a file, and let the shell interpret it. This gives the functionality asked for without wasting ours and our users time figuring how to correctly quote commands which will be executed by process builder.

          Show
          Brock Noland added a comment - Guys, this stuff is very brittle... For example, the following tests various commands which just by looking at them, you'd think would all output the same thing. They are executed the same way we start the command in exec source: public static void main( String [] args) throws Exception { test( "/bin/sh -c echo `date +%Y%m%d%H`" ); test( "echo `date +%Y%m%d%H`" ); test( "date +%Y%m%d%H" ); test( "/bin/sh -c date +%Y%m%d%H" ); } static void test( String cmd) throws Exception { ProcessBuilder builder = new ProcessBuilder(cmd.split( "\\s+" )); Process process = builder.start(); BufferedReader out = new BufferedReader( new InputStreamReader(process.getInputStream())); BufferedReader err = new BufferedReader( new InputStreamReader(process.getErrorStream())); Thread .sleep(10L); System .out.println( "cmd = " + cmd); System .out.println( "out = '" + readAll(out) + "'" ); System .out.println( "err = '" + readAll(err) + "'" ); System .out.println( "exit = " + process.waitFor()); } static String readAll(BufferedReader reader) throws IOException { String result = ""; String line; while ((line = reader.readLine()) != null ) { result += line + "\n" ; } return result; } Yet all four output very different things: cmd = /bin/sh -c echo `date +%Y%m%d%H` out = ' ' err = '' exit = 0 cmd = echo `date +%Y%m%d%H` out = '`date +%Y%m%d%H` ' err = '' exit = 0 cmd = date +%Y%m%d%H out = '2012102219 ' err = '' exit = 0 cmd = /bin/sh -c date +%Y%m%d%H out = 'Mon Oct 22 19:11:59 CDT 2012 ' err = '' exit = 0 I could go into why but I hope we don't have to go there. Building shell commands in java is a fools game. We can simply take whatever command is passed to us, write it to a file, and let the shell interpret it. This gives the functionality asked for without wasting ours and our users time figuring how to correctly quote commands which will be executed by process builder.
          Hide
          Jarek Jarcec Cecho added a comment -

          Brock Noland Thank you for your test. I believe it's pretty obvious that it's not simple to do all sort of escaping that is required.

          I'm not sure that writing to a file and interpreting is a good idea as we would need writable location on local file system (/tmp/ should be always writable for everyone right?) and probably configuration to specify which shell should be used. I more like the solution proposed by Roshan Naik to have optional shell option that will change exec source behavior to execute given shell and pass entire command to it as an parameter.

          Jarcec

          Show
          Jarek Jarcec Cecho added a comment - Brock Noland Thank you for your test. I believe it's pretty obvious that it's not simple to do all sort of escaping that is required. I'm not sure that writing to a file and interpreting is a good idea as we would need writable location on local file system (/tmp/ should be always writable for everyone right?) and probably configuration to specify which shell should be used. I more like the solution proposed by Roshan Naik to have optional shell option that will change exec source behavior to execute given shell and pass entire command to it as an parameter. Jarcec
          Hide
          Brock Noland added a comment -

          Building up a command in java, by specifying a shell in the configuration file will have the same problem I describe above. I don't have a problem with writing to a file and then executing with a specific shell (defaults to bash) as specified via the configuration.

          FWWIW, the Hadoop tasktracker writes the commands to a temp file before starting a task for this same reason.

          Brock

          Show
          Brock Noland added a comment - Building up a command in java, by specifying a shell in the configuration file will have the same problem I describe above. I don't have a problem with writing to a file and then executing with a specific shell (defaults to bash) as specified via the configuration. FWWIW, the Hadoop tasktracker writes the commands to a temp file before starting a task for this same reason. Brock
          Hide
          Jarek Jarcec Cecho added a comment -

          I see your point Brock, thanks for your feedback.

          Reason why I liked the configuration option for shell more is that it would be fully backward compatible (no shell option ~ execute the command exactly as previous version) and open still open to possible windows support. But fact is that you might achieve the same with file approach as well. So I guess that I should be indifferent between those two options, but I like the shell option without tmp file little bit more as it won't require writable tmp directory

          Jarcec

          Show
          Jarek Jarcec Cecho added a comment - I see your point Brock, thanks for your feedback. Reason why I liked the configuration option for shell more is that it would be fully backward compatible (no shell option ~ execute the command exactly as previous version) and open still open to possible windows support. But fact is that you might achieve the same with file approach as well. So I guess that I should be indifferent between those two options, but I like the shell option without tmp file little bit more as it won't require writable tmp directory Jarcec
          Hide
          Brock Noland added a comment -

          I don't think that writing the command to a file and then executing that file via bash is going to cause problems in terms of backwards compatibility.

          Show
          Brock Noland added a comment - I don't think that writing the command to a file and then executing that file via bash is going to cause problems in terms of backwards compatibility.
          Hide
          Jarek Jarcec Cecho added a comment -

          Well I can imagine that it might break backward compatibility on systems where bash is not present or where it's not simple to get writable tmp directory. But I guess that those are pathological cases that we do not have to take into consideration.

          Jarcec

          Show
          Jarek Jarcec Cecho added a comment - Well I can imagine that it might break backward compatibility on systems where bash is not present or where it's not simple to get writable tmp directory. But I guess that those are pathological cases that we do not have to take into consideration. Jarcec
          Hide
          Brock Noland added a comment -

          Yeah I don't think a system without a working: File.createTempFile() would be able to run much these days.

          Sorry for the confusion, I said bash but I meant we should use /bin/sh which is a symlink to bash on the vast majority of systems.

          Show
          Brock Noland added a comment - Yeah I don't think a system without a working: File.createTempFile() would be able to run much these days. Sorry for the confusion, I said bash but I meant we should use /bin/sh which is a symlink to bash on the vast majority of systems.
          Hide
          Roshan Naik added a comment - - edited

          Brock the situation is not as tricky. Try the below code. Here test() works for all commands (if 'shell' config is specified). test2() works for simple commands (i.e current use cases). The assumption is that the "/bin/sh -c" comes from the 'shell' config.

              // invoke complex commands through shell  - when the 'shell' config is specified
              static void test(String cmd) throws Exception { 
                  Process process = Runtime.getRuntime().exec(new String[] { "/bin/sh", "-c", cmd });
          
                  BufferedReader out = new BufferedReader(
                          new InputStreamReader(process.getInputStream()));
                  Thread.sleep(10L);
                  System.out.println("cmd = " + cmd);
                  System.out.println("out = '" + readAll(out) + "'");
              }
          
              // invoke simple commands directly  - when the 'shell' config is NOT specified
              static void test2(String cmd) throws Exception {
                  Process process = Runtime.getRuntime().exec(cmd);  //we can also use ProcesBuilder
          
                  BufferedReader out = new BufferedReader(
                          new InputStreamReader(process.getInputStream()));
                  Thread.sleep(10L);
                  System.out.println("out2 = '" + readAll(out) + "'");
                  System.out.println();
              }
          
              static String readAll(BufferedReader reader) throws IOException {
                  String result = "";
                  String line;
                  while((line = reader.readLine()) != null) {
                      result += line + "\n";
                  }
                  return result;
              }
          
          Show
          Roshan Naik added a comment - - edited Brock the situation is not as tricky. Try the below code. Here test() works for all commands (if 'shell' config is specified). test2() works for simple commands (i.e current use cases). The assumption is that the "/bin/sh -c" comes from the 'shell' config. // invoke complex commands through shell - when the 'shell' config is specified static void test( String cmd) throws Exception { Process process = Runtime .getRuntime().exec( new String [] { "/bin/sh" , "-c" , cmd }); BufferedReader out = new BufferedReader( new InputStreamReader(process.getInputStream())); Thread .sleep(10L); System .out.println( "cmd = " + cmd); System .out.println( "out = '" + readAll(out) + "'" ); } // invoke simple commands directly - when the 'shell' config is NOT specified static void test2( String cmd) throws Exception { Process process = Runtime .getRuntime().exec(cmd); //we can also use ProcesBuilder BufferedReader out = new BufferedReader( new InputStreamReader(process.getInputStream())); Thread .sleep(10L); System .out.println( "out2 = '" + readAll(out) + "'" ); System .out.println(); } static String readAll(BufferedReader reader) throws IOException { String result = ""; String line; while ((line = reader.readLine()) != null ) { result += line + "\n" ; } return result; }
          Hide
          Roshan Naik added a comment -

          Forgot the main() ....

              public static void main(String[] args) throws Exception {
            
                  test("echo `date +%Y%m%d%H`");
                  test2("echo `date +%Y%m%d%H`");
                  System.out.println("---------------------------");
          
                  test("date +%Y%m%d%H");
                  test2("date +%Y%m%d%H");
                  System.out.println("---------------------------");
          
                  test("echo *");
                  test2("echo *");
                  System.out.println("---------------------------");
          
                  test("date +%Y%m%d%H");
                  test2("date +%Y%m%d%H");
                  System.out.println("---------------------------");
          
                  test("ls /tmp");
                  test2("ls /tmp");
          
              }
          
          

          If we are ok with this... I can go ahead and provide a patch based on this.

          Show
          Roshan Naik added a comment - Forgot the main() .... public static void main( String [] args) throws Exception { test( "echo `date +%Y%m%d%H`" ); test2( "echo `date +%Y%m%d%H`" ); System .out.println( "---------------------------" ); test( "date +%Y%m%d%H" ); test2( "date +%Y%m%d%H" ); System .out.println( "---------------------------" ); test( "echo *" ); test2( "echo *" ); System .out.println( "---------------------------" ); test( "date +%Y%m%d%H" ); test2( "date +%Y%m%d%H" ); System .out.println( "---------------------------" ); test( "ls /tmp" ); test2( "ls /tmp" ); } If we are ok with this... I can go ahead and provide a patch based on this.
          Hide
          Brock Noland added a comment - - edited

          Messing around with sub-processes is a well know source of bugs. As such, if we make this change there should be plenty of tests. I'd like to see at least the following in the test suite:

          date +%Y%m%d
          echo `date +%Y%m%d`
          for i in

          {0..5}

          ; do echo $i;done

          Show
          Brock Noland added a comment - - edited Messing around with sub-processes is a well know source of bugs. As such, if we make this change there should be plenty of tests. I'd like to see at least the following in the test suite: date +%Y%m%d echo `date +%Y%m%d` for i in {0..5} ; do echo $i;done
          Hide
          Roshan Naik added a comment -

          Added new config 'shell' for Exec Source. Patch includes Code, Tests and Documentation.

          Show
          Roshan Naik added a comment - Added new config 'shell' for Exec Source. Patch includes Code, Tests and Documentation.
          Hide
          Roshan Naik added a comment -

          This patch could potentially go into 1.3.

          Existing tests for ExecSource needed some minor refactoring to allow repeated changes to configuration(i.e command) and restarting of source.

          Includes test+code+doc changes

          https://reviews.apache.org/r/7748/

          Show
          Roshan Naik added a comment - This patch could potentially go into 1.3. Existing tests for ExecSource needed some minor refactoring to allow repeated changes to configuration(i.e command) and restarting of source. Includes test+code+doc changes https://reviews.apache.org/r/7748/
          Hide
          Jarek Jarcec Cecho added a comment -

          Linked review board is giving me "Permission denied". Roshan Naik], could you please verify that you've hit the "publish" button or that group "flume" is properly set?

          Jarcec

          Show
          Jarek Jarcec Cecho added a comment - Linked review board is giving me "Permission denied". Roshan Naik ], could you please verify that you've hit the "publish" button or that group "flume" is properly set? Jarcec
          Hide
          Roshan Naik added a comment -

          Have published the review. Please try again.

          Show
          Roshan Naik added a comment - Have published the review. Please try again.
          Hide
          Roshan Naik added a comment -

          Added more info in the user guide for the new 'shell' config in Exec source.

          Show
          Roshan Naik added a comment - Added more info in the user guide for the new 'shell' config in Exec source.
          Hide
          Mike Percy added a comment -

          Having been out of town for the past week I am just catching up, so sorry for not chiming in earlier.

          To be frank, I don't see the point of making exec source parse shell commands. Exec source should be one of the simplest sources we have, so please tell me what this added complexity buys us in terms of functionality. Wrapper scripts are common, easy, and well behaved.

          Show
          Mike Percy added a comment - Having been out of town for the past week I am just catching up, so sorry for not chiming in earlier. To be frank, I don't see the point of making exec source parse shell commands. Exec source should be one of the simplest sources we have, so please tell me what this added complexity buys us in terms of functionality. Wrapper scripts are common, easy, and well behaved.
          Hide
          Roshan Naik added a comment -

          Simplicity for users. It is nicer not to require yet another file to deployed to all flume agents for simple use cases.

          Deploying another file is reasonable when the command is going to be a multiline script.. but requiring it just because it is using some simple features like wildcards for instance, seems excessive.

          Show
          Roshan Naik added a comment - Simplicity for users. It is nicer not to require yet another file to deployed to all flume agents for simple use cases. Deploying another file is reasonable when the command is going to be a multiline script.. but requiring it just because it is using some simple features like wildcards for instance, seems excessive.
          Hide
          Mike Percy added a comment -

          OK, that's fair.

          One additional (non-technical) concern that have is related to Apache etiquette. An initial patch was submitted by Daisuke on October 20, and while some feedback was provided it seems that a new patch was submitted and the JIRA was assigned to someone else without consulting him.

          Daisuke Kobayashi: Did you want to keep working on this JIRA? Or would you prefer that someone else take over?

          Show
          Mike Percy added a comment - OK, that's fair. One additional (non-technical) concern that have is related to Apache etiquette. An initial patch was submitted by Daisuke on October 20, and while some feedback was provided it seems that a new patch was submitted and the JIRA was assigned to someone else without consulting him. Daisuke Kobayashi : Did you want to keep working on this JIRA? Or would you prefer that someone else take over?
          Hide
          Roshan Naik added a comment -

          yep .. indeed did not mean to violate etiquette when I asked if I could submit a patch.

          Show
          Roshan Naik added a comment - yep .. indeed did not mean to violate etiquette when I asked if I could submit a patch.
          Hide
          Mike Percy added a comment -

          Totally understood. I haven't heard a response from Daisuke so I assume he has lost interest in this ticket. Just wanted to make sure.

          Show
          Mike Percy added a comment - Totally understood. I haven't heard a response from Daisuke so I assume he has lost interest in this ticket. Just wanted to make sure.
          Hide
          Mike Percy added a comment -

          (adding Roshan's Review Board URL as a ticket link)

          Show
          Mike Percy added a comment - (adding Roshan's Review Board URL as a ticket link)
          Hide
          Daisuke Kobayashi added a comment -

          Actually I want to working on this JIRA, but I cannot catch up what's going on right now...
          Please don't care about taking over this JIRA.

          Show
          Daisuke Kobayashi added a comment - Actually I want to working on this JIRA, but I cannot catch up what's going on right now... Please don't care about taking over this JIRA.
          Hide
          Nitin Verma added a comment - - edited

          This is my thinking, we should invoke a shell not the command and let the shell do the parsing.
          We can call this as ShellExecSource, with user's choice of the shell.

          Following is an example of /bin/sh that tried with a command: "ls -la ; pwd \n `which cat` /etc/passwd >&2"
          and it works.

                    //String[] commandArgs = command.split("\\s+"); // let user-command be as is.
                    process = new ProcessBuilder("/bin/sh").start();
          
                    final OutputStream outputStream = process.getOutputStream();
                    outputStream.write(command.getBytes());
                    outputStream.write('\n');
                    outputStream.flush();
          
                    outputStream.close();
          

          This way we do not have to worry about all the shells

          Bourne shell (sh)
          Almquist shell (ash)
          Bourne-Again shell (bash)
          Debian Almquist shell (dash)
          Public domain Korn shell (pdksh)
          MirBSD Korn shell (mksh)
          Z shell (zsh)
          C shell (csh)
          TENEX C shell (tcsh)
          etc...

          Show
          Nitin Verma added a comment - - edited This is my thinking, we should invoke a shell not the command and let the shell do the parsing. We can call this as ShellExecSource, with user's choice of the shell. Following is an example of /bin/sh that tried with a command: "ls -la ; pwd \n `which cat` /etc/passwd >&2" and it works. // String [] commandArgs = command.split( "\\s+" ); // let user-command be as is. process = new ProcessBuilder( "/bin/sh" ).start(); final OutputStream outputStream = process.getOutputStream(); outputStream.write(command.getBytes()); outputStream.write('\n'); outputStream.flush(); outputStream.close(); This way we do not have to worry about all the shells Bourne shell (sh) Almquist shell (ash) Bourne-Again shell (bash) Debian Almquist shell (dash) Public domain Korn shell (pdksh) MirBSD Korn shell (mksh) Z shell (zsh) C shell (csh) TENEX C shell (tcsh) etc...
          Hide
          Roshan Naik added a comment -

          Nitin, thats a potential alternative but has one issue. If the command exits, the shell will not exit.

          The proposed patch works with any shell and lets the shell handle the parsing.

          Show
          Roshan Naik added a comment - Nitin, thats a potential alternative but has one issue. If the command exits, the shell will not exit. The proposed patch works with any shell and lets the shell handle the parsing.
          Hide
          Roshan Naik added a comment -

          Incorporating Mike Percy's comments for adding test case to cover commands with embedded quotes etc and to have them read from a file to simulate reading from the flume config file

          Show
          Roshan Naik added a comment - Incorporating Mike Percy's comments for adding test case to cover commands with embedded quotes etc and to have them read from a file to simulate reading from the flume config file
          Hide
          Roshan Naik added a comment -

          Revising patch v3 (changing path of input file to src/test/resources/test_command.txt).

          Show
          Roshan Naik added a comment - Revising patch v3 (changing path of input file to src/test/resources/test_command.txt).
          Hide
          Nitin Verma added a comment -

          Roshan, please try the following command

          for i in `seq 100`; do echo $i; done | awk '

          { print $1 + 1 }

          '

          Show
          Nitin Verma added a comment - Roshan, please try the following command for i in `seq 100`; do echo $i; done | awk ' { print $1 + 1 } '
          Hide
          Nitin Verma added a comment - - edited

          Please refer: http://pastebin.com/9ZHNkpum (Do see it has the code I am going to talk about.)

          I feel ExecSource should not carry too much code for process handling, for separation of concern I have created a process handling code.

          Features:-
          1) User can synchronously or asynchronously read the stdout or stderr of the process.
          For reading asynchronously user needs to create or use an implementation of InputStreamRunner interface.

          2) Exit codes are well propagated from script -> shell -> JVM

          3) With use of shell's stdin any shell command or even scripts can be run.

          4) FlumeProcessExecutor can be used for ExecSource and ShellExecSource.

          I believe this code belongs to flume-ng-sdk and be called from flume-ng-core.

          Test cases:-

          Following are some tests I did.

          1) To test exit code propagation by exiting from inside a shell.

          2) To test sequence of command on /bin/sh
          perl -e 'print \"\tHello perl!!!\n\"' ;
          mkdir -p /tmp/.flume-unit-test ; cd /tmp/.flume-unit-test ; ls -la >&2 ;
          cat /etc/passwd | awk -F: 'BEGIN

          {print \"Hello AWK\"} {print $NF}' | sort | uniq -c >&2 ;
          exit 101

          3) Tested another flavour of shell /bin/tcsh
          package edu.nitin.testcodes;
          
          import java.io.IOException;
          import org.testng.Assert;
          import org.testng.annotations.Test;
          
          public class FlumeProcessExecutorTest {
          
              @Test
              public void testProcess() throws ProcessExecutionException, IOException, InterruptedException {
          
                  // Exit code test with /bin/sh
                  {
                      System.out.println("Running...");
                      final int exitWith = 174;
                      final FlumeProcessExecutor flumeProcessExecutor =
                              new FlumeProcessExecutor("sh");
                      flumeProcessExecutor.start();
                      flumeProcessExecutor.getStandardInput().write(("exit " + exitWith).getBytes());
                      flumeProcessExecutor.getStandardInput().close();
                      int exitCode = Integer.MIN_VALUE;
                      try {
                          exitCode = flumeProcessExecutor.waitFor();
                      } catch (final ProcessExecutionException pee) {
                          pee.printStackTrace();
                          try {
                              exitCode = flumeProcessExecutor.exitCode();
                          } catch (final ProcessExecutionException pee2) {
                              pee2.printStackTrace();
                              flumeProcessExecutor.stop();
                          }
                      }
                      Assert.assertEquals(exitCode, exitWith);
                      System.out.println("...Done");
                  }
          
          
                  //invoking the shell (sh) 
                  {
                      System.out.println("Running...");
                      final FlumeProcessExecutor flumeProcessExecutor = new FlumeProcessExecutor("sh");
                      flumeProcessExecutor.start();
          
          



          /Jira java code rendering plug-in is not able to parse this section of code thus commented/
          /To be clear following line gets execute in the test case/

          // flumeProcessExecutor.getStandardInput().write(("perl -e 'print \"
          tHello perl!!!\\n\"' ; "
          // + " mkdir -p /tmp/.flume-unit-test ; cd /tmp/.flume-unit-test ; ls -la >&2 ; "
          // + " cat /etc/passwd | awk -F: 'BEGIN {print "Hello AWK"}

          {print $NF}

          ' | sort | uniq -c >&2 ;"
          // + " \n exit 101").getBytes());

                      flumeProcessExecutor.getStandardInput().close();
                      int exitCode = Integer.MIN_VALUE;
                      try {
                          exitCode = flumeProcessExecutor.waitFor();
                      } catch (final ProcessExecutionException pee) {
                          pee.printStackTrace();
                          try {
                              System.out.println("tring exitCode");
                              exitCode = flumeProcessExecutor.exitCode();
                          } catch (final ProcessExecutionException pee2) {
                              pee2.printStackTrace();
                              flumeProcessExecutor.stop();
                          }
                      }
                      System.out.println("...Done with " + exitCode);
                  }
          
                  //invoking tcsh
                  {
                      System.out.println("Running...");
                      final FlumeProcessExecutor flumeProcessExecutor =
                              new FlumeProcessExecutor("tcsh");
                      flumeProcessExecutor.start();
                      flumeProcessExecutor.getStandardInput().write("set array = (a b c) && echo $array".getBytes());
                      flumeProcessExecutor.getStandardInput().close();
                      int exitCode = Integer.MIN_VALUE;
                      try {
                          exitCode = flumeProcessExecutor.waitFor();
                      } catch (final ProcessExecutionException pee) {
                          pee.printStackTrace();
                          try {
                              System.out.println("tring exitCode");
                              exitCode = flumeProcessExecutor.exitCode();
                          } catch (final ProcessExecutionException pee2) {
                              pee2.printStackTrace();
                              flumeProcessExecutor.stop();
                          }
                      }
                      Assert.assertEquals(exitCode, 0);
                      System.out.println("...Done");
                  }
              }
          }
          

          edited because of Jira formatting issues.

          Show
          Nitin Verma added a comment - - edited Please refer: http://pastebin.com/9ZHNkpum (Do see it has the code I am going to talk about.) I feel ExecSource should not carry too much code for process handling, for separation of concern I have created a process handling code. Features:- 1) User can synchronously or asynchronously read the stdout or stderr of the process. For reading asynchronously user needs to create or use an implementation of InputStreamRunner interface. 2) Exit codes are well propagated from script -> shell -> JVM 3) With use of shell's stdin any shell command or even scripts can be run. 4) FlumeProcessExecutor can be used for ExecSource and ShellExecSource. I believe this code belongs to flume-ng-sdk and be called from flume-ng-core. Test cases:- Following are some tests I did. 1) To test exit code propagation by exiting from inside a shell. 2) To test sequence of command on /bin/sh perl -e 'print \"\tHello perl!!!\n\"' ; mkdir -p /tmp/.flume-unit-test ; cd /tmp/.flume-unit-test ; ls -la >&2 ; cat /etc/passwd | awk -F: 'BEGIN {print \"Hello AWK\"} {print $NF}' | sort | uniq -c >&2 ; exit 101 3) Tested another flavour of shell /bin/tcsh package edu.nitin.testcodes; import java.io.IOException; import org.testng.Assert; import org.testng.annotations.Test; public class FlumeProcessExecutorTest { @Test public void testProcess() throws ProcessExecutionException, IOException, InterruptedException { // Exit code test with /bin/sh { System .out.println( "Running..." ); final int exitWith = 174; final FlumeProcessExecutor flumeProcessExecutor = new FlumeProcessExecutor( "sh" ); flumeProcessExecutor.start(); flumeProcessExecutor.getStandardInput().write(( "exit " + exitWith).getBytes()); flumeProcessExecutor.getStandardInput().close(); int exitCode = Integer .MIN_VALUE; try { exitCode = flumeProcessExecutor.waitFor(); } catch ( final ProcessExecutionException pee) { pee.printStackTrace(); try { exitCode = flumeProcessExecutor.exitCode(); } catch ( final ProcessExecutionException pee2) { pee2.printStackTrace(); flumeProcessExecutor.stop(); } } Assert.assertEquals(exitCode, exitWith); System .out.println( "...Done" ); } //invoking the shell (sh) { System .out.println( "Running..." ); final FlumeProcessExecutor flumeProcessExecutor = new FlumeProcessExecutor( "sh" ); flumeProcessExecutor.start(); / Jira java code rendering plug-in is not able to parse this section of code thus commented / / To be clear following line gets execute in the test case / // flumeProcessExecutor.getStandardInput().write(("perl -e 'print \" tHello perl!!!\\n\"' ; " // + " mkdir -p /tmp/.flume-unit-test ; cd /tmp/.flume-unit-test ; ls -la >&2 ; " // + " cat /etc/passwd | awk -F: 'BEGIN {print "Hello AWK"} {print $NF} ' | sort | uniq -c >&2 ;" // + " \n exit 101").getBytes()); flumeProcessExecutor.getStandardInput().close(); int exitCode = Integer .MIN_VALUE; try { exitCode = flumeProcessExecutor.waitFor(); } catch ( final ProcessExecutionException pee) { pee.printStackTrace(); try { System .out.println( "tring exitCode" ); exitCode = flumeProcessExecutor.exitCode(); } catch ( final ProcessExecutionException pee2) { pee2.printStackTrace(); flumeProcessExecutor.stop(); } } System .out.println( "...Done with " + exitCode); } //invoking tcsh { System .out.println( "Running..." ); final FlumeProcessExecutor flumeProcessExecutor = new FlumeProcessExecutor( "tcsh" ); flumeProcessExecutor.start(); flumeProcessExecutor.getStandardInput().write( "set array = (a b c) && echo $array" .getBytes()); flumeProcessExecutor.getStandardInput().close(); int exitCode = Integer .MIN_VALUE; try { exitCode = flumeProcessExecutor.waitFor(); } catch ( final ProcessExecutionException pee) { pee.printStackTrace(); try { System .out.println( "tring exitCode" ); exitCode = flumeProcessExecutor.exitCode(); } catch ( final ProcessExecutionException pee2) { pee2.printStackTrace(); flumeProcessExecutor.stop(); } } Assert.assertEquals(exitCode, 0); System .out.println( "...Done" ); } } } edited because of Jira formatting issues.
          Hide
          Nitin Verma added a comment -

          Attaching a code tar so that you can play around.

          $ tar -ztvf FlumeProcessRunner.tar.gz
          rw-rw-r- nitin/nitin 220 2012-11-03 18:44 src/main/java/edu/nitin/testcodes/Destructible.java
          rw-rw-r- nitin/nitin 233 2012-11-03 18:45 src/main/java/edu/nitin/testcodes/DestrutableRunner.java
          rw-rw-r- nitin/nitin 8272 2012-11-03 23:43 src/main/java/edu/nitin/testcodes/FlumeProcessExecutor.java
          rw-rw-r- nitin/nitin 316 2012-11-03 19:14 src/main/java/edu/nitin/testcodes/InputStreamRunner.java
          rw-rw-r- nitin/nitin 548 2012-11-03 14:36 src/main/java/edu/nitin/testcodes/ProcessExecutionException.java
          rw-rw-r- nitin/nitin 717 2012-11-03 20:07 src/main/java/edu/nitin/testcodes/ProcessExecutor.java
          rw-rw-r- nitin/nitin 331 2012-11-02 20:15 src/test/resources/log4j.properties
          rw-rw-r- nitin/nitin 3804 2012-11-04 00:53 src/test/java/edu/nitin/testcodes/FlumeProcessExecutorTest.java
          rw-rw-r- nitin/nitin 1099 2012-11-02 20:05 pom.xml

          Show
          Nitin Verma added a comment - Attaching a code tar so that you can play around. $ tar -ztvf FlumeProcessRunner.tar.gz rw-rw-r - nitin/nitin 220 2012-11-03 18:44 src/main/java/edu/nitin/testcodes/Destructible.java rw-rw-r - nitin/nitin 233 2012-11-03 18:45 src/main/java/edu/nitin/testcodes/DestrutableRunner.java rw-rw-r - nitin/nitin 8272 2012-11-03 23:43 src/main/java/edu/nitin/testcodes/FlumeProcessExecutor.java rw-rw-r - nitin/nitin 316 2012-11-03 19:14 src/main/java/edu/nitin/testcodes/InputStreamRunner.java rw-rw-r - nitin/nitin 548 2012-11-03 14:36 src/main/java/edu/nitin/testcodes/ProcessExecutionException.java rw-rw-r - nitin/nitin 717 2012-11-03 20:07 src/main/java/edu/nitin/testcodes/ProcessExecutor.java rw-rw-r - nitin/nitin 331 2012-11-02 20:15 src/test/resources/log4j.properties rw-rw-r - nitin/nitin 3804 2012-11-04 00:53 src/test/java/edu/nitin/testcodes/FlumeProcessExecutorTest.java rw-rw-r - nitin/nitin 1099 2012-11-02 20:05 pom.xml
          Hide
          Roshan Naik added a comment -

          Adding another test case (as noted by Nitin), and excluding resource file from rat checks.

          Show
          Roshan Naik added a comment - Adding another test case (as noted by Nitin), and excluding resource file from rat checks.
          Hide
          Roshan Naik added a comment -

          Nitin. I dont understand what problem the suggested code solves. The command you provide should work fine with the proposed patch. Something similar was already part of the test suite. I added the precise command you provide to the test suite anyway.

          Show
          Roshan Naik added a comment - Nitin. I dont understand what problem the suggested code solves. The command you provide should work fine with the proposed patch. Something similar was already part of the test suite. I added the precise command you provide to the test suite anyway.
          Hide
          Brock Noland added a comment -

          Roshan,

          The latest patch should be uploaded to Review Board.

          Brock

          Show
          Brock Noland added a comment - Roshan, The latest patch should be uploaded to Review Board. Brock
          Hide
          Roshan Naik added a comment -

          its there now.

          Show
          Roshan Naik added a comment - its there now.
          Hide
          Hari Shreedharan added a comment -

          Since I don't see consensus on this, how about moving this out to v1.4?

          Show
          Hari Shreedharan added a comment - Since I don't see consensus on this, how about moving this out to v1.4?
          Hide
          Roshan Naik added a comment -

          I am not sure that there is no consensus as I dont see any open comments waiting to be be discussed or addressed IMHO. Do you see any areas where there is lack of consensus ?

          Show
          Roshan Naik added a comment - I am not sure that there is no consensus as I dont see any open comments waiting to be be discussed or addressed IMHO. Do you see any areas where there is lack of consensus ?
          Hide
          Roshan Naik added a comment -

          I may add.. that i am not really stuck on having this in 1.3. Especially if it delays the 1.3 release any longer.

          Show
          Roshan Naik added a comment - I may add.. that i am not really stuck on having this in 1.3. Especially if it delays the 1.3 release any longer.
          Hide
          Brock Noland added a comment -

          I am not 100% on board with this as stated previously. I would like to review the patch again and let that sink in.

          Show
          Brock Noland added a comment - I am not 100% on board with this as stated previously. I would like to review the patch again and let that sink in.
          Hide
          Roshan Naik added a comment -

          Brock, Is this the previously stated comment you are referring to ?

          'Messing around with sub-processes is a well know source of bugs.'

          If so, there is no 'messing around' with the sub process whatsoever.

          If not, please clarify the concern if on reviewing the patch one more time does not alleviate it.

          Show
          Roshan Naik added a comment - Brock, Is this the previously stated comment you are referring to ? 'Messing around with sub-processes is a well know source of bugs.' If so, there is no 'messing around' with the sub process whatsoever. If not, please clarify the concern if on reviewing the patch one more time does not alleviate it.
          Hide
          Brock Noland added a comment -

          Yes I just mean this is quite `yucky' stuff that is a well known location incompatibilities, as such, we should give this patch an uber-review. I should be able to look at it more today.

          Show
          Brock Noland added a comment - Yes I just mean this is quite `yucky' stuff that is a well known location incompatibilities, as such, we should give this patch an uber-review. I should be able to look at it more today.
          Hide
          Roshan Naik added a comment -

          Incorporating review comments and a bug fix in test.

          Show
          Roshan Naik added a comment - Incorporating review comments and a bug fix in test.
          Hide
          Roshan Naik added a comment -

          Incorporating code review feedback.

          Show
          Roshan Naik added a comment - Incorporating code review feedback.
          Hide
          Roshan Naik added a comment -

          incorporating code review feedback...marking fields final in ExecRunnable

          Show
          Roshan Naik added a comment - incorporating code review feedback...marking fields final in ExecRunnable
          Hide
          Roshan Naik added a comment -

          fixing path in test

          Show
          Roshan Naik added a comment - fixing path in test
          Hide
          Brock Noland added a comment -

          Same errors. I think that the file should pulled out the classpath and written to a new file. See how we do it here:

          https://github.com/apache/flume/blob/trunk/flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestFileChannelFormatRegression.java#L63

          Show
          Brock Noland added a comment - Same errors. I think that the file should pulled out the classpath and written to a new file. See how we do it here: https://github.com/apache/flume/blob/trunk/flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestFileChannelFormatRegression.java#L63
          Hide
          Roshan Naik added a comment -

          actually the file in question was missing in the patch. my fault. apologies to brock for the multiple iterations. hope this does it.

          Show
          Roshan Naik added a comment - actually the file in question was missing in the patch. my fault. apologies to brock for the multiple iterations. hope this does it.
          Hide
          Roshan Naik added a comment -

          Brock.. shall i go ahead and close the review on this one ?

          Show
          Roshan Naik added a comment - Brock.. shall i go ahead and close the review on this one ?
          Hide
          Brock Noland added a comment -

          Thanks Roshan! I committed this to trunk and 1.4. Thank you again for your patience on this one!

          Show
          Brock Noland added a comment - Thanks Roshan! I committed this to trunk and 1.4. Thank you again for your patience on this one!
          Hide
          Brock Noland added a comment -

          Hi, yes please do!

          Show
          Brock Noland added a comment - Hi, yes please do!
          Hide
          Hudson added a comment -

          Integrated in flume-trunk #371 (See https://builds.apache.org/job/flume-trunk/371/)
          FLUME-1661: ExecSource cannot execute complex *nix commands (Revision 13b8252bdeb838c606f4453bdf757fb2a1101eb8)

          Result = UNSTABLE
          brock : http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=13b8252bdeb838c606f4453bdf757fb2a1101eb8
          Files :

          • flume-ng-core/src/main/java/org/apache/flume/source/ExecSourceConfigurationConstants.java
          • flume-ng-core/src/test/resources/test_command.txt
          • flume-ng-core/src/main/java/org/apache/flume/source/ExecSource.java
          • flume-ng-core/src/test/java/org/apache/flume/source/TestExecSource.java
          • flume-ng-doc/sphinx/FlumeUserGuide.rst
          • pom.xml
          Show
          Hudson added a comment - Integrated in flume-trunk #371 (See https://builds.apache.org/job/flume-trunk/371/ ) FLUME-1661 : ExecSource cannot execute complex *nix commands (Revision 13b8252bdeb838c606f4453bdf757fb2a1101eb8) Result = UNSTABLE brock : http://git-wip-us.apache.org/repos/asf/flume/repo?p=flume.git&a=commit&h=13b8252bdeb838c606f4453bdf757fb2a1101eb8 Files : flume-ng-core/src/main/java/org/apache/flume/source/ExecSourceConfigurationConstants.java flume-ng-core/src/test/resources/test_command.txt flume-ng-core/src/main/java/org/apache/flume/source/ExecSource.java flume-ng-core/src/test/java/org/apache/flume/source/TestExecSource.java flume-ng-doc/sphinx/FlumeUserGuide.rst pom.xml

            People

            • Assignee:
              Roshan Naik
              Reporter:
              Yoonseok Woo
            • Votes:
              0 Vote for this issue
              Watchers:
              12 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development