The exec task keeps waiting for the output and error streams to be closed even when the executed command has already finished. On Windows, this bug prevents Ant from running processes that do not close their out and err streams correctly.

A small example is a Java class that only executes its argument:

public static void main (String args[]) throws Exception {
    Runtime.getRuntime().exec(args[0]);
    System.out.println("finished");
}

and a build.xml containing something like this:

<exec executable="java" >
    <arg line=" -cp . test rmid"/>
</exec>

This task starts rmid using the test class, writes "finished" and hangs on Windows. The same code on Linux (Solaris) starts rmid, writes "finished" and really finishes.

The main problem is the wait for the error and output streams to be closed in the stop() method of org.apache.tools.ant.taskdefs.PumpStreamHandler, in the calls inputThread.join(); and errorThread.join();

Output with a full thread dump of the blocked exec task:

Buildfile: build.xml

test:
     [exec] finished
Full thread dump:

"Thread-1" daemon prio=5 tid=0x8b8ad48 nid=0x604 runnable [0x8f2f000..0x8f2fdbc]
    at java.io.FileInputStream.readBytes(Native Method)
    at java.io.FileInputStream.read(FileInputStream.java:166)
    at org.apache.tools.ant.taskdefs.StreamPumper.run(StreamPumper.java:99)
    at java.lang.Thread.run(Thread.java:484)

"Thread-0" daemon prio=5 tid=0x8b3da98 nid=0x57c runnable [0x8eef000..0x8eefdbc]
    at java.io.FileInputStream.readBytes(Native Method)
    at java.io.FileInputStream.read(FileInputStream.java:183)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:186)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:225)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:280)
    at java.io.FilterInputStream.read(FilterInputStream.java:93)
    at org.apache.tools.ant.taskdefs.StreamPumper.run(StreamPumper.java:99)
    at java.lang.Thread.run(Thread.java:484)

"Signal Dispatcher" daemon prio=10 tid=0x960620 nid=0x670 waiting on monitor [0..0]

"Finalizer" daemon prio=9 tid=0x95c880 nid=0x4e8 waiting on monitor [0x8daf000..0x8dafdbc]
    at java.lang.Object.wait(Native Method)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:108)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:123)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:162)

"Reference Handler" daemon prio=10 tid=0x8af0368 nid=0x4fc waiting on monitor [0x8d6f000..0x8d6fdbc]
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:420)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:110)

"main" prio=5 tid=0x284950 nid=0x60c waiting on monitor [0x6f000..0x6fc34]
    at java.lang.Object.wait(Native Method)
    at java.lang.Thread.join(Thread.java:930)
    at java.lang.Thread.join(Thread.java:983)
    at org.apache.tools.ant.taskdefs.PumpStreamHandler.stop(PumpStreamHandler.java:111)
    at org.apache.tools.ant.taskdefs.LogStreamHandler.stop(LogStreamHandler.java:85)
    at org.apache.tools.ant.taskdefs.Execute.execute(Execute.java:397)
    at org.apache.tools.ant.taskdefs.ExecTask.runExecute(ExecTask.java:250)
    at org.apache.tools.ant.taskdefs.ExecTask.runExec(ExecTask.java:279)
    at org.apache.tools.ant.taskdefs.ExecTask.execute(ExecTask.java:177)
    at org.apache.tools.ant.Task.perform(Task.java:217)
    at org.apache.tools.ant.Target.execute(Target.java:184)
    at org.apache.tools.ant.Target.performTasks(Target.java:202)
    at org.apache.tools.ant.Project.executeTarget(Project.java:601)
    at org.apache.tools.ant.Project.executeTargets(Project.java:560)
    at org.apache.tools.ant.Main.runBuild(Main.java:454)
    at org.apache.tools.ant.Main.start(Main.java:153)
    at org.apache.tools.ant.Main.main(Main.java:176)

"VM Thread" prio=5 tid=0x8b5e1c0 nid=0x3c8 runnable
"VM Periodic Task Thread" prio=10 tid=0x95f320 nid=0x558 waiting on monitor
"Suspend Checker Thread" prio=10 tid=0x95fc70 nid=0x608 runnable
Created attachment 860 [details] suggested fix of org.apache.tools.ant.taskdefs.PumpStreamHandler
This bug prevents Ant from running on Windows, and I have found no workaround.
This is an interesting problem, and not one I have seen myself, despite my extensive use of Ant on NT. I wonder if it is showing some interesting side effects of the call to exec() inside the subprocess.

Whatever the cause, your supplied patch [NB, please use diff -u in future] would seem to ensure Ant continues and, given that the stop() method is called after the process has terminated naturally or been killed by the watchdog, should not affect the subprocess. However, it runs the risk of leaking threads. This may not seem like much on a single Ant run, but in an Ant-in-GUI or automated build system thread leakage can become an issue. Not as big an issue as the build blocking, but still an issue.

I think, therefore, that for a patch like this to go into the build, it has to print out big warning messages to the effect that something is wrong with the client app. We also need to see if anyone else has replicated the problem.
Created attachment 868 [details] another proposal how to fix this bug by implementing interruptable read
Created attachment 996 [details] Another more powerful patch, because the bug still occurs in several cases
This one has been here forever and I'm wondering if it is not related in some way to #10345 and #8510. Adam, out of curiosity do you have a testcase for this ?
Oops stupid question. the testcase is here...having a look.
The patch (id=996) did stop the problems I had with execute hanging. Unfortunately, the patch also causes the output of my cvs log command to be prematurely truncated. Before applying the patch, my code would hang the second time I executed a cvs log command, but all of the output made it to my client code's input buffer.

I made the following additional change. The patch to PumpStreamHandler makes changes like:

while (inputThread.isAlive()) {
    inputThread.interrupt();
    inputThread.join(TIMEOUT);
}

I changed these to instead be:

if (inputThread.isAlive()) {
    inputThread.join(TIMEOUT);
    while (inputThread.isAlive()) {
        inputThread.interrupt();
        inputThread.join(TIMEOUT);
    }
}

From reading the previous patches, this seems to have been the intent of Adam Sotona all along. He started out with something similar to this and then lost the initial wait in the later version. Immediately interrupting the thread is more likely to close it prematurely, thereby preventing the client code from obtaining all the output of the executed command (cvs log in my case). At least with my additional change there is a better chance that all the output is pushed into the client's input buffer.
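The wait-first, interrupt-later ordering described in this comment can be sketched as a small standalone helper. This is only an illustration, not the code from any of the attached patches: the class name GracefulJoin, the method stop() and the maxInterrupts bound are hypothetical, and the bound is an addition here so that the caller can never loop forever. Note also that an interrupt only helps if the worker is blocked in something interruptible; a thread stuck in a native FileInputStream read, as in the thread dump above, may not react to it.

```java
final class GracefulJoin {
    private GracefulJoin() { }

    /**
     * Waits for the worker to finish on its own before resorting to interrupts.
     * timeoutMillis must be positive, because join(0) would wait forever.
     * Returns true if the worker has terminated when the method returns.
     */
    static boolean stop(Thread worker, long timeoutMillis, int maxInterrupts) {
        if (timeoutMillis <= 0) {
            throw new IllegalArgumentException("timeoutMillis must be positive");
        }
        try {
            // Grace period: let the worker drain any remaining output undisturbed.
            worker.join(timeoutMillis);
            // Still running: interrupt it, but only a bounded number of times.
            for (int i = 0; worker.isAlive() && i < maxInterrupts; i++) {
                worker.interrupt();
                worker.join(timeoutMillis);
            }
        } catch (InterruptedException e) {
            // The calling thread was interrupted; keep that status for the caller.
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        Thread sleeper = new Thread(() -> {
            try {
                Thread.sleep(60_000); // stands in for a read that never returns
            } catch (InterruptedException e) {
                // Interrupted: fall through and let the thread exit.
            }
        });
        sleeper.setDaemon(true);
        sleeper.start();
        System.out.println("worker stopped: " + stop(sleeper, 200, 3));
    }
}
```

If stop() returns false, the worker is still alive and the caller has to decide whether to log a warning and move on, which is the thread-leak trade-off raised earlier in this report.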
Am I correct in thinking that a call to Process.waitFor() would work, except that StreamPumper does not know about such things? The complication is that the command may not finish if you do not read the output streams but I am still wondering whether more fundamental changes might eventually be worthwhile e.g. Ant 2. StreamPumper looks to me like it might be mostly avoided. Still, the latest proposed fix looks like it would work.
I no longer think StreamPumper is likely to be avoided but it is a shame that it is not dead simple. My own software that encountered the same problem under windoze relied on Process.waitFor(). The nearest thing I have to an ant task can now detect and interrupt a thread processing a stream that is never closed. Sorry I don't have time right now to look into this possibility in the ant case.
Assigning back to ant-dev as Stephane is now jetsetting about
Adam, would you like to try the latest CVS version of ant, where <exec/> seems to be implemented by a different class org.apache.tools.ant.taskdefs.ExecTask if I read properly the defaults.properties file. The code is quite different from the code of the old exec task.
It works for me with:

java version "1.4.1_01"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1_01-b01)
Java HotSpot(TM) Client VM (build 1.4.1_01-b01, mixed mode)

on Win 2K, Service Pack 2.

However, I wonder whether what is really happening is that rmid has been fixed to close its stdout and stderr properly. I also tried with cvs log (which I am doing over ssh) and did not reproduce the problem.

So the next question is: how can one create a test class, a shell or Perl script, or a C program that does not properly close its stdout/stderr streams, so that the problem can be "lab studied"? Without a way to reproduce the problem, this should be closed as WONTFIX or WORKSFORME.
Hi, I mark this bug as resolved for ant 1.6 since nobody has voiced remarks.
The bug is still present in 1.6 and it causes problems for NetBeans 4.0 execution (the NetBeans 4.0 build system is now based on Ant). See bug: http://www.netbeans.org/issues/show_bug.cgi?id=49489

The simple case I described before is still reproducible, but it is better to try executing notepad instead of rmid:

public static void main(String args[]) throws Exception {
    Runtime.getRuntime().exec("notepad");
    System.out.println("finished");
}
Hi Adam, I do not follow your example with notepad. The Runtime.getRuntime().exec("notepad") will not return until the notepad process stops.
Hi Peter, I do follow: the exec method starts the process and returns. Did you try that? BTW, part of the Runtime.exec javadoc: "Executes the specified string command in a separate process."
Notepad is special; it is a GUI app. If you look at how Windows execs GUIs, it does *odd* things, things that only make sense from a historical perspective. Try it on a command-line app instead of notepad.
Adam, I still do not follow. Ant's exec is not the same as just calling Runtime.exec(). Its job is to start the process, handle its input and output <file descriptors|handles>, and wait for it to finish. One can use the "spawn" attribute to spawn off the process and not care about it.

Note that the notepad program is not the issue; the following also works on Unix:

public class Test {
    public static void main(String[] args) {
        try {
            Runtime.getRuntime().exec("emacs");
            System.out.println("finished");
            Thread.sleep(10 * 1000);
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}

The parent process of the emacs process is the Java program, and when it dies, the parent process becomes the init process.
So once again:
- this bug occurs on Windows only
- if you just execute emacs from Java on Linux, the Java process finishes - OK
- if you do it using Ant, it finishes - OK
- if you execute notepad (or whatever you want) on Windows, the Java process finishes - again correct behavior
- but if you do it through Ant on Windows, it will wait until ALL executed processes close their streams, and that's not correct

Everything was already described here and several patches were proposed. You just need to cut off the stream pumping after some timeout once the process dies on Windows - that's all.
Created attachment 13680 [details] build file showing the problem

Just run ant in the directory containing the build file. It makes a src directory and populates it with two Java files. The files are compiled, and an exec is run: "java -cp classes CallHello". On Unix, the build finishes just after the "finished" message from CallHello. On Windows, the build finishes about 19 seconds after the "finished" message.
OK, I see what you are saying now. On Windows, child processes keep the standard stream file handles of the master process (or at least Runtime#exec() is implemented in this way); on Unix this does not happen.

This means that one can start a master process from Ant. This master process can create a number of child processes. The master process then terminates, but its child processes are still running. On Unix, the exec task will end at this time, but on Windows this will not happen; in fact, the exec task will wait until all the children of the master process have terminated. This is *not* good, especially for something like rmid. I have added an attachment that shows the problem.
I am reassigning this bug to the whole ant community, because I do not have any special solution (I think my name was in there since 2003, at a time when the issue was inactive).
*** Bug 28135 has been marked as a duplicate of this bug. ***
*** Bug 37787 has been marked as a duplicate of this bug. ***
*** Bug 42534 has been marked as a duplicate of this bug. ***
A loooooooooooooooong time, I know. Ant's code has changed a bit, so some extra work has become necessary. That other classes now use StreamPumper as well didn't help either. With the original patch (even if adapted), several of Ant's unit tests would hang and never return - I guess this was true seven years ago as well.

One major problem I faced was that available() returns 0 on a closed stream on some VMs (it did on Sun's 1.4.2 for Windows, for example), and thus the available trick doesn't work unless you are sure you are going to interrupt the thread running StreamPumper eventually. I've also noticed that the approach using available impacts performance considerably, so I've restricted it to the platform (Windows) where it is needed (like the original patch did, but for a different reason).

svn revision 711860
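To make the "available trick" concrete, here is a minimal sketch of a pump that polls available() instead of blocking in read(), so that it can notice a finish request or an interrupt. It is not the code committed in the revision mentioned above; the class name PollingPump, the method requestFinish() and the 50 ms poll interval are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class PollingPump implements Runnable {
    private static final long POLL_INTERVAL_MILLIS = 50; // assumed value, tune as needed

    private final InputStream in;
    private final OutputStream out;
    private volatile boolean finish;

    PollingPump(InputStream in, OutputStream out) {
        this.in = in;
        this.out = out;
    }

    /** Asks the pump to copy whatever is still available and then return. */
    void requestFinish() {
        finish = true;
    }

    @Override
    public void run() {
        byte[] buf = new byte[1024];
        try {
            while (true) {
                // Sample the flag before draining so that data which arrived
                // before requestFinish() was called is still copied.
                boolean finishing = finish;
                int available;
                while ((available = in.available()) > 0) {
                    int n = in.read(buf, 0, Math.min(buf.length, available));
                    if (n < 0) {
                        return; // end of stream
                    }
                    out.write(buf, 0, n);
                }
                out.flush();
                if (finishing || Thread.currentThread().isInterrupted()) {
                    return;
                }
                // available() may report 0 forever on a closed stream, so this
                // loop only ends through requestFinish() or an interrupt.
                Thread.sleep(POLL_INTERVAL_MILLIS);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
        } catch (IOException e) {
            // Stream closed or broken: nothing more to pump.
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PollingPump pump =
                new PollingPump(new ByteArrayInputStream("finished\n".getBytes()), sink);
        pump.requestFinish(); // e.g. called once the child process has exited
        pump.run();           // run inline for the demo; normally on its own thread
        System.out.print(sink);
    }
}
```

The sleep between polls is where the performance cost mentioned above comes from: output is delivered with up to one poll interval of latency, and the thread wakes up even when nothing is happening, which is a reasonable argument for enabling such a loop only where the blocking read is a problem.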
*** Bug 46805 has been marked as a duplicate of this bug. ***
I just want to mention my fix, which is posted at bug 46805 (sorry for creating a duplicate). This bug is caused by closing unowned streams:

new XyzStream() --> close it
getXyzStream() --> don't close it; close/destroy the underlying object
reopened since an unresolved bug is merged into this.
If there really is an issue caused by closing the streams, then bug 46805 is not a duplicate of bug 5003.
Using Ant 1.8.2 I still reproduce the error.
(In reply to comment #34)
> Using Ant 1.8.2 I still reproduce the error.

How?
Created attachment 28364 [details] patch to StreamPumper.run to make it responsive to interrupts

I ran into the same issue and was able to get the Ant JVM unstuck by modifying org.apache.tools.ant.taskdefs.StreamPumper.run() to make it more responsive to interrupt conditions. Please see my diff to /ant-trunk/org/apache/tools/ant/taskdefs/StreamPumper.java and let me know if it addresses any pending issues.
Created attachment 28365 [details] revised earlier patch slightly ... does the write more efficiently

revised earlier patch
I had something like this in some of my code. I found process.join() was returning, but my joins on the workers reading the input streams were not. You could assume that if the process is dead, or has been dead for a certain amount of time, you no longer care about the data on the streams.

I ran into something like this when killing an external process. In my case, the choice is easy: I am forcibly terminating it, so I don't care about the streams. Basically, you get a reference to the streams while the process is still alive, and then at any point after it's dead (or should be) you close the streams. This worked like a charm for me. I could do this where I join the process as well, instead of where I kill it. So....

OutputStream outputStream = process.getOutputStream();
InputStream inputStream = process.getInputStream();
InputStream errorStream = process.getErrorStream();
process.destroy();
try { outputStream.flush(); } catch(IOException e) { }
try { outputStream.close(); } catch(IOException e) { }
try { errorStream.close(); } catch(IOException e) { }
try { inputStream.close(); } catch(IOException e) { }
Ha. Strike that. Sometimes the calls to close() still deadlock in a native method.
Without looking at the Ant implementation: Could this be related to http://grumpyapache.blogspot.com/2020/10/incompatibility-of-processwaitfor.html
@Jochen: your link explains why my hack (adding timeouts to join()) helped.
Is this still noticed against Java 11 (or Java 16) versions?