Bug 5003 - exec task does not return after executed command finished on Windows only
Status: REOPENED
Alias: None
Product: Ant
Classification: Unclassified
Component: Core tasks
Version: 1.4.1
Hardware: PC All
Importance: P3 blocker with 5 votes
Target Milestone: 1.8.0
Assignee: Ant Notifications List
URL:
Keywords:
Duplicates: 28135 37787 42534
Depends on: 48746
Blocks: 54128
Reported: 2001-11-21 08:22 UTC by Adam Sotona
Modified: 2021-04-12 14:59 UTC (History)
CC List: 6 users



Attachments
suggested fix of org.apache.tools.ant.taskdefs.PumpStreamHandler (250 bytes, patch)
2001-12-03 07:25 UTC, Adam Sotona
another proposal how to fix this bug by implementing interruptable read (2.20 KB, patch)
2001-12-04 04:02 UTC, Adam Sotona
Another more powerful patch, because the bug still occurs in several cases (2.56 KB, patch)
2002-01-11 06:15 UTC, Adam Sotona
build file showing the problem (881 bytes, text/plain)
2004-12-08 12:06 UTC, Peter Reilly
INTERVIEW EVALUATION FORM (42.50 KB, application/msword)
2008-05-27 03:27 UTC, narendra.narne
patch to StreamPumper.run to make it responsive to interrupts (961 bytes, patch)
2012-02-22 20:19 UTC, Rohit Kelapure
revised earlier patch slightly ... does the write more efficiently (985 bytes, patch)
2012-02-22 20:46 UTC, Rohit Kelapure

Description Adam Sotona 2001-11-21 08:22:32 UTC
The exec task keeps waiting for the output and error streams to be closed even
after the executed command has finished.

This bug prevents Ant from executing processes that do not close their out and
err streams correctly on Windows.

A small example is a Java class that only executes its argument:
    public class test {
        public static void main(String[] args) throws Exception {
            // Start the command given as the first argument; exec() returns at once.
            Runtime.getRuntime().exec(args[0]);
            System.out.println("finished");
        }
    }

and a build.xml containing something like this:
        <exec executable="java" >
            <arg line=" -cp . test rmid"/>
        </exec>

This task starts rmid using the test class, writes "finished" and then hangs on
Windows.
The same code on Linux (Solaris) starts rmid, writes "finished" and really
finishes.

The main problem is the wait for the error and output streams to be closed in
the stop() method of org.apache.tools.ant.taskdefs.PumpStreamHandler,
specifically the calls inputThread.join(); and errorThread.join();
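
The blocking join can be illustrated without Ant. What follows is a minimal sketch under stated assumptions, not Ant's code: the class HangDemo and its method are hypothetical, and a pipe whose writing end is never closed stands in for a stream handle that a surviving child process still holds open. An unbounded join() on the pump thread would never return in this situation; the bounded join(millis) used here returns while the thread is still alive.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

// Hypothetical demo class; none of these names come from Ant.
final class HangDemo {

    /**
     * Starts a pump thread that reads from a pipe whose write end stays open,
     * waits up to waitMillis for it, and reports whether it is still alive.
     */
    static boolean pumpStillAliveAfter(long waitMillis) {
        try {
            PipedOutputStream neverClosed = new PipedOutputStream();
            InputStream in = new PipedInputStream(neverClosed);
            neverClosed.write('x'); // some output arrives, but never end-of-stream

            Thread pump = new Thread(() -> {
                try {
                    while (in.read() != -1) {
                        // discard the data; blocks once the pipe is empty
                    }
                } catch (IOException ignored) {
                    // interrupted or broken pipe: stop pumping
                }
            });
            pump.setDaemon(true); // do not keep the JVM alive for the demo
            pump.start();

            pump.join(waitMillis); // an unbounded join() would block here
            boolean alive = pump.isAlive();
            pump.interrupt();      // a pipe read reacts to interrupts
            return alive;
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("pump still alive: " + pumpStillAliveAfter(300));
    }
}
```

Note that the pipe read in this sketch reacts to interrupt(), whereas the thread dump in this report shows the real pump threads blocked in a native FileInputStream read, which is why a plain interrupt is not assumed to be enough there.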

The output, with a full thread dump of the blocked exec task, is:
Buildfile: build.xml

test:
     [exec] finished
Full thread dump:

"Thread-1" daemon prio=5 tid=0x8b8ad48 nid=0x604 runnable [0x8f2f000..0x8f2fdbc]
        at java.io.FileInputStream.readBytes(Native Method)
        at java.io.FileInputStream.read(FileInputStream.java:166)
        at org.apache.tools.ant.taskdefs.StreamPumper.run(StreamPumper.java:99)
        at java.lang.Thread.run(Thread.java:484)

"Thread-0" daemon prio=5 tid=0x8b3da98 nid=0x57c runnable [0x8eef000..0x8eefdbc]
        at java.io.FileInputStream.readBytes(Native Method)
        at java.io.FileInputStream.read(FileInputStream.java:183)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:186)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:225)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:280)
        at java.io.FilterInputStream.read(FilterInputStream.java:93)
        at org.apache.tools.ant.taskdefs.StreamPumper.run(StreamPumper.java:99)
        at java.lang.Thread.run(Thread.java:484)

"Signal Dispatcher" daemon prio=10 tid=0x960620 nid=0x670 waiting on monitor 
[0..0]

"Finalizer" daemon prio=9 tid=0x95c880 nid=0x4e8 waiting on monitor 
[0x8daf000..0x8dafdbc]
        at java.lang.Object.wait(Native Method)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:108)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:123)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:162)

"Reference Handler" daemon prio=10 tid=0x8af0368 nid=0x4fc waiting on monitor 
[0x8d6f000..0x8d6fdbc]
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:420)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:110)

"main" prio=5 tid=0x284950 nid=0x60c waiting on monitor [0x6f000..0x6fc34]
        at java.lang.Object.wait(Native Method)
        at java.lang.Thread.join(Thread.java:930)
        at java.lang.Thread.join(Thread.java:983)
        at org.apache.tools.ant.taskdefs.PumpStreamHandler.stop(PumpStreamHandler.java:111)
        at org.apache.tools.ant.taskdefs.LogStreamHandler.stop(LogStreamHandler.java:85)
        at org.apache.tools.ant.taskdefs.Execute.execute(Execute.java:397)
        at org.apache.tools.ant.taskdefs.ExecTask.runExecute(ExecTask.java:250)
        at org.apache.tools.ant.taskdefs.ExecTask.runExec(ExecTask.java:279)
        at org.apache.tools.ant.taskdefs.ExecTask.execute(ExecTask.java:177)
        at org.apache.tools.ant.Task.perform(Task.java:217)
        at org.apache.tools.ant.Target.execute(Target.java:184)
        at org.apache.tools.ant.Target.performTasks(Target.java:202)
        at org.apache.tools.ant.Project.executeTarget(Project.java:601)
        at org.apache.tools.ant.Project.executeTargets(Project.java:560)
        at org.apache.tools.ant.Main.runBuild(Main.java:454)
        at org.apache.tools.ant.Main.start(Main.java:153)
        at org.apache.tools.ant.Main.main(Main.java:176)

"VM Thread" prio=5 tid=0x8b5e1c0 nid=0x3c8 runnable

"VM Periodic Task Thread" prio=10 tid=0x95f320 nid=0x558 waiting on monitor
"Suspend Checker Thread" prio=10 tid=0x95fc70 nid=0x608 runnable
Comment 1 Adam Sotona 2001-12-03 07:25:37 UTC
Created attachment 860 [details]
suggested fix of org.apache.tools.ant.taskdefs.PumpStreamHandler
Comment 2 Adam Sotona 2001-12-03 07:28:54 UTC
This bug prevents Ant from running on Windows, and I have found no workaround.
Comment 3 Steve Loughran 2001-12-03 12:05:04 UTC
This is an interesting problem, and not one I have seen myself, despite my 
extensive use of ant on NT. I wonder if it is showing some interesting side 
effects of the call to exec() inside the sub process.

Whatever, your supplied patch [NB, please use diff -u in future] would seem to 
ensure ant continues, and given that the stop() method is called after the 
process has terminated naturally or been killed by the watchdog, it should not 
affect the sub process.

However, it runs the risk of leaking threads. This may not seem much on a 
single ant run, but in an ant-in-gui or automated build system thread leakage 
can become an issue. Not as much a one as the build blocking, but still an 
issue.

I think therefore that for a patch like this to go into the build, it has to 
print out big warning messages to the effect that something is wrong with the 
client app. Also we need to see if anyone else has replicated the problem.
Comment 4 Adam Sotona 2001-12-04 04:02:59 UTC
Created attachment 868 [details]
another proposal how to fix this bug by implementing interruptable read
Comment 5 Adam Sotona 2002-01-11 06:15:59 UTC
Created attachment 996 [details]
Another more powerful patch, because the bug still occurs in several cases
Comment 6 Stephane Bailliez 2002-07-16 14:04:09 UTC
This one has been here forever and I'm wondering if it is not related in some 
way to #10345 and #8510. Adam, out of curiosity, do you have a testcase for 
this?
Comment 7 Stephane Bailliez 2002-07-16 14:16:52 UTC
Oops, stupid question. The testcase is here... having a look.
Comment 8 James Lee Carpenter 2002-08-20 00:08:21 UTC
The patch (id=996) did stop the problems I had with execute hanging.  
Unfortunately the patch also causes the output of my cvs log command to be 
prematurely truncated.  Before applying the patch, my code would hang the 
second time I executed a CVS log command but all of the output made it to my 
client code's input buffer.

I made the additional following change:
The patch to PumpStreamHandler makes changes like:

while (inputThread.isAlive()) {
	inputThread.interrupt();
	inputThread.join(TIMEOUT);
}

I changed these to instead be:
		
if (inputThread.isAlive()) {
	inputThread.join(TIMEOUT);
	while (inputThread.isAlive()) {
		inputThread.interrupt();
		inputThread.join(TIMEOUT);
	}
}

From reading the previous patches this seems to have been the intent of Adam 
Sotona all along.  He started out with something similar to this and then lost 
the initial wait in the later version.

Immediately interrupting the thread is more likely to cause premature closing of 
the thread, thereby preventing the client code from obtaining all the output 
of the executed command (cvs log in my case).  At least with my additional 
change there is a better chance that all the output is pushed into the client's 
input buffer.
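
The wait-first, interrupt-later ordering described above can be sketched as a standalone helper. This is an illustration only, not the attached patch: GracefulStop, stop and sleeper are hypothetical names, and the worker here is blocked in Thread.sleep(), which is assumed to stand in for a read that actually responds to interrupt().

```java
// Hypothetical helper illustrating the pattern; not Ant's PumpStreamHandler.
final class GracefulStop {

    /**
     * Gives the worker graceMillis to finish on its own and only then starts
     * interrupting it. graceMillis must be positive, because join(0) waits forever.
     * Returns true if at least one interrupt was needed.
     */
    static boolean stop(Thread worker, long graceMillis) {
        boolean interruptNeeded = false;
        try {
            worker.join(graceMillis); // initial wait: let pending output drain
            while (worker.isAlive()) {
                interruptNeeded = true;
                worker.interrupt();
                worker.join(graceMillis);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
        }
        return interruptNeeded;
    }

    /** Starts a daemon thread that sleeps for the given time or until interrupted. */
    static Thread sleeper(long millis) {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                // interrupted: fall through and end the thread
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) {
        // A worker that finishes within the grace period is never interrupted.
        System.out.println("interrupt needed: " + stop(sleeper(20), 2000));
        // A worker that outlives the grace period is interrupted.
        System.out.println("interrupt needed: " + stop(sleeper(60_000), 100));
    }
}
```

The loop terminates only if the worker eventually reacts to the interrupt; a thread stuck in a read that ignores interrupts would keep this helper looping.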
Comment 9 JimWright 2002-09-26 23:52:50 UTC
Am I correct in thinking that a call to Process.waitFor() would work, except
that StreamPumper does not know about such things? The complication
is that the command may not finish if you do not read the output streams but
I am still wondering whether more fundamental changes might eventually be
worthwhile e.g. Ant 2. StreamPumper looks to me like it might be mostly avoided.

Still, the latest proposed fix looks like it would work.
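
A rough sketch of the Process.waitFor() idea, assuming a plain JDK setup rather than Ant's Execute and StreamPumper classes: WaitForDemo, runAndWait and javaExecutable are hypothetical names. The output is still drained on a separate thread so the child cannot stall on a full pipe, but completion is decided by waitFor(), and the join on the drainer is bounded so a stream that never closes cannot hold up the caller.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not how Ant's Execute class is implemented.
final class WaitForDemo {

    /** Runs the command, drains its output, and returns its exit code. */
    static int runAndWait(List<String> command, long joinMillis) {
        try {
            Process process = new ProcessBuilder(command)
                    .redirectErrorStream(true) // merge stderr into stdout: one stream to drain
                    .start();
            InputStream out = process.getInputStream();

            Thread drainer = new Thread(() -> {
                byte[] buf = new byte[4096];
                try {
                    while (out.read(buf) != -1) {
                        // discard; a real caller would log or collect the data
                    }
                } catch (IOException ignored) {
                    // stream closed underneath us: stop draining
                }
            });
            drainer.setDaemon(true);
            drainer.start();

            process.getOutputStream().close(); // the child sees end-of-file on stdin
            int exitCode = process.waitFor();  // completion is decided by the process itself
            drainer.join(joinMillis);          // bounded: do not wait for the stream forever
            return exitCode;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Path of the java launcher belonging to the running JVM. */
    static String javaExecutable() {
        return Paths.get(System.getProperty("java.home"), "bin", "java").toString();
    }

    public static void main(String[] args) {
        int exit = runAndWait(Arrays.asList(javaExecutable(), "-version"), 1000);
        System.out.println("exit code: " + exit);
    }
}
```

With a bounded join, output that arrives after the timeout is lost, which is the same truncation risk raised in comment 8.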
Comment 10 JimWright 2002-10-03 00:10:57 UTC
I no longer think StreamPumper is likely to be avoided
but it is a shame that it is not dead simple.

My own software that encountered the same problem under
windoze relied on Process.waitFor(). The nearest thing I have
to an ant task can now detect and interrupt a thread processing a
stream that is never closed. Sorry I don't have time right
now to look into this possibility in the ant case.
Comment 11 Conor MacNeill 2003-01-17 02:27:31 UTC
Assigning back to ant-dev as Stephane is now jetsetting about
Comment 12 Antoine Levy-Lambert 2003-02-03 21:23:02 UTC
Adam, would you like to try the latest CVS version of ant, where <exec/> seems to be implemented by a 
different class, org.apache.tools.ant.taskdefs.ExecTask, if I read the 
defaults.properties file properly?
The code is quite different from the code of the old exec task.
Comment 13 Antoine Levy-Lambert 2003-06-04 08:55:09 UTC
It works for me with :

java version "1.4.1_01"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1_01-b01)
Java HotSpot(TM) Client VM (build 1.4.1_01-b01, mixed mode)

on 

Win 2K, Service Pack 2

However, I wonder whether what is really happening is not that rmid has been 
fixed to close its stdout and stderr properly.

I tried also with cvs log (which I am doing over ssh), and did not reproduce the 
problem.

So the next question is:
how can one create a test class, or a shell or Perl script or C program, which 
does not properly close its stdout/stderr streams, so that the problem can be 
"lab studied"?

Without a possibility to reproduce the problem, this should be closed as WONTFIX 
or WORKSFORME.
Comment 14 Antoine Levy-Lambert 2003-06-19 21:35:29 UTC
Hi, I mark this bug as resolved for ant 1.6 since nobody has voiced remarks.
Comment 15 Adam Sotona 2004-09-24 07:52:06 UTC
The bug is still present in 1.6 and it causes problems for NetBeans 4.0 execution
(the NetBeans 4.0 build system is now based on Ant).
See bug: http://www.netbeans.org/issues/show_bug.cgi?id=49489

simple case I've described before is still reproducible but better try to
execute notepad instead of rmid:

    public static void main(String args[]) throws Exception {
        Runtime.getRuntime().exec("notepad");
        System.out.println("finished");
    }
Comment 16 Peter Reilly 2004-12-02 18:09:02 UTC
Hi Adam,
I do not follow your example with notepad.
The Runtime.getRuntime().exec("notepad") will not
return until the notepad process stops.

Comment 17 Adam Sotona 2004-12-02 19:52:51 UTC
Hi Peter,
I do follow, the exec method starts the process and returns. Did you try that?

BTW: part of the Runtime.exec javadoc:
 "Executes the specified string command in a separate process."
Comment 18 Steve Loughran 2004-12-02 21:17:52 UTC
Notepad is special; it is a GUI app. If you look at how windows execs guis, it
does *odd* things, things that only make sense from a historical perspective.


try on a command line app instead of notepad.
Comment 19 Peter Reilly 2004-12-03 12:03:32 UTC
Adam, I still do not follow.
Ant exec is not the same as just calling process.exec().
Its job is to start the process and handle its input
and output <file descriptors|handles> and wait for it to finish.

One can use the "spawn" attribute to spawn off the process and
not care about it.

Note, the notepad program is not the issue; the following also
works on Unix:

public class Test {
    public static void main(String[] args) {
        try {
            Runtime.getRuntime().exec("emacs");
            System.out.println("finished");
            Thread.sleep(10 * 1000);
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}

The parent process of the emacs process is the java program, and when it
dies, the parent process becomes the init process.
Comment 20 Adam Sotona 2004-12-04 18:29:34 UTC
so once again:
- this bug occurs on Windows only

- if you just execute emacs from Java on Linux - the Java process finishes - OK
- if you'll do it using Ant - it finishes - OK
- if you execute notepad (or whatever you want) on Windows - the Java process
finishes - again correct behavior
- but if you'll do it through Ant on Windows it will wait till ALL executed
processes close their streams and that's not correct

everything was already described here and several patches were proposed

you just need to cut the streams pumping when the process dies on Windows after
some timeout - that's all
Comment 21 Peter Reilly 2004-12-08 12:06:45 UTC
Created attachment 13680 [details]
build file showing the problem

Just run ant in the directory with the build file.
It makes a src directory and populates it with two Java files.
The files are compiled, and an exec is run: "java -cp classes CallHello".
On Unix, the build finishes just after the "finished" message from CallHello.
On Windows, the build finishes about 19 seconds after the "finished" message.
Comment 22 Peter Reilly 2004-12-08 12:11:25 UTC
Ok, I see what you are saying now.
On Windows, child processes keep the standard output and error file handles of
the master process (or at least Runtime#exec() is implemented in this
way); on Unix this does not happen.

This means that one can start a master process from Ant. This master
process can create a number of child processes. The master process
then terminates, but the child processes of the master process are
still running. On Unix, the exec task will end at this time, but
on Windows this will not happen; in fact the exec task will wait until
all the children of the master process have terminated. This is *not* good,
especially for something like rmid.

I have added an attachment that shows the problem.
Comment 23 Antoine Levy-Lambert 2004-12-27 11:41:22 UTC
I am reassigning this bug to the whole ant community, because I do not have any
special solution (I think my name was in there since 2003, at a time when the
issue was inactive).
Comment 24 Sean Dockery 2005-12-03 05:18:46 UTC
*** Bug 28135 has been marked as a duplicate of this bug. ***
Comment 25 Peter Reilly 2006-10-16 07:59:37 UTC
*** Bug 37787 has been marked as a duplicate of this bug. ***
Comment 26 J.M. (Martijn) Kruithof 2007-07-01 02:05:33 UTC
*** Bug 42534 has been marked as a duplicate of this bug. ***
Comment 27 narendra.narne 2008-05-27 03:27:47 UTC
Created attachment 22009 [details]
INTERVIEW  EVALUATION FORM
Comment 28 Steve Loughran 2008-05-27 07:36:09 UTC
Comment on attachment 22009 [details]
INTERVIEW  EVALUATION FORM

this has nothing to do with the bug; marking as obsolete.
Comment 29 Stefan Bodewig 2008-11-06 06:34:25 UTC
a loooooooooooooooong time, I know.

Ant's code has changed a bit, so some extra work has become necessary.  That
other classes are now using StreamPumper as well didn't help either.

With the original patch (even if adapted) several unit tests of Ant would hang
and never return - I guess this was true seven years ago as well.

One major problem I faced was that available() returns 0 on a closed stream
on some VMs (it did on Sun's 1.4.2 for Windows, for example) and thus the
available trick doesn't work unless you are sure you are going to interrupt
the thread running StreamPumper eventually.

I've also noticed that the approach using available impacts performance considerably, so I've restricted it to the platform (Windows) where it is needed (like the original patch did, but for a different reason).

svn revision 711860
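
For readers unfamiliar with the available() trick, here is a minimal sketch of the general idea. It is not the code committed in that revision: PollingPump, pump, pumpUntilStopped and the 10 ms poll interval are assumptions made for illustration. The loop reads only when available() reports pending bytes, so it never blocks in read() and can notice a stop request. Because available() may keep returning 0 at end of stream, the loop ends only when the finish flag is set or the thread is interrupted.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of an available()-based pump; not Ant's StreamPumper.
final class PollingPump {
    private static final long POLL_MILLIS = 10; // assumed poll interval

    /** Copies in to out without blocking in read(); returns once asked to stop. */
    static void pump(InputStream in, OutputStream out, AtomicBoolean finish) throws IOException {
        byte[] buf = new byte[1024];
        while (true) {
            int ready = in.available();
            if (ready > 0) {
                // At least one byte is pending, so this read does not block.
                int n = in.read(buf, 0, Math.min(ready, buf.length));
                if (n < 0) {
                    return; // end of stream reported after all
                }
                out.write(buf, 0, n);
            } else if (finish.get() || Thread.currentThread().isInterrupted()) {
                return; // nothing pending and we were asked to stop
            } else {
                try {
                    Thread.sleep(POLL_MILLIS); // polling is the performance cost
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // seen on the next iteration
                }
            }
        }
    }

    /** Pumps data on a worker thread, requests a stop after stopAfterMillis, returns the copy. */
    static String pumpUntilStopped(byte[] data, long stopAfterMillis) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        AtomicBoolean finish = new AtomicBoolean(false);
        Thread worker = new Thread(() -> {
            try {
                pump(new ByteArrayInputStream(data), out, finish);
            } catch (IOException ignored) {
                // in-memory streams do not fail here
            }
        });
        worker.start();
        try {
            Thread.sleep(stopAfterMillis);
            finish.set(true); // without this the pump would poll forever
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new String(out.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] data = "hello".getBytes(StandardCharsets.US_ASCII);
        System.out.println(pumpUntilStopped(data, 100));
    }
}
```

ByteArrayInputStream is used in the demo because its available() also returns 0 once the data is consumed, which mimics the closed-stream behaviour mentioned above.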
Comment 30 Stefan Bodewig 2009-05-07 02:27:32 UTC
*** Bug 46805 has been marked as a duplicate of this bug. ***
Comment 31 Stefan Franke 2009-05-07 02:43:24 UTC
I just want to mention my fix which is posted at bug 46805. (sorry for creating a duplicate)

This bug is caused by closing unowned streams:

 new XyzStream() --> close it
 getXyzStream() --> don't close it, close/destroy the underlying object
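
The ownership rule above can be sketched in a few lines. This is an illustration under assumptions, not the fix posted at bug 46805: StreamOwnership, TrackingStream, writeTo and writeOwned are hypothetical names. A stream handed in by someone else is written and flushed but left open; a stream created locally is closed locally.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the "only close what you created" rule.
final class StreamOwnership {

    /** In-memory stream that records whether close() was called. */
    static final class TrackingStream extends ByteArrayOutputStream {
        boolean closed;

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Borrowed stream: write and flush, but leave closing to the owner. */
    static void writeTo(OutputStream borrowed, String text) {
        try {
            borrowed.write(text.getBytes(StandardCharsets.UTF_8));
            borrowed.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Owned stream: created here, so it is closed here. Returns whether it got closed. */
    static boolean writeOwned(String text) {
        TrackingStream owned = new TrackingStream();
        try (TrackingStream s = owned) { // try-with-resources closes what we created
            writeTo(s, text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return owned.closed;
    }

    public static void main(String[] args) {
        TrackingStream borrowed = new TrackingStream();
        writeTo(borrowed, "x");
        System.out.println("borrowed closed: " + borrowed.closed);
        System.out.println("owned closed: " + writeOwned("x"));
    }
}
```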
Comment 32 Stefan Franke 2009-05-07 02:45:37 UTC
reopened since an unresolved bug is merged into this.
Comment 33 Stefan Bodewig 2009-05-11 06:39:55 UTC
If there really is an issue caused by closing the streams, then bug 46805 is no duplicate of bug 5003.
Comment 34 Alexey 2011-09-30 14:23:19 UTC
Using Ant 1.8.2 I can still reproduce the error.
Comment 35 Stefan Bodewig 2011-09-30 15:17:36 UTC
(In reply to comment #34)
> Using Ant 1.8.2 I still reproduce the error.

How?
Comment 36 Rohit Kelapure 2012-02-22 20:19:22 UTC
Created attachment 28364 [details]
patch to StreamPumper.run to make it responsive to interrupts

I ran into the same issue and was able to get the ant JVM unstuck by modifying the org.apache.tools.ant.taskdefs.StreamPumper.run() to make it more responsive to interrupt conditions. 

Please see my diff to /ant-trunk/org/apache/tools/ant/taskdefs/StreamPumper.java and let me know if it addresses any pending issues.
Comment 37 Rohit Kelapure 2012-02-22 20:46:06 UTC
Created attachment 28365 [details]
revised earlier patch slightly ... does the write more efficiently

revised earlier patch
Comment 38 James Wartell 2013-04-12 17:48:28 UTC
I had something like this in some of my code.

I found process.join() was returning, but my joins on the workers reading the input streams were not. 

You could assume that if the process is dead, or has been dead for a certain amount of time, you no longer care about the data on the streams. I ran into something like this when killing an external process. In my case, the choice is easy: I am forcibly terminating it, so I don't care about the streams. Basically, you get a reference to the streams while the process is still alive, and then at any point after it is dead (or should be) you close the streams. This worked like a charm for me. I could do this where I join the process as well, instead of where I kill it.

So....


	OutputStream outputStream = process.getOutputStream();
	InputStream inputStream = process.getInputStream();
	InputStream errorStream = process.getErrorStream();

	process.destroy();


	try
	{
		outputStream.flush();
	}
	catch(IOException e)
	{
	}

	try
	{
		outputStream.close();
	}
	catch(IOException e)
	{
	}

	try
	{
		errorStream.close();
	}
	catch(IOException e)
	{
	}

	try
	{
		inputStream.close();
	}
	catch(IOException e)
	{
	}
Comment 39 James Wartell 2013-04-23 16:26:57 UTC
Ha. Strike that. Sometimes the calls to close() still deadlock in a native method.
Comment 40 Jochen Wiedmann 2021-04-12 14:20:24 UTC
Without looking at the Ant implementation: Could this be related to

http://grumpyapache.blogspot.com/2020/10/incompatibility-of-processwaitfor.html
Comment 41 Stefan Franke 2021-04-12 14:34:22 UTC
@Jochen: your link explains why my hack (adding timeouts to join()) helped.
Comment 42 Jaikiran Pai 2021-04-12 14:59:09 UTC
Is this still noticed against Java 11 (or Java 16) versions?