Bug 48746 - exec task sometimes inserts extraneous newlines
Summary: exec task sometimes inserts extraneous newlines
Status: REOPENED
Alias: None
Product: Ant
Classification: Unclassified
Component: Core tasks (show other bugs)
Version: 1.8.2
Hardware: PC Linux
: P2 normal (vote)
Target Milestone: 1.8.1
Assignee: Ant Notifications List
URL: http://netbeans.org/bugzilla/show_bug...
Keywords:
Depends on: 49384
Blocks: 5003
  Show dependency tree
 
Reported: 2010-02-15 21:05 UTC by Jay Berkenbilt
Modified: 2010-12-27 11:12 UTC (History)
1 user (show)



Attachments
patch of PumpStreamHandler (689 bytes, text/plain)
2010-02-21 05:13 UTC, Antoine Levy-Lambert
Details
Handy diagnostic patch (2.52 KB, patch)
2010-05-28 12:37 UTC, Jesse Glick
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jay Berkenbilt 2010-02-15 21:05:58 UTC
I'm running apache ant 1.8.0 on Red Hat Enterprise Linux 5.4, x86_64, with the 64-bit version of java 1.6.0_18 from sun (the latest available at least as of last week).  I can also see this behavior with 1.5.0_15, which is used in some of our development environments for various reasons.

With ant 1.8.0, there is a fairly high probability that ant's exec task will generate extraneous newlines in command output.  For example, with this build.xml file:

<project default="all">
 <target name="all">
  <exec executable="cat">
   <arg value="/tmp/a"/>
  </exec>
 </target>
</project>

and the following /tmp/a:

01234567890123456789012345678901234567890123456789
01234567890123456789012345678901234567890123456789
01234567890123456789012345678901234567890123456789
01234567890123456789012345678901234567890123456789
01234567890123456789012345678901234567890123456789
01234567890123456789012345678901234567890123456789
01234567890123456789012345678901234567890123456789
01234567890123456789012345678901234567890123456789
01234567890123456789012345678901234567890123456789
01234567890123456789012345678901234567890123456789
01234567890123456789012345678901234567890123456789
01234567890123456789012345678901234567890123456789
01234567890123456789012345678901234567890123456789
01234567890123456789012345678901234567890123456789
01234567890123456789012345678901234567890123456789
01234567890123456789012345678901234567890123456789
01234567890123456789012345678901234567890123456789
01234567890123456789012345678901234567890123456789
01234567890123456789012345678901234567890123456789
01234567890123456789012345678901234567890123456789
01234567890123456789012345678901234567890123456789
01234567890123456789012345678901234567890123456789
01234567890123456789012345678901234567890123456789

"most" of the time, I get the expected output:

Buildfile: /tmp/build.xml

all:
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789

BUILD SUCCESSFUL
Total time: 0 seconds

but sometimes I get variations with an extraneous line break in an indeterminate spot.  For example:

     [exec] 01234567890123456789012345
     [exec] 678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789
     [exec] 01234567890123456789012345678901234567890123456789

BUILD SUCCESSFUL
Total time: 0 seconds

If I run ant 50 times in a loop, I see this or similar behavior (the line break is not always in the same place) about 50% of the time.  I'm running ANT like this:

while true; do if JAVA_HOME=~/tmp/jdk1.6.0_18 $ANT_HOME/bin/ant -f /tmp/build.xml | diff - /tmp/a1; then echo yes >> /tmp/a2; else echo no >> /tmp/a2; done

where /tmp/a1 just contains what the ant command is supposed to contain.

I never see this behavior with 1.7.0, and I don't believe I've seen it with 1.7.1 either.

This problem is causing false test failures for us and will thus prevent us from upgrading to 1.8.0 pending a solution.  A quick diff of ExecTask.java between versions doesn't reveal any relevant changes.  Any suggestions on where in the code to check?  Let me know if there's anything you need me to test or submit.
Comment 1 Antoine Levy-Lambert 2010-02-15 23:22:28 UTC
Hi,

I have tried this on my mac and i can confirm this.
I have this java version :

java version "1.6.0_17"
Java(TM) SE Runtime Environment (build 1.6.0_17-b04-248-10M3025)
Java HotSpot(TM) 64-Bit Server VM (build 14.3-b01-101, mixed mode)

and MacOS Snow Leopard.

My guess is that the bits of codes which would cause this must be handling input output, probably PumpStreamHandler and the classes it delegates to.

Antoine
Comment 2 Antoine Levy-Lambert 2010-02-21 05:13:40 UTC
Created attachment 25031 [details]
patch of PumpStreamHandler

This patch fixes the issue. 

I will probably commit it soon, but if meanwhile some one can explain what is the justification of a special wait behavior in StreamPumper for Windows systems ? The change makes all operating systems behave the same in this respect, so looping to wait for input in the streamPumper#waitForInput method.

I do not have yet a unit test for this issue. I am also wondering whether this change might not make some unit tests which are not behaving consistently behave more consistently.
Comment 3 Antoine Levy-Lambert 2010-02-21 06:09:24 UTC
committed my change.

We have an antunit test case testRedirector13 in src/tests/antunit/taskdefs/exec/apply-test.xml which was failing inconsistently. I bet that this change in PumpStreamHandler will prevent this test from failing.
Comment 4 Jesse Glick 2010-05-25 12:46:15 UTC
The patch causes a regression in stream handling inside NetBeans: under some circumstances, an <exec> gobbles up all subsequent output from Ant! I recommend this patch be rolled back for 1.8.2.

I am not sure if there is any reliable fix for the originally reported bug; Ant tries to log messages corresponding to complete lines, even in case the final \n is missing; but if it receives some characters without a newline it simply cannot know whether or not more characters are coming in the same line, so it just waits a short while to see if any arrive. Increasing StreamPumper.POLL_INTERVAL might reduce the likelihood of problems (untested).

Why would a test be checking the output of Ant's log, anyway? This should be considered informational. You can capture the output of <exec> to a file if it must be exact.
Comment 5 Jay Berkenbilt 2010-05-25 17:04:38 UTC
Just to answer the question of why a test would be looking at ant's output, the test suite is for a build tool that invokes ant (as well as doing many other things).  I already have filters in the test suite to compensate for unpredictable aspects of ant's output, such as durations.  If it is deemed that the output of <exec> is not reliable when output to the screen, I will just modify my test suite to normalize it in some way.  For my original purposes, I don't care about it in the general case.  I just need to make sure ant is doing what it is supposed to do when invoked from my other tool.

In other words, running ant is the objective in this case, not a means to an end.  So I care about ant's output.  But since I'm not testing ant, I can work around its quirks without damaging the integrity of my open application's test suite.
Comment 6 Antoine Levy-Lambert 2010-05-25 18:13:48 UTC
Hello Jesse,

could you explain in more details the problem that you have in Netbeans ? 

Maybe the issue that you have in NetBeans can be resolved in the netbeans integration.

there is a comment in your Netbeans bug report saying this :

"I've switched to a local (non-bundled) installation of Ant 1.8.1 (original
release from apache). This fixes the bug."


The change which I did for this bug 48746 was to systematically set useAvailable=true in o.a.t.a.t.StreamPumper. Before, useavailable was set to true on Windows and false on other platforms.

I was able to reproduce the problem described in this bug 48746 of lines being cut in the log of ant. I reproduced this on my MAC. If I did what is described in the bug report 50 times, I would get 30 times approximatively lines broken.

I did not create a JUnit testcase for the original issue, which I should do to prevent regression.

Regards,

Antoine
Comment 7 Jesse Glick 2010-05-26 09:46:30 UTC
(In reply to comment #6)
> could you explain in more details the problem that you have in Netbeans? 

I think the URL field explains it in detail. Under 1.8.1, with that example, output from <exec> is never displayed, nor is any subsequent output from Ant.

> Maybe the issue that you have in NetBeans can be resolved in the netbeans
> integration.

It is possible, and I will try to investigate the root cause, but issues involving stdio are especially nasty to track down so reverting the problematic patch seems safest.

> The change which I did for this bug 48746 was to systematically set
> useAvailable=true in o.a.t.a.t.StreamPumper. Before, useavailable was set to
> true on Windows and false on other platforms.

Right. And if I set it back to false on Linux, everything is fine. I will double-check on XP, but I think platform-specific differences in stdio handling meant that there was no problem on Windows to begin with.
Comment 8 Jesse Glick 2010-05-26 18:13:09 UTC
Oddly enough, the bug in the test case is reproducible in XP in 1.8.1 (not in 1.8.0). This fixes it for both Linux and XP:

Index: src/main/org/apache/tools/ant/taskdefs/PumpStreamHandler.java
===================================================================
--- src/main/org/apache/tools/ant/taskdefs/PumpStreamHandler.java	(revision 948598)
+++ src/main/org/apache/tools/ant/taskdefs/PumpStreamHandler.java	(working copy)
@@ -244,7 +244,7 @@
         final Thread result
             = new ThreadWithPumper(new StreamPumper(is, os,
                                                     closeWhenExhausted,
-                                                    true));
+                                                    false));
         result.setDaemon(true);
         return result;
     }

But I have not yet checked the effect on the test cases from bug #5003.
Comment 9 Jay Berkenbilt 2010-05-26 18:17:37 UTC
So, I'm just a user and am not familiar with the internals of ant, but isn't it pretty clear that if changing this parameter to true breaks one thing and changing it to false breaks something else, that neither code is correct?  I hope we don't fall into the trap here of choosing which bug we'd rather have but instead continue to investigate why the change to fix the extraneous newline problem causes the other issue.  Sorry if I'm stating the obvious here.
Comment 10 Jesse Glick 2010-05-28 12:34:56 UTC
Caused by PumpStreamHandler.finish interrupting the thread, which can cause a listener based on java.nio.channel to abruptly terminate with a ClosedByInterruptException. I think somehow timing-dependent; the extra waiting for output caused by useAvailable probably causes the pumper thread to hit the JOIN_TIMEOUT, thus falling through to the backup behavior of interrupting it until it dies.

Workaround on listener side is to forcibly clear the thread interrupt status immediately before writing to the channel. Still subject to a race condition but seems to work.
Comment 11 Jesse Glick 2010-05-28 12:37:03 UTC
Created attachment 25494 [details]
Handy diagnostic patch
Comment 12 Stefan Bodewig 2010-06-14 06:30:21 UTC
There also is bug 46805 which somehow may also be related.

Of the suggested changes over there we have performed one (join timeout) but
neglected two that we may want to rethink:

* unconditionally call process.destroy and don't close the streams at all
  (rely on process.destroy to do that).  I'm not convinced this is going to help
  or even poses a problem.  At the point where we close the streams either
  process.waitFor has succeeded or we have invoked process.destroy

* first close the streams and then stop the threads
  At first glance I'd expect this change to make it more likely to miss
  output written by the process but at the same time it may force buffers to
  get flushed.

I'll look into bringing real testcases for bug 5003 into Ant's trunk so we can
test a few things.  I don't think I ever checked in the ones I used by the
end of 2008 and don't have them anymore, so I'll have to recreate them.
Comment 13 Stefan Bodewig 2010-06-14 06:47:39 UTC
I did turn Peter's example code into an AntUnit test back then in
svn revision 927753 

testDoesntWaitForChildren in src/tests/antunit/taskdefs/exec/exec-test.xml
takes three seconds on my Win7 machine when run with Ant 1.8.[01] and
21 seconds with Ant 1.7.1.

If I apply the suggested change of comment 8 I'm back at 22 seconds again,
swapping close/stopThreads doesn't change that either.  But if I swap them,
I don't even see the "finished" output the original process executed
so it would actually make things worse.

So whatever we do, the flag has to remain true on Windows.
Comment 14 Baron Roberts 2010-06-28 23:27:03 UTC
I have been experiencing the LeadPipeInputStream Broken Pipe error with Ant 1.8.1 and Antoine pointed me at this issue. I am running on Windows XP and my use case is the Java Task with a LineCOntainsRegExp output filter chain redirector. In Ant 1.7.1 I would see the broken pipe error very infrequently, once every 5 - 10 builds. In Ant 1.8.1 I see it with every build.

I have tried the 6/28/2010 trunk source and still see the problem. I have also tried changing the flag in createPumper and still see the problem.
Comment 15 Greg Schueler 2010-07-01 20:40:17 UTC
This bug appears unresolved in ant 1.8.1.

using 1.8.1 release and Jay's original test build.xml I can easily see it.

Mac OS X 10.6.4, java version "1.6.0_20"