Affects Version/s: 2.20.1, 2.22.0, 3.0.0-M2, 3.0.0-M1
Fix Version/s: None
Component/s: Maven Surefire Plugin
I'm seeing spurious "The forked VM terminated without properly saying goodbye. VM crash or System.exit called?" messages when running unit tests in a big multi-module project.
OS: Windows 10, running Maven 3.5.0 to 3.6.0 and different versions of Surefire (2.20.1 to 3.0.0-M2), Java 8u171 to 8u191.
I'm running Maven from the command line using MinTTY (Cygwin).
Things I tried which have no effect:
- Reboot / Cold boot (happens first thing on Monday morning when I come into the office and turn on my PC).
- More free memory (happens when I only have a single window open). I have 16GB of RAM.
- Different terminal. I tried CMD prompt and urxvt (Cygwin/X).
- Different versions of the Surefire plugin or Maven
- Different JDK 8 builds
Things that affect the bug:
- Redirecting Maven's stdout to a file: mvn ... | tee mvn.log
- Redirecting all log output to a file using logback-test.xml
- Running Surefire with forkCount=0
- Running a subset of the tests (-Dtest=...)
- Pending Windows updates (I think, not sure about this one).
Counts: I've never seen it with forkCount=0 (~20 test builds). I've never seen it when redirecting all log output to a file (~10 builds). Redirecting Maven's stdout with tee sometimes helps, but not always.
One thing I noticed is that one of the tests creates an ActiveMQ broker and uses a shutdown hook to stop it. So I created a small test project which demonstrates that Surefire will sometimes cut off stdout. I think that happens because the main process kills the child after a timeout (correct?).
So my guess would be that shutdown hooks can mess with the pipeline between the surefire child VM and main Maven process. ActiveMQ might be worse since it stops threads and execution pools (so the output comes slowly with a couple of exceptions sprinkled in when one component loses connection because another is shutting down).
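To illustrate the kind of shutdown hook I mean, here is a minimal sketch. It is not the attached test project and not ActiveMQ code; the class name SlowShutdownHookDemo, the method slowHook, and the line count and delay are all made up for illustration. The hook keeps writing to stdout slowly after main() has returned, which is roughly what a broker does while it stops its threads.

```java
import java.io.PrintStream;

// Hypothetical demo class (not part of Surefire or ActiveMQ).
class SlowShutdownHookDemo {

    // Builds a hook thread that prints `lines` lines with a pause after each one,
    // imitating a component that logs slowly while it shuts down.
    static Thread slowHook(PrintStream out, int lines, long delayMillis) {
        return new Thread(() -> {
            for (int i = 1; i <= lines; i++) {
                out.println("shutdown hook line " + i + " of " + lines);
                out.flush();
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop printing.
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }, "slow-shutdown-hook");
    }

    public static void main(String[] args) {
        // Assumed values: 20 lines, 500 ms apart, so the hook runs for about 10 seconds.
        Runtime.getRuntime().addShutdownHook(slowHook(System.out, 20, 500));
        System.out.println("main finished; hook output follows");
    }
}
```

If a parent process stops reading the child's stdout, or kills the child, before the hook has finished, the last lines never show up in the parent's log. Whether that is what Surefire does here is only my guess.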
But now it gets weird. When the build succeeds, it takes about 5 minutes to run 1028 tests. The log is 25 MB.
When it fails, it takes ~8 minutes to run ~700-800 tests (this number varies) and the log stops in the middle of a test but is also 25 MB.
Some of the time discrepancy is probably because writing to a file is faster than printing on a terminal. The strange part is that the log file is about the same size but 30% of the tests haven't run. Most tests log a lot, so I would expect to see a difference of at least a few MB. The Maven part (which contains escape sequences, etc.) is just 60 KB.
Maybe the parent takes some part of the log output as "child terminated".
I'm running out of ideas for what to try next. I think a way to log the communication between parent and child would help. Also, the parent should terminate the child and then read stdout until EOF so we can see anything that happens afterwards.
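A rough sketch of the "read stdout until EOF" part, using only the standard java.lang.Process API. This is not how Surefire is implemented; DrainChildOutput, drain and runAndCapture are hypothetical names. The sketch deliberately does not kill the child. It reads until the child closes its stdout and only then collects the exit code, so anything printed by shutdown hooks still ends up in the captured output.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not Surefire code).
class DrainChildOutput {

    // Reads the stream until EOF (read() returns -1) and returns everything as text.
    // UTF-8 is an assumption; a real child may use the platform encoding.
    static String drain(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    // Starts the command, captures stdout and stderr together until EOF,
    // then waits for the exit code and appends it to the captured text.
    static String runAndCapture(List<String> command) throws IOException, InterruptedException {
        Process child = new ProcessBuilder(command)
                .redirectErrorStream(true) // merge stderr into stdout
                .start();
        String output = drain(child.getInputStream()); // returns only at EOF
        int exitCode = child.waitFor();
        return output + "[child exit code: " + exitCode + "]";
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java DrainChildOutput.java <command> [args...]");
            return;
        }
        System.out.println(runAndCapture(Arrays.asList(args)));
    }
}
```

Because the exit code is read only after EOF, a log that stops in the middle of a test would at least show whether the child was still writing when the parent gave up on it.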