Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
Description
This may be the reason why tests behave as crazy as they do on FreeBSD (lucene jenkins). Here's the story.
I looked at Solr logs and saw this:
2> 1012153 T10 oejut.QueuedThreadPool.doStop WARN 4 threads could not be stopped
just before failures related to "socket/ port already bound" in SSLMigrationTest. Jetty's QueuedThreadPool attempts to wait for pool threads, then terminates them (and waits again). The wait time is configurable, but the code in Solr's JettySolrRunner that sets it is broken:
private void init(String solrHome, String context, int port, boolean stopAtShutdown) {
  ...
  if (threadPool != null) {
    threadPool.setMaxThreads(10000);
    threadPool.setMaxIdleTimeMs(5000);
    threadPool.setMaxStopTimeMs(30000);
  }
The threadPool variable here is always null because it is only assigned after Jetty starts, and this configuration block runs before that. The threadPool != null condition is therefore never true, and the code that configures those timeouts is dead.
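To make the ordering problem concrete, here is a minimal sketch. It does not use Jetty at all: StubPool, Runner, initBroken and initFixed are hypothetical names invented for illustration, and the "fixed" ordering is only one plausible way to repair it, not necessarily what was committed.

```java
public class DeadConfigSketch {
    // Hypothetical stand-in for a thread pool; this is NOT Jetty's QueuedThreadPool.
    static final class StubPool {
        int maxStopTimeMs = 0; // default when nobody configures it
    }

    static final class Runner {
        StubPool threadPool; // stays null until start() assigns it

        // Stand-in for "jetty starts": this is where the pool gets assigned.
        void start() {
            threadPool = new StubPool();
        }

        // Same ordering as described above: configure first, assign later.
        void initBroken() {
            if (threadPool != null) {
                threadPool.maxStopTimeMs = 30000; // never reached, threadPool is still null
            }
            start();
        }

        // Assumed fix: assign the pool first, then configure it.
        void initFixed() {
            start();
            if (threadPool != null) {
                threadPool.maxStopTimeMs = 30000;
            }
        }
    }

    public static int brokenStopTime() {
        Runner r = new Runner();
        r.initBroken();
        return r.threadPool.maxStopTimeMs;
    }

    public static int fixedStopTime() {
        Runner r = new Runner();
        r.initFixed();
        return r.threadPool.maxStopTimeMs;
    }

    public static void main(String[] args) {
        System.out.println("broken ordering, stop time: " + brokenStopTime());
        System.out.println("fixed ordering, stop time: " + fixedStopTime());
    }
}
```

With the broken ordering the pool keeps its default stop time, which matches the symptom of the timeouts silently never being applied.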
That's not a biggie; I fixed it. The problem remains, however: even with a long wait time, threads blocked in an accept() call are not interrupted. I wrote a small test class:
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;

public class Foo {
  public static void main(String[] args) throws Exception {
    final ServerSocketChannel ssc = ServerSocketChannel.open();
    ssc.configureBlocking(true);
    ssc.socket().setReuseAddress(true);
    ssc.socket().bind(new InetSocketAddress(0), 20);
    System.out.println("Port: " + ssc.socket().getLocalPort());

    Thread t = new Thread() {
      @Override
      public void run() {
        try {
          System.out.println("Thread accept();");
          ssc.accept().close();
          System.out.println("Done?");
        } catch (Exception e) {
          System.out.println("Thread ex: " + e);
        }
      }
    };
    t.start();

    Thread.sleep(2000);
    t.interrupt();
    Thread.sleep(1000);
    System.out.println(t.getState());
  }
}
If you run it on Windows, for example, here's the expected result:
Port: 666
Thread accept();
Thread ex: java.nio.channels.ClosedByInterruptException
TERMINATED
Makes sense. On FreeBSD though, the result is:
Port: 32596
Thread accept();
RUNNABLE
Interestingly, the thread IS terminated after ctrl-c is pressed...
I think this is a showstopper since it violates the contract of accept(), which states:
ClosedByInterruptException - If another thread interrupts the current thread while the accept operation is in progress, thereby closing the channel and setting the current thread's interrupt status
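For comparison, here is a sketch of the other documented way to unblock accept(): closing the channel from a second thread instead of interrupting the blocked one. The ServerSocketChannel Javadoc says the blocked thread should then get an AsynchronousCloseException. The class name CloseInsteadOfInterrupt and its run() helper are made up for this example, and I have not checked whether this behaves any better on the FreeBSD JVM; treat it as something to try, not a confirmed workaround.

```java
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.util.concurrent.atomic.AtomicReference;

public class CloseInsteadOfInterrupt {
    // Returns { what the accepting thread observed, final thread state }.
    public static String[] run() throws Exception {
        final ServerSocketChannel ssc = ServerSocketChannel.open();
        ssc.configureBlocking(true);
        ssc.socket().setReuseAddress(true);
        ssc.socket().bind(new InetSocketAddress(0), 20);

        final AtomicReference<String> seen = new AtomicReference<>("none");
        Thread t = new Thread(() -> {
            try {
                ssc.accept().close(); // blocks; nobody ever connects
                seen.set("accepted");
            } catch (Exception e) {
                seen.set(e.getClass().getSimpleName());
            }
        });
        t.start();

        Thread.sleep(500); // give the thread time to block in accept()
        ssc.close();       // close from this thread rather than calling t.interrupt()
        t.join(5000);      // bounded wait so a hung accept() cannot hang the caller

        return new String[] { seen.get(), t.getState().toString() };
    }

    public static void main(String[] args) throws Exception {
        String[] result = run();
        System.out.println("Thread saw: " + result[0]);
        System.out.println(result[1]);
    }
}
```

If the thread state printed at the end is still RUNNABLE on a given platform, closing the channel is no more effective there than interrupt().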
Issue Links
- relates to: LUCENE-5786 Unflushed/ truncated events file (hung testing subprocess) (Resolved)