Bug 56391 - test error for NIO and org.apache.tomcat.util.net.TestSsl
Summary: test error for NIO and org.apache.tomcat.util.net.TestSsl
Status: RESOLVED FIXED
Alias: None
Product: Tomcat 8
Classification: Unclassified
Component: Util (show other bugs)
Version: 8.0.5
Hardware: Sun Solaris
: P2 normal (vote)
Target Milestone: ----
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-04-11 13:11 UTC by Petr Sumbera
Modified: 2014-04-23 13:44 UTC (History)
0 users



Attachments
test log (7.64 KB, text/plain)
2014-04-11 13:11 UTC, Petr Sumbera
Details
test log (r1589303, Win7) (7.94 KB, text/plain)
2014-04-22 23:07 UTC, Konstantin Kolinko
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Petr Sumbera 2014-04-11 13:11:18 UTC
Created attachment 31514 [details]
test log

While this is the only error I see in whole Tomcat test suite it might be real issue. Please see attached report.

java version "1.7.0_51"
Comment 1 Konstantin Kolinko 2014-04-11 13:14:51 UTC
JDK version = ?
Comment 2 Petr Sumbera 2014-04-11 13:22:41 UTC
jdk1.7.0
Comment 3 Chuck Caldarale 2014-04-11 13:33:02 UTC
(In reply to Petr Sumbera from comment #2)
> jdk1.7.0

The exact version please (e.g., 1.7.0_25).
Comment 4 Petr Sumbera 2014-04-11 13:38:56 UTC
$ /usr/jdk/instances/jdk1.7.0/bin/java -version
java version "1.7.0_51"
Java(TM) SE Runtime Environment (build 1.7.0_51-b34)
Java HotSpot(TM) Server VM (build 24.51-b04, mixed mode)
Comment 5 Mark Thomas 2014-04-22 15:57:41 UTC
Is this repeatable? It is possible that the test behaves differently on Solaris and we simply don't handle it.
Comment 6 Rainer Jung 2014-04-22 16:44:50 UTC
Not here. Just ran that test 10 times for Tomcat 8 trunk with NIO, BIO and APR each on Solaris 10 Sparc using Java 1.7.0_51. No failures.

Petr: Can you reproduce?
Comment 7 Remy Maucherat 2014-04-22 16:59:57 UTC
I could randomly reproduce it (on Linux), same stack trace. But it is odd, it does not happen with NIO2 for me, and I don't see why it wouldn't (the SSL code is the same use of SSL engine). I can't reproduce it if I run as a single test, but I can (one out of four times, I'd say) if running the full TestSsl series.

Minor issue IMO since this doesn't happen much and this renegotiation test is just protocol abuse anyway.
Comment 8 Mark Thomas 2014-04-22 19:16:27 UTC
I can't reproduce this on Windows or OSX (tried 10+ runs) either.
Comment 9 Mark Thomas 2014-04-22 22:28:27 UTC
I've re-written the unit test. It still passes on Windows but now fails much more frequently on Windows. Getting to the bottom of why it fails is the next task.
Comment 10 Konstantin Kolinko 2014-04-22 22:39:25 UTC
With trunk at r1589300 the test now fails for me consistently in 10 of 10 runs on Windows 7 + JDK 7u55 32-bit.

I am running with

execute.validate=false
execute.test.bio=false
execute.test.nio=true
execute.test.apr=false
execute.test.nio2=false
test.accesslog=false
test.entry=org.apache.tomcat.util.net.TestSsl
test.entry.methods=testSimpleSsl,testKeyPass,testRenegotiateWorks,testRenegotiateFail

The last property is to fix order of the methods.

All 10 failures are with the same "72" vs "-1" failure:
[[[
Testcase: testSimpleSsl took 4,581 sec
Testcase: testKeyPass took 0,926 sec
Testcase: testRenegotiateWorks took 3,418 sec
        FAILED
expected:<72> but was:<-1>
junit.framework.AssertionFailedError: expected:<72> but was:<-1>
        at org.apache.tomcat.util.net.TestSsl.doRequest(TestSsl.java:158)
        at org.apache.tomcat.util.net.TestSsl.testRenegotiateWorks(TestSsl.java:135)
]]]

Apparently there is no more "testRenegotiateFail" test, so only 3 tests methods were run.
Comment 11 Remy Maucherat 2014-04-22 22:50:08 UTC
On Linux it fails 100% of the time too now (it used to be about 50%). NIO2 still does not fail for whatever reason. I can investigate tomorrow if needed.

My trace is on Linux is:
Testcase: testRenegotiateWorks took 3.216 sec
	FAILED
Failed on request number 1 after startHandshake()
junit.framework.AssertionFailedError: Failed on request number 1 after startHandshake()
	at org.apache.tomcat.util.net.TestSsl.testRenegotiateWorks(TestSsl.java:138)

Yes, testRenegotiateFail is gone, it is not needed as trunk requires Java 7, according to the commit.

BTW http://ci.apache.org/builders/tomcat-trunk is up again, but not running anything.
Comment 12 Konstantin Kolinko 2014-04-22 23:07:38 UTC
Created attachment 31550 [details]
test log (r1589303, Win7)

Test log on Windows with trunk at r1589303.

Failed on request number 2 after startHandshake(). expected:<72> but was:<-1>
Comment 13 Mark Thomas 2014-04-22 23:13:14 UTC
I have been able to make a little progress debugging this. I still have not been able to reproduce this on Windows.

1. The issue appears to be timing related. On OSX I see the error when running from the command line but not when running through Eclipse.

2. It looks like the read() and write() methods in SecureNioChannel make assumptions about the return value of tasks() that are not correct if the client has triggered renegotiation.
Comment 14 Remy Maucherat 2014-04-23 08:46:28 UTC
From my debugging, in the test it is never NEED_TASK, so tasks() is not called (and it doesn't feel wrong).

However, (tracing the write, since that is what the test is doing) the handshake status goes into NEED_WRAP and NEED_UNWRAP, which means the write should read. Not a good concept IMO ...

More specifically about NIO2, this could be very difficult if not impossible to do it "right" (depending on what ends up being needed), since mixing read and write operations after the initial handshake would corrupt data or run into pending checks. This could be a candidate to end up as a known issue.
Comment 15 Mark Thomas 2014-04-23 13:44:07 UTC
I have disabled this test for NIO (it was already disabled for APR) as the improved unit test has demonstrated that the current implementation is not reliable.

I have created bug 56448 to track implementing improved renegotiation support.