| Summary: | Server incorrectly parses incoming HTTP requests and loses them | | |
|---|---|---|---|
| Product: | Tomcat 7 | Reporter: | Edward Alexandrov <stfighter> |
| Component: | Catalina | Assignee: | Tomcat Developers Mailing List <dev> |
| Status: | RESOLVED FIXED | | |
| Severity: | critical | | |
| Priority: | P2 | | |
| Version: | 7.0.21 | | |
| Target Milestone: | --- | | |
| Hardware: | PC | | |
| OS: | All | | |
| Attachments: | Screenshots from my debugger; Test case; war file | | |
Description
Edward Alexandrov
2011-09-23 11:55:42 UTC
Created attachment 27576 [details]
Screenshots from my debugger
---

The relevant objects (Processor, Request, Response) are always recycled between requests. From the description of the problem, it sounds like something is keeping a reference to one or more of these between requests and that is triggering the mix-up. To investigate this further you'll need to provide the simplest possible web application (or unit test) that demonstrates this issue. Without the ability to re-create the issue there is little the Tomcat team can do. Also, the majority of issues of this nature turn out to be application issues, so without the ability to recreate the problem the assumption will be that this is also an application issue and it will be closed as invalid.

---

Created attachment 27610 [details]
Test case
This is the test servlet which reproduces the issue. It can be used as follows: run the servlet under debugger and set breakpoint on TransportServlet.java: 102. Access it in Chrome in a tab. Click 'Reflect message' a couple of times. Open another tab (not closing the first one) and access the servlet. The breakpoint should trigger, indicating the broken request.
Ideally, you should see Error:204 in the first tab, after which the 'Reflect message' button takes no action, and a working app in the second tab.
When I deploy my server on Glassfish it works as intended.
---

Testcase included. Thanks for that, I'll take a look now. A couple of notes on the sample:

- The class name is TransportHandler, not TransportServlet
- The <web-app> element in web.xml isn't quite right and Tomcat complains if validation is enabled
- There is an invalid <comet-support-enabled> tag that Tomcat complains about if validation is enabled
- The <init-param> tags are too late in the <servlet> block and Tomcat complains if validation is enabled

I have tested the supplied test case and am unable to recreate the issue. The test environment was:

- OS: Windows 64-bit
- Java: 1.6.0_27 64-bit
- Chrome: 14.0.835.186 m
- Tomcat: 7.0.21 with NIO and APR, trunk with NIO

In order to look at this further you are going to have to provide exact steps that recreate this issue on a clean install of either the latest Tomcat release or the latest code from svn trunk. It may also help if you provide a complete WAR file to ensure commonality between environments.

---

The issue reproduces 100% of the time on all Tomcat installations I have (7.0.21 on my Windows 32-bit machine, 7.0.19 on a 32-bit Ubuntu Server test server, and 7.0.16 on my test VM). On all of them Tomcat is cleanly installed, and in server.xml the default HTTP/1.1 connector is changed to:

```xml
<Connector port="8080"
           protocol="org.apache.coyote.http11.Http11NioProtocol"
           connectionTimeout="20000"
           redirectPort="8443" />
```

Also, a management user is added. All other settings are left at their defaults.

Do you run the sample under the debugger? If not, it appears to work because of my workaround method validateRequest(), which drops invalid requests; it shouldn't be in production code. Even with the workaround you can encounter delayed request processing, which is highly undesirable. The point of the sample I provided is to put a breakpoint on TransportHandler.java:102, inspect the HttpServletRequest.coyoteRequest object, and see the incorrect headers, as in the screenshots I attached earlier.
I will also comment out this workaround method and attach the WAR; you will see that both connections and both tabs are closed (error 204), which should never happen. So, to reproduce the issue: do a clean Tomcat install, reconfigure the Connector in server.xml, open the project in an IDE (we use IntelliJ IDEA for development; project files are included), and set the breakpoint I mentioned. Then follow the steps I described earlier.

Thanks for your comments on the sample. The class is TransportHandler because it is not the servlet class itself (which is AtmosphereServlet, built into the Atmosphere library) but an Atmosphere handler. I will also validate the web.xml; however, it is generated almost automatically by my IDE.

---

Created attachment 27620 [details]
war file
Please also note the message flow. Normally, when opening the page, you should see three messages under 'Received messages':
1. State: messageReceived, Transport: polling, Body: { "success": true}
2. Pushing body: { "command": "player/reauth", "params": { "token": "ABC" } }
3. State: messageReceived, Transport: polling, Body: { "command": "system/hello" ...
And when you open the second tab when the issue arises, you will see only two of them:
1. Pushing body: { "command": "player/reauth", "params": { "token": "ABC" } }
2. State: messageReceived, Transport: polling, Body: { "command": "system/hello"
This is because the incoming POST (for Pushing body: { "command": "player/reauth" }) is "swallowed" by Tomcat and presented to TransportHandler as a GET with duplicate headers. And if you wait about 20-30 seconds, you will see the third line appear, maybe because the request is handled a second time, or something similar, after the timeout.
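The validateRequest() workaround is mentioned in this thread but never shown. Purely as an illustration (hypothetical names, plain Java collections standing in for the servlet API), a guard that drops requests carrying duplicated singleton headers, the corruption symptom described above, might look like:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

public class RequestGuard {

    // Headers that must appear at most once in a well-formed request.
    private static final Set<String> SINGLETON_HEADERS =
            Set.of("host", "content-length", "content-type");

    /**
     * Returns true if a header that should be unique arrives with more
     * than one value -- a strong hint that state from a previous request
     * on the same connection leaked into this one.
     */
    public static boolean hasDuplicateSingletonHeader(Map<String, List<String>> headers) {
        for (Map.Entry<String, List<String>> e : headers.entrySet()) {
            if (SINGLETON_HEADERS.contains(e.getKey().toLowerCase())
                    && e.getValue().size() > 1) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<String>> normal = Map.of(
                "Host", List.of("localhost:8080"),
                "Content-Type", List.of("application/json"));
        Map<String, List<String>> corrupted = Map.of(
                "Host", List.of("localhost:8080", "localhost:8080"));

        System.out.println(hasDuplicateSingletonHeader(normal));    // false
        System.out.println(hasDuplicateSingletonHeader(corrupted)); // true
    }
}
```

As the reporter notes, such a check can only drop the corrupted requests; it does not address the underlying connection-state problem.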
---

Can you test with the 7.0.22 release candidate? The download link is in the VOTE thread on dev@.

---

Thanks for the additional detail. I can see the problem now. It does reproduce with the 7.0.22 tag. I am currently trying to track down the root cause.

---

This took a while to track down but was eventually relatively simple. When using NIO and a Comet request is closed from a different thread, the poller is used to trigger the processing of the close event. The problem was that the request was marked as non-Comet before this poller task was triggered, meaning that when the task ran the request was treated as a non-Comet request rather than a Comet one. This meant that the necessary clean-up was not done, and the next request received on the connection (i.e. from the same client) was corrupted.

This has been fixed - along with some additional clean-up - in trunk and 7.0.x and will be included in 7.0.23 onwards. Thanks again for the test case. Bugs that involve multiple threads are always tricky to track down, but the test case made it a lot easier.
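The root cause described above boils down to an ordering problem: the Comet flag was cleared before the queued poller task ran, so the task no longer saw a Comet connection and skipped the clean-up. A minimal sketch of why the order matters, using illustrative names rather than actual Tomcat internals:

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class CometCloseOrdering {

    // Minimal stand-in for per-connection state (illustrative only).
    static class Connection {
        boolean comet = true;      // request is currently a Comet request
        boolean cleanedUp = false; // Comet-specific state released?
    }

    // Stand-in for the poller's event processing: Comet clean-up only
    // happens if the connection still looks like a Comet connection.
    static void processCloseEvent(Connection c) {
        if (c.comet) {
            c.cleanedUp = true;    // release Comet-specific state
        }
        c.comet = false;
    }

    // Buggy order (the pre-7.0.23 behaviour in miniature): the request
    // is marked non-Comet before the queued poller task runs, so the
    // task sees a non-Comet request and skips the clean-up.
    static Connection buggyClose() {
        Connection c = new Connection();
        Queue<Runnable> pollerTasks = new ArrayDeque<>();
        pollerTasks.add(() -> processCloseEvent(c));
        c.comet = false;           // cleared too early
        pollerTasks.poll().run();  // clean-up skipped
        return c;
    }

    // Fixed order: the poller task runs while the request is still
    // marked as Comet, so the clean-up happens before the flag drops.
    static Connection fixedClose() {
        Connection c = new Connection();
        Queue<Runnable> pollerTasks = new ArrayDeque<>();
        pollerTasks.add(() -> processCloseEvent(c));
        pollerTasks.poll().run();  // clean-up performed here
        return c;
    }

    public static void main(String[] args) {
        System.out.println("buggy cleanedUp = " + buggyClose().cleanedUp); // false
        System.out.println("fixed cleanedUp = " + fixedClose().cleanedUp); // true
    }
}
```

With the buggy ordering the skipped clean-up leaves stale state on the connection, which is exactly why the next request from the same client arrived corrupted.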