Tomcat version: 6.0.13 (no 6.0.13 available in the version field) When trying to process data in CometProcessor.event() either during the BEGIN event or during the READ event, it is impossible to guarantee that all data has been read before returning (since new data may arrive between the last statement before the return, and the return statement itself). However, the CoyoteProcessor does this check: } else if (!error && read && request.getAvailable()) { // If this was a read and not all bytes have been read, or if no data // was read from the connector, then it is an error error = true; log.error(sm.getString("coyoteAdapter.read")); } causing a severe error and the Comet request to break when this race condition manifests itself. I am using the following code according to the aio.html description to read data in my comet event handler: while (request.getInputStream().available() > 0) { // read some data } // ***** NO MORE DATA AVAILABLE return; And I am experiencing the error as described in certain conditions (rapid arrival of fragmented data): SEVERE: The servlet did not read all available bytes during the processing of the read event
available() does not make any socket access to see if bytes have arrived, so the scenario you describe cannot happen (if available returns 0, it will continue returning 0 until the next read event). If the message you saw is logged, CometAdapter.event will return false, which will lead to closing the connection, not causing a loop. My only theory is that you're doing asynchronous reads, which is not allowed, and you'll need to look into this further. What does "(no 6.0.13 available in the version field)" mean ?
I have very explicit logging and I'm 100% certain that I'm not doing any async reads. In fact, from the log I sent to the user list before reporting this as a bug you can see that everything is happening in 1 thread, namely http-8080-exec-4: 2007-06-26 14:37:08,427 DEBUG [http-8080-exec-4] com.sebster.myservlet.TomcatCometServlet - BEGIN event for request org.apache.catalina.connector.RequestFacade@d47c99 2007-06-26 14:37:08,427 DEBUG [http-8080-exec-4] com.sebster.myservlet.TomcatCometServlet - 127.0.0.1:60578 POST /mycometservlet 2007-06-26 14:37:08,432 DEBUG [http-8080-exec-4] com.sebster.myservlet.TomcatCometServlet - [24224039-a37e-40d0-a076-89d1df363390] read loop in BEGIN event, input stream data available: 1 2007-06-26 14:37:08,438 DEBUG [http-8080-exec-4] com.sebster.myservlet.TomcatCometServlet - [24224039-a37e-40d0-a076-89d1df363390] read loop done, input stream data available: 0 2007-06-26 2:37:08.MD org.apache.catalina.connector.CoyoteAdapter event SEVERE: The servlet did not read all available bytes during the processing of the read event Note that I'm not implying that available() does a socket access; what I saw was that when the loop I described in the original post ended the condition "available() == 0" was true, and when it got to CatalinaAdapter the condition Request.getAvailable() was true. Futhermore, I really am seeing the loop without the connection being closed, both on the windows platform and on the Linux platform. I'm happy to help debug this further, but I'm not making it up: I have a breakpoint in my CometProcessor.event() method, I see the Poller thread spin like crazy, and when I set a break point in the Poller the CPU load goes away. I can step through it and see that the connection stays open, it sets the event type to END on the comet event, but it never reaches my CometProcessor.event() method. Finally, what I mean with "no 6.0.13 available in the version field" is the dropdown box in which to mark the Tomcat version in the bugzilla bug form. It goes up to and including 6.0.11. Regards, Sebastiaan
Since you're apparently not willing to write things that make sense, I will ignore your report.
(In reply to comment #3) > Since you're apparently not willing to write things that make sense, I will > ignore your report. I am getting really exasperated by your replies. You give me no hint at what it is you want or what you think is not making sense. I will test your theory more by writing a wrapper around the input stream to log the thread of every read that happens just to make sure tomorrow. In either case, after I got the error the Poller went into a busy loop. I'm not smoking pot and I saw what I saw. Ok, maybe there's a bug somewhere in my code and I'm doing something wrong, I certainly don't want to exclude that possibility. But I saw the Poller go into an infinite loop (not caused by the NIO bug I was talking about previously (which is Linux only)). It did NOT close the channel. It did NOT call my event method with a READ, ERROR or END event (which I know because I have a log.debug on enter, AND a break point set). This happened directly AFTER I got the SEVERE log message. I got this on Windows and Linux. Note that I'm not theorizing now, these are the raw observations.
If you follow the code logic, if the error happens, where available() returns >0 after a READ has been invoked, the connection is never marked as a comet, so you will not get the ERROR/END events invoked. so the connection goes into the poller for the next request. Could you see if you can work up a reproducible scenario, and we can take it from there.
(In reply to comment #5) > If you follow the code logic, if the error happens, where available() returns >0 > after a READ has been invoked, the connection is never marked as a comet, so you > will not get the ERROR/END events invoked. so the connection goes into the > poller for the next request. > Could you see if you can work up a reproducible scenario, and we can take it > from there. When I look in the code I see the following: if (response.isClosed() || !request.isComet()) { res.action(ActionCode.ACTION_COMET_END, null); } else if (!error && read && request.getAvailable()) { // If this was a read and not all bytes have been read, or if no data // was read from the connector, then it is an error error = true; log.error(sm.getString("coyoteAdapter.read")); } return (!error); So when you get the error, the condition response.isClosed() || !request.isComet() must evaluate to false, which means that !response.isClosed() && request.isComet(). So for me to get the error it seems the request must already be marked as a comet request. I am not able to give you a small test case. The problem is that I have two different products that start an embedded tomcat using exactly the same class, and in one it consistently works (no errors whatsoever), and in the other it consistently fails. I have not seen this issue before (and I have already done a lot of testing in different situations), and I have no idea what the precondition is for this issue to manifest itself. However, I added extra debugging information, and I did find something strange. I do the following logging FIRST thing in my Comet processor and I wrapped the input stream to log which thread does the reading: final EventType eventType = event.getEventType(); if (Debug.ENABLED && logger.isDebugEnabled()) { try { final Field field = request.getClass().getDeclaredField("request"); //$NON-NLS-1$ field.setAccessible(true); final Request internalRequest = (Request) field.get(request); logger.debug("{} event for request facade {}, request {}", new Object[] { eventType, request, internalRequest }); //$NON-NLS-1$ } catch (final Exception e) { logger.debug("{} event for request facade {}", eventType, request); //$NON-NLS-1$ logger.debug("failed to get request", e); //$NON-NLS-1$ } } I set a breakpoint in the CoyoteAdapter where it gives the SEVERE error, looked at the id of the request object, and checked my log what events I got for that object. My results where as follows: 2007-06-28 11:16:00,021 DEBUG [http-8080-exec-4] com.sebster.myservlet.MyCometServlet - BEGIN event for request facade org.apache.catalina.connector.RequestFacade@1dc696e, request org.apache.catalina.connector.Request@18dbef1 This was the ONLY line in the log with Request@18dbef1. All the reads in this BEGIN event happen on the http-8080-exec-4 thread. After returning from the read (due to available() == 0), it immediately stops in my breakpoint in CoyoteAdapter to log the SEVERE error. Note that I first set the break point to find the id of the Request object to look for in the log. The strange thing is, that the breakpoint happens in the http-8080-exec-5 thread. The comet field in the Request@1dc696e object is set to true, and request.getAvailable() also returns true. These are my observations so far. Note that I am no expert on Tomcat internals, but I am very willing to spend time on this issue to resolve it, so any tips, suggestions, questions, or assignments :-) are welcome! To summarize: 1) I get a BEGIN event for Request@1dc696e in http-8080-exec-4 2) I read until available() == 0 in http-8080-exec-4 3) I return from the event() method in http-8080-exec-4 4) Tomcat logs a SEVERE error stating that I have not read all data * The request object is Request@1dc696e * request.isComet() == true * request.getAvailable() == true * the thread is http-8080-exec-5 * at this point there are zero log lines for thread http-8080-exec-5 Regards, Sebastiaan
I did some additional debugging on the Poller loop that occurs after the SEVERE error. This is what happens: 1) the CoyoteAdapter.event() method gets called. 2) request.read() returns false 3) the condition status == SocketStatus.STOP evaluates to true 4) the following statements are executed request.getEvent().setEventType(CometEvent.EventType.END); request.getEvent().setEventSubType(CometEvent.EventSubType.SERVER_SHUTDOWN); 5) in the following statement getFirst() returns the "basic" field, a StandardEngineValve: connector.getContainer().getPipeline().getFirst().event(request, response, request.getEvent()); 6) my CometProcessor.event() method is NOT called 7) error = false, and the event() method returns true This loop repeats ad infinitum, and consumes 100% CPU. Note that no code of my own is called in this sequence. Regards, Sebastiaan
I just wrote a simple test client that doesn't read the contents and I experienced no loop. END nor ERROR got called, so there is a problem here, the container should call one of those methods. are you able to test your solution against the latest tc6.0.x branch as well as the trunk branch?
(In reply to comment #8) > I just wrote a simple test client that doesn't read the contents and I > experienced no loop. END nor ERROR got called, so there is a problem here, the > container should call one of those methods. > are you able to test your solution against the latest > tc6.0.x branch as well as the trunk branch? A patch for the END/ERROR never being called has been submitted to both 6.0.x and trunk. This should at least give you a chance to cleanup, your original problem of available()==0 but there yet being data, causing an infinite loop is to me non reproducible. I'll wait for your feedback before marking this fixed.
(In reply to comment #9) > A patch for the END/ERROR never being called has been submitted to both 6.0.x > and trunk. This should at least give you a chance to cleanup, Thanks :-), I will test this ASAP. > your original problem of available()==0 but there yet being data, causing an > infinite loop is to me non reproducible. The main clue (as far as I can tell) of something going wrong is the fact that the BEGIN event + read loop is in 1 thread, and the error in happens another (see the summary of comment 6). Do you not agree that this should not happen? If you agree that this is wrong (whatever the cause, Tomcat or my code), I can try and debug what causes this to happen. Regards, Sebastiaan
Once the socket goes back to the poller, and the poller dispatches it again, then that can be on a different thread, as tomcat uses a thread pool, so there is no guarantee what thread the poller dispatches too
(In reply to comment #11) > Once the socket goes back to the poller, and the poller dispatches it again, > then that can be on a different thread, as tomcat uses a thread pool, so there > is no guarantee what thread the poller dispatches too Yes I understand that, but for the specific Request object instance I get only 1 comet event, in thread http-8080-exec-4 and the check of Request.getAvailable() to see if I read everything happens in http-8080-exec-5 (with no other comet events in between - I ONLY get the BEGIN event for this Request instance). Is this not strange? Should the check not be performed in the same thread before the socket goes back to the poller?
not really, getAvailable() is only called if read() returned true. in your case, the following might happen 1. HTTP request comes in, no body 2. You get BEGIN event, no additional data on the socket 3. read() returns false, so you dont get an immediate READ event,
I have found the cause of the problem. The code place where it goes wrong is ApplicationFilterFactory line 126: Request req = (Request) request; if (Globals.IS_SECURITY_ENABLED) { // Security: Do not recycle filterChain = new ApplicationFilterChain(); } else { filterChain = (ApplicationFilterChain) req.getFilterChain(); if (filterChain == null) { filterChain = new ApplicationFilterChain(); req.setFilterChain(filterChain); } } comet = req.isComet(); When a security manager is installed the filter chain is not set on the request. On subsequent Comet events, since the filter chain is null, the event is not handled properly and the poller goes into a busy loop.
Excellent, we will fix this!
The "issue" in this bug report is that the user claims that the value of "available()" randomly changes. This minor glitch is unrelated.
(In reply to comment #16) > The "issue" in this bug report is that the user claims that the value of > "available()" randomly changes. This minor glitch is unrelated. The issue was that Tomcat reported that I had not read all available bytes even though I did. The reason turns out to be because Tomcat was not calling the event method of my comet processor, which entails that indeed bytes were not being read. My original issue is thereby resolved and the summary of this bug turns out to be wrong. It just looked to me like there was a race condition, since I had a breakpoint in event() and it was not being called anymore, so I thought (erroneously) that it was complaining at the end of the BEGIN event. So, as far as I'm concerned this issue can be closed. I don't care if you mark it INVALID if that's what you want because the summary turns out to be incorrect; that's fine by me. I'm happy that the issue is resolved and the bug is fixed. :-)
(In reply to comment #15) > Excellent, we will fix this! Thanks a lot. :-) Sorry for my initially incorrect analysis of the problem. I should stick to reporting observed behavior instead of theorizing.