Bug 50044 - NIO HTTP connector aborts GET requests incorrectly.
Summary: NIO HTTP connector aborts GET requests incorrectly.
Status: RESOLVED FIXED
Alias: None
Product: Tomcat 6
Classification: Unclassified
Component: Connectors (show other bugs)
Version: 6.0.29
Hardware: PC Windows NT
: P2 normal (vote)
Target Milestone: default
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-10-04 21:39 UTC by goberman
Modified: 2010-11-21 15:09 UTC (History)
1 user (show)



Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description goberman 2010-10-04 21:39:54 UTC
I use NIO HTTP Tomcat connector org.apache.coyote.Http11NioProtocol to implement Comet streaming to Android phones. 
The application opens 'primary' persistent Comet connection to receive data from the server. Server sends data to the client periodically. Client can terminate this GET by sending "disconnect" GET command to the server. The "disconnect" command completes server response and cleans up client connection state. Each client has a unique client Id to identify it.

So the "connect" GET looks like: 
/WebContentGateway/Controller?clientId=notimportant&command=connect

And disconenct looks like:
/WebContentGateway/Controller?clientId=notimportant&command=disconnect
The problem I uncovered is that "disconnect" GET somehow gets truncated and never invoked on the server side (looks like it is "lost"). I traced it with Wireshark and Tomcat does receive GET request, but just closes connection without invoking servlet/ sending GET response (the “disconnect” GET socket connection is obviously different from the “primary” connection).
I have both client and servlet code and can provide any support to reproduce the problem.
This problem did not exist with build 6.0.16, but can be replicated with the latest Tomcat 6 and 7. It is very important for me to resolve it -  NIO HTTP Tomcat does not look solid it this point.

Client code just send “connect” command followed by “disconnect”. There should not be any threading contention because I serialize sending commands in one thread. It hangs on readThread.join(); because "primary" thread does not terminate. Also, if I throttle the for loop by putting sleep, the problem does not occur, so it is some sort of a racing condition in Tomcat.

CLIENT CODE:

public class Test extends AndroidTestCase {
    private static final String CLIENT_ID_NOTIMPORTANT = "clientId=notimportant";
	private static String url = "http://172.16.16.33/WebContentGateway/Controller?";
	private static InputStream cometConnectionInputStream;
	private static Thread readThread;
    
    public void testConnectDisconnect() {
		for (int i = 0; i < 100; i++) {
	        try {
	            URL cometUrl = new URL(url + CLIENT_ID_NOTIMPORTANT + "&command=connect");
	            
	            HttpURLConnection cometConnection = (HttpURLConnection) cometUrl.openConnection();
	            cometConnection.setRequestMethod("GET");
	            cometConnection.setDoInput(true);
	            cometConnection.setUseCaches(false);
	            
	            cometConnection.setConnectTimeout(10000);
	            cometConnection.setReadTimeout(0);
	
	            cometConnectionInputStream = cometConnection.getInputStream();
	            final CountDownLatch readLatch = new CountDownLatch(1);
	            readThread = new Thread(new Runnable() {
					
					public void run() {
				        System.out.println("Started Connect GET thread.");
				        readLatch.countDown();
					    byte[] readConnectionReadBuffer = new byte[10 * 1024];
				        try {
				            while ((cometConnectionInputStream.read(readConnectionReadBuffer, 0, readConnectionReadBuffer.length)) >= 0) {
				            }
				        } 
				        catch (Throwable e) {
				        }
				        System.out.println("Connect GET finished.");
					}
				});
	            readThread.start();
	            readLatch.await();
	            
	            URL disconnectUrl = new URL(url + CLIENT_ID_NOTIMPORTANT + "&command=disconnect");
	            
		        System.out.println("Sending disconnect");
	            HttpURLConnection disconnectConnection = (HttpURLConnection) disconnectUrl.openConnection();
	            disconnectConnection.setRequestMethod("GET");
	            disconnectConnection.setDoInput(true);
	            disconnectConnection.setUseCaches(false);
	            
	            disconnectConnection.setConnectTimeout(10000);
	            disconnectConnection.setReadTimeout(0);
		        InputStream disconnectConnectionInputStream = disconnectConnection.getInputStream();
			    byte[] writeConnectionReadBuffer = new byte[10 * 1024];
				while ((disconnectConnectionInputStream.read(writeConnectionReadBuffer, 0, writeConnectionReadBuffer.length)) >= 0) {
		        }
		
		        disconnectConnectionInputStream.close();
		        
		        readThread.join();
	            
	        }
	        catch (Exception e) {
	        	e.printStackTrace();
	        }
		}
	}
}

SERVER CODE:
public class WebFrameworkServlet extends HttpServlet implements CometProcessor {
	private Map<String, CometClientInfo> cometClientsInfoMap = new ConcurrentHashMap<String, CometClientInfo>();
	private Map<HttpServletResponse, String> cometResponsesMap = new ConcurrentHashMap<HttpServletResponse, String>();
	
	public void event(CometEvent event) throws IOException, ServletException {
		HttpServletRequest request = event.getHttpServletRequest();
		HttpServletResponse response = event.getHttpServletResponse();
		if (event.getEventType() == CometEvent.EventType.BEGIN) {
			if (request.getParameter(ParamNames.IN_PARAM_COMMAND).equals(ParamNames.COMMAND_NAME_CONNECT)) {
				
				String clientId = request.getParameter(ParamNames.COMMAND_LOGIN_CLIENT_ID);
				
				response.setHeader("pragma", "no-cache,no-store");
				response.setHeader("cache-control", "no-cache,no-store,max-age=0,max-stale=0");
				
				event.setTimeout(Integer.MAX_VALUE);
				
				Client client = new Client(clientId, null, null);
				
				CometClientInfo cometClientInfo = new CometClientInfo(client, response, 0, event);
				cometClientsInfoMap.put(client.getClientId(), cometClientInfo);
				cometResponsesMap.put(response, client.getClientId());
				
				PrintWriter out = response.getWriter();
				out.print("{\"statusCode\" : 0}");
				out.flush();
			}
			if (request.getParameter(ParamNames.IN_PARAM_COMMAND).equals(ParamNames.COMMAND_NAME_DISCONNECT)) {
				String clientId = request.getParameter(ParamNames.COMMAND_LOGIN_CLIENT_ID);
				CometClientInfo existingClientInfo = cometClientsInfoMap.remove(clientId);
				response.setContentType("text/json");
				PrintWriter out = response.getWriter();
				out.print("{\"statusCode\" : 0}");
				out.close();
				
				if (existingClientInfo != null) {
					cometResponsesMap.remove(existingClientInfo.getResponse());
					existingClientInfo.getWriter().close();
					existingClientInfo.getEvent().close();
				}
			}
		} 
		else if (event.getEventType() == CometEvent.EventType.ERROR) {
			event.close();
			
			String clientId = cometResponsesMap.remove(response);
			if (clientId != null) {
				CometClientInfo cometClientInfo = cometClientsInfoMap.remove(clientId);
				if (cometClientInfo != null) {
					cometClientInfo.getClient().setDisconnectTime(System.currentTimeMillis());
				}
			}
		} 
		else if (event.getEventType() == CometEvent.EventType.END) {
			event.close();
			
			String clientId = cometResponsesMap.remove(response);
			if (clientId != null) {
				CometClientInfo cometClientInfo = cometClientsInfoMap.remove(clientId);
				if (cometClientInfo != null) {
					cometClientInfo.getClient().setDisconnectTime(System.currentTimeMillis());
				}
			}
		}
	}
}
Comment 1 Mark Thomas 2010-10-08 11:01:14 UTC
The provided test case is incomplete. I had to reverse engineer a handful of support classes.

Testing with the latest 7.0.x code shows no issues. I don't see any hanging on readThread.join(). The connectors in 7.0.x recently went through a major refactoring that addressed some async threading issues and may have fixed the root cause here too.

Next up, testing with the latest 6.0.x...
Comment 2 Mark Thomas 2010-10-08 11:15:30 UTC
I do see problems running this on Tomcat 6. Investigating now...
Comment 3 Mark Thomas 2010-10-08 13:03:33 UTC
No joy tracking this down yet.

Also, Tomcat 7 seems to be adding a 1s delay to each request.
Comment 4 Mark Thomas 2010-10-10 16:36:28 UTC
Similar issues with the APR connector in 6.0.x

The delay in Tomcat 7 is the selectorTimeout.

Still investigating...
Comment 5 Mark Thomas 2010-10-10 18:11:05 UTC
I've tracked down and fixed the issue in the APR/native connector for Tomcat 7. The fix will be in 7.0.4 onwards.
Comment 6 Mark Thomas 2010-11-21 15:09:46 UTC
Best I can tell, the 'lost' GET is triggered by bug 50072. Therefore, I am resolving this as fixed. I'm not marked it as a duplicate as there were some 7.0.x issues that were fixed under this bug reference.