Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Not A Problem
1.4, 1.5
None
None
Patch Available
Description
Some URLs always time out with protocol-http but not with protocol-httpclient. The stack trace is always the same:
2012-04-20 11:25:44,275 ERROR http.Http - Failed to get protocol output
java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    at java.io.FilterInputStream.read(FilterInputStream.java:116)
    at java.io.PushbackInputStream.read(PushbackInputStream.java:169)
    at java.io.FilterInputStream.read(FilterInputStream.java:90)
    at org.apache.nutch.protocol.http.HttpResponse.readPlainContent(HttpResponse.java:228)
    at org.apache.nutch.protocol.http.HttpResponse.<init>(HttpResponse.java:157)
    at org.apache.nutch.protocol.http.Http.getResponse(Http.java:64)
    at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:138)
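For anyone trying to reproduce this kind of trace outside Nutch, here is a minimal sketch of one way a "Read timed out" can occur while reading a response body: the server sends its headers and then keeps the connection open without sending anything further, so a client that reads until end of stream eventually hits its socket timeout. This is not a diagnosis of this issue and does not use any Nutch code. The class name ReadTimeoutSketch, the method readTimesOut, and the local stalling server are all hypothetical and exist only for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of Nutch.
class ReadTimeoutSketch {

    /**
     * Starts a local server that sends response headers and then stalls,
     * connects to it with the given SO_TIMEOUT, and reads until end of stream.
     * Returns true if the read ended with a SocketTimeoutException.
     */
    static boolean readTimesOut(int timeoutMillis) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread stallingServer = new Thread(() -> {
                try (Socket peer = server.accept()) {
                    OutputStream out = peer.getOutputStream();
                    out.write("HTTP/1.0 200 OK\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
                    out.flush();
                    // Keep the connection open without sending a body or closing it.
                    Thread.sleep(timeoutMillis * 4L);
                } catch (IOException | InterruptedException ignored) {
                    // Nothing to clean up in this sketch.
                }
            });
            stallingServer.setDaemon(true);
            stallingServer.start();

            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
                client.setSoTimeout(timeoutMillis); // a blocked read() fails after this many ms
                InputStream in = client.getInputStream();
                byte[] buffer = new byte[4096];
                try {
                    while (in.read(buffer) != -1) {
                        // Keep reading until the server closes the connection.
                    }
                    return false; // reached end of stream without timing out
                } catch (SocketTimeoutException e) {
                    return true; // same exception type as in the trace above
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(500));
    }
}
```

If the affected URLs behave like the stalling server in this sketch, a packet capture of one fetch should show the headers arriving followed by silence until the timeout fires.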
Some example URLs:
Attachments
Issue Links
- is related to NUTCH-1825 protocol-http may hang for certain web pages (Closed)