Details
- Bug
- Status: Closed
- Major
- Resolution: Cannot Reproduce
- 1.4
- None
- software
- Patch Available
Description
Some web pages are fetched, but their content cannot be retrieved.
After the change below, the error no longer occurs.
The change is in lib-http/src/java/org/apache/nutch/protocol/http/api/HttpBase.java (lines marked + are added):
public byte[] processDeflateEncoded(byte[] compressed, URL url) throws IOException {
  if (LOGGER.isTraceEnabled()) {
    LOGGER.trace("inflating....");
  }
  byte[] content = DeflateUtils.inflateBestEffort(compressed, getMaxContent());
+ if (content == null)
+   content = DeflateUtils.inflateBestEffort(compressed, 200000);
  if (content == null)
    throw new IOException("inflateBestEffort returned null");
  if (LOGGER.isTraceEnabled()) {
    LOGGER.trace("fetched " + compressed.length + " bytes of compressed content (expanded to " + content.length + " bytes) from " + url);
  }
  return content;
}
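For anyone who wants to try the retry idea outside Nutch, here is a minimal standalone sketch using only java.util.zip. It is not the Nutch code: the class InflateWithFallback and the methods inflateLimited, inflateWithFallback and deflate are hypothetical names, and inflateLimited is only an assumed stand-in for DeflateUtils.inflateBestEffort that returns null when it produces no output (for example with a non-positive size limit) or when the data is corrupt. It also assumes zlib-wrapped deflate data.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical standalone sketch, not part of Nutch.
class InflateWithFallback {

    // Assumed stand-in for DeflateUtils.inflateBestEffort: inflates at most
    // sizeLimit bytes and returns null if nothing could be inflated.
    static byte[] inflateLimited(byte[] compressed, int sizeLimit) {
        Inflater inflater = new Inflater(); // expects zlib-wrapped data
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished() && out.size() < sizeLimit) {
                int room = Math.min(buf.length, sizeLimit - out.size());
                int n = inflater.inflate(buf, 0, room);
                if (n == 0) {
                    break; // truncated input or a preset dictionary is required
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            return null; // corrupt stream
        } finally {
            inflater.end();
        }
        return out.size() == 0 ? null : out.toByteArray();
    }

    // Same control flow as the patch: retry once with a fixed fallback limit.
    static byte[] inflateWithFallback(byte[] compressed, int maxContent, int fallbackLimit)
            throws IOException {
        byte[] content = inflateLimited(compressed, maxContent);
        if (content == null) {
            content = inflateLimited(compressed, fallbackLimit);
        }
        if (content == null) {
            throw new IOException("inflate returned null");
        }
        return content;
    }

    // Helper that produces zlib-wrapped test data.
    static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] packed = deflate("hello hello hello".getBytes("UTF-8"));
        // A limit of -1 makes the first attempt return null in this sketch,
        // so the second attempt with 200000 is the one that succeeds.
        byte[] content = inflateWithFallback(packed, -1, 200000);
        System.out.println(new String(content, "UTF-8"));
    }
}
```

Note that the fallback limit of 200000 is a hard-coded value taken from the patch; content larger than that would still be truncated on the retry.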
Attachments
Issue Links
- is related to
  NUTCH-1736 Can't fetch page if http response header contains Transfer-Encoding:chunked (Closed)