Uploaded image for project: 'Nutch'
  1. Nutch
  2. NUTCH-1089

short compressed pages caused Exception

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • None
    • None
    • Patch Available

    Description

      Hi,

      tested nutch on compressed pages, and on pages with Basic Auth and compression. On short compressed pages this Exception is thrown:

      2011-08-19 17:06:55,190 ERROR httpclient.Http - java.io.IOException: unzipBestEffort returned null
      2011-08-19 17:06:55,190 ERROR httpclient.Http - at org.apache.nutch.protocol.http.api.HttpBase.processGzipEncoded(HttpBase.java:310)
      2011-08-19 17:06:55,191 ERROR httpclient.Http - at org.apache.nutch.protocol.httpclient.HttpResponse.<init>(HttpResponse.java:163)
      2011-08-19 17:06:55,191 ERROR httpclient.Http - at org.apache.nutch.protocol.httpclient.Http.getResponse(Http.java:154)
      2011-08-19 17:06:55,191 ERROR httpclient.Http - at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:138)
      2011-08-19 17:06:55,191 ERROR httpclient.Http - at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:628)

      In same cases Basic Auth failt also.

      Works fine with the patch.

      Attachments

        1. HttpResponsePatch.patch
          0.8 kB
          simone frenzel

        Issue Links

          Activity

            People

              Unassigned Unassigned
              psimone simone frenzel
              Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: