  1. Nutch
  2. NUTCH-374

When http.content.limit is set to -1 and Response.CONTENT_ENCODING is gzip or x-gzip, nothing can be fetched.


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8, 0.8.1
    • Fix Version/s: 0.9.0
    • None
    • None

    Description

      I set "http.content.limit" to -1 so that fetched content is not truncated.
      However, when the response was encoded with gzip or x-gzip, the content could not be uncompressed.

      I found that the problem is in HttpBase.processGzipEncoded (plugin lib-http):
      ...
      byte[] content = GZIPUtils.unzipBestEffort(compressed, getMaxContent());
      ...
      This call does not treat -1 as "no limit", so the code must be modified as follows:

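      byte[] content;
      if (getMaxContent() >= 0) {
        // A non-negative limit: decompress at most getMaxContent() bytes.
        content = GZIPUtils.unzipBestEffort(compressed, getMaxContent());
      } else {
        // A negative limit (-1) means "no limit": decompress everything.
        content = GZIPUtils.unzipBestEffort(compressed);
      }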
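      The idea behind the fix above can be tried outside Nutch with a small standalone sketch. This is not the Nutch GZIPUtils implementation; the class GzipLimitSketch and its methods gzip and unzipWithLimit are hypothetical names made up for illustration, built only on java.util.zip. It assumes that a negative limit should mean "no limit", mirroring how http.content.limit = -1 is meant to behave.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration class; not part of Nutch.
class GzipLimitSketch {

  // Helper that produces gzip-compressed input for the example.
  static byte[] gzip(byte[] raw) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
      gz.write(raw);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.toByteArray();
  }

  // Decompresses on a best-effort basis.
  // maxContent >= 0: return at most maxContent bytes.
  // maxContent <  0: assumed to mean "no limit" (as with -1 in the report).
  static byte[] unzipWithLimit(byte[] compressed, int maxContent) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[4096];
    try (GZIPInputStream in =
             new GZIPInputStream(new ByteArrayInputStream(compressed))) {
      while (true) {
        int want = buf.length;
        if (maxContent >= 0) {
          int remaining = maxContent - out.size();
          if (remaining <= 0) {
            break; // limit reached, stop reading
          }
          want = Math.min(want, remaining);
        }
        int n = in.read(buf, 0, want);
        if (n < 0) {
          break; // end of the compressed stream
        }
        out.write(buf, 0, n);
      }
    } catch (IOException e) {
      // Best effort: keep whatever was decompressed before the error.
    }
    return out.toByteArray();
  }

  public static void main(String[] args) {
    byte[] raw = "hello gzip world".getBytes(StandardCharsets.UTF_8);
    byte[] compressed = gzip(raw);
    System.out.println(unzipWithLimit(compressed, -1).length); // no limit
    System.out.println(unzipWithLimit(compressed, 5).length);  // truncated
  }
}
```

      Handling the negative value inside one method is a design choice made for this sketch; the fix in the report instead branches at the call site and picks the overload without a limit.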

          People

            pkosiorowski Piotr Kosiorowski
            chinawab King Kong