NUTCH-2575 (sub-task of NUTCH-2549: protocol-http does not behave the same as browsers)

protocol-http does not respect the maximum content-size for chunked responses

Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.14
    • Fix Version/s: 1.15
    • Component/s: protocol
    • Labels: None

    Description

      There is a bug in HttpResponse::readChunkedContent that prevents it from stopping once the content read exceeds the maximum allowed size.

      The variable contentBytesRead is meant to track how much content has been read, but it is never updated, so it never changes from its initial value (0). As a result, the size check always returns false, unless a single chunk is larger than the maximum allowed content size.

      This allows any server to cause out-of-memory errors on our side by sending an arbitrarily long chunked response.
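
      A minimal sketch of the intended behaviour, not the actual Nutch code: the class ChunkedLimitSketch, the method readChunks and its parameters are hypothetical, the chunks are assumed to be already decoded into byte arrays, and maxContent is assumed to be non-negative. It shows the running total being updated after every chunk so that the limit applies to the whole response rather than to each chunk individually.

```java
import java.io.ByteArrayOutputStream;
import java.util.List;

public class ChunkedLimitSketch {

  /**
   * Hypothetical stand-in for the chunk loop: appends decoded chunks
   * until maxContent bytes are buffered, then truncates and stops.
   */
  static byte[] readChunks(List<byte[]> chunks, int maxContent) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    int contentBytesRead = 0; // running total across all chunks

    for (byte[] chunk : chunks) {
      int remaining = maxContent - contentBytesRead;
      if (remaining <= 0) {
        break; // limit reached: stop reading further chunks
      }
      // Keep only as much of this chunk as still fits under the limit.
      int toWrite = Math.min(chunk.length, remaining);
      out.write(chunk, 0, toWrite);
      // The update the report describes as missing: without it the
      // total stays at 0 and the check above never triggers.
      contentBytesRead += toWrite;
    }
    return out.toByteArray();
  }
}
```

      With three 4-byte chunks and a limit of 10, this sketch buffers 10 bytes. Without the update to contentBytesRead, all 12 bytes (or any larger amount the server sends) would be kept.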

            People

              Assignee: Unassigned
              Reporter: Gerard Bouchar (gbouchar)
              Votes: 0
              Watchers: 5
