  1. Nutch
  2. NUTCH-2549 protocol-http does not behave the same as browsers
  3. NUTCH-2575

protocol-http does not respect the maximum content-size for chunked responses

    Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.14
    • Fix Version/s: 1.15
    • Component/s: protocol
    • Labels: None

      Description

      There is a bug in HttpResponse::readChunkedContent that prevents it from stopping once the content read exceeds the maximum allowed size.

      A variable, contentBytesRead, is meant to track how much content has been read, but it is never updated, so it always stays at zero and the size check always returns false (unless a single chunk is larger than the maximum allowed content size).

      This allows any server to cause out-of-memory errors on our side.
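
To illustrate the intended behavior, here is a minimal, self-contained sketch of a chunked-body reader that updates its byte counter after every read and stops at the limit. It is not the Nutch code or the committed fix: the class name ChunkedReadSketch, the helper readLine, and the maxContent parameter are assumptions made for this example, and only the name contentBytesRead is taken from the description above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration only; not the actual HttpResponse implementation.
final class ChunkedReadSketch {

  /** Reads one line as ASCII and returns it without the CR/LF terminator. */
  static String readLine(InputStream in) throws IOException {
    StringBuilder sb = new StringBuilder();
    int c;
    while ((c = in.read()) != -1 && c != '\n') {
      if (c != '\r') {
        sb.append((char) c);
      }
    }
    return sb.toString();
  }

  /**
   * Reads a chunked body, keeping at most maxContent bytes.
   * Content beyond the limit is not read; the result is truncated.
   */
  static byte[] readChunked(InputStream in, int maxContent) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[4096];
    int contentBytesRead = 0;

    while (true) {
      // Chunk header: hexadecimal size, optionally followed by ";extensions".
      String sizeLine = readLine(in);
      int semi = sizeLine.indexOf(';');
      if (semi >= 0) {
        sizeLine = sizeLine.substring(0, semi);
      }
      sizeLine = sizeLine.trim();
      if (sizeLine.isEmpty()) {
        break; // end of stream or missing size line
      }
      int chunkLen = Integer.parseInt(sizeLine, 16);
      if (chunkLen <= 0) {
        break; // a zero-length chunk ends the body
      }

      // Clamp this chunk to whatever is still allowed under the limit.
      int remaining = Math.min(chunkLen, maxContent - contentBytesRead);
      while (remaining > 0) {
        int n = in.read(buf, 0, Math.min(buf.length, remaining));
        if (n == -1) {
          throw new IOException("Unexpected end of stream inside a chunk");
        }
        out.write(buf, 0, n);
        remaining -= n;
        contentBytesRead += n; // the update the report describes as missing
      }

      if (contentBytesRead >= maxContent) {
        break; // limit reached: stop reading instead of buffering more
      }
      readLine(in); // consume the CRLF that follows the chunk data
    }
    return out.toByteArray();
  }
}
```

Because the counter accumulates across chunks, a server that sends many small chunks can no longer push the buffer past the limit. With the body "5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n" and a limit of 8, this sketch returns the first 8 bytes ("hello wo") and stops.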

              People

              • Assignee: Unassigned
              • Reporter: Gerard Bouchar (gbouchar)