Nutch / NUTCH-2729

protocol-okhttp: fix marking of truncated content

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.15
    • Fix Version/s: 1.16
    • Component/s: plugin, protocol
    • Labels:
      None

      Description

      The protocol-okhttp plugin marks content as "truncated" and records the reason for the truncation: content limit exceeded, time limit exceeded, or network disconnect during fetch.

      The detection of truncation by content limit has one bug: if the fetched content is exactly the size of the content limit, the loop that requests more content exits. At that point it is not known whether further content follows. The loop should instead continue and request one more byte, so that it can reliably detect whether the content is truncated.
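
      The "one byte more" idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the plugin's actual code: it reads from a plain java.io.InputStream rather than the OkHttp response body, and the names TruncationSketch, LimitedRead and readWithLimit are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class TruncationSketch {

  /** Result of a limited read: the kept bytes and whether more content existed. */
  public static final class LimitedRead {
    public final byte[] content;
    public final boolean truncated;

    LimitedRead(byte[] content, boolean truncated) {
      this.content = content;
      this.truncated = truncated;
    }
  }

  /**
   * Reads at most maxContent bytes, but requests maxContent + 1 so that
   * "exactly at the limit" and "longer than the limit" can be told apart.
   * Assumes 0 <= maxContent < Integer.MAX_VALUE.
   */
  public static LimitedRead readWithLimit(InputStream in, int maxContent)
      throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[4096];
    int remaining = maxContent + 1; // the extra byte is only a probe
    while (remaining > 0) {
      int n = in.read(buf, 0, Math.min(buf.length, remaining));
      if (n == -1) {
        break; // end of stream reached before the probe byte
      }
      out.write(buf, 0, n);
      remaining -= n;
    }
    byte[] all = out.toByteArray();
    // Truncated only if the probe byte actually arrived.
    boolean truncated = all.length > maxContent;
    byte[] content = truncated ? Arrays.copyOf(all, maxContent) : all;
    return new LimitedRead(content, truncated);
  }
}
```

      With this approach, a stream of exactly maxContent bytes is reported as not truncated, while a stream of maxContent + 1 bytes is cut to maxContent bytes and reported as truncated.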

      Note that the Content-Length header cannot be used to determine truncation reliably: for compressed or chunked content it does not indicate the real content length, and in other cases it may simply be wrong.
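
      The compressed case can be illustrated with a small sketch. It assumes a server that sends the body with Content-Encoding: gzip, so that Content-Length describes the compressed bytes on the wire rather than the decoded content. The class ContentLengthSketch and its helper gzip are hypothetical names, not part of Nutch.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.zip.GZIPOutputStream;

public class ContentLengthSketch {

  /** Gzip-compresses the given bytes, standing in for a gzip-encoded response body. */
  public static byte[] gzip(byte[] raw) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
      gz.write(raw);
    }
    return bos.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    // Highly repetitive body: 10,000 copies of the byte 'a'.
    byte[] body = new byte[10_000];
    Arrays.fill(body, (byte) 'a');

    byte[] wire = gzip(body);
    // The wire size is what Content-Length would describe in this scenario;
    // the decoded size is what a content limit on the fetched content sees.
    System.out.println("bytes on the wire: " + wire.length);
    System.out.println("decoded bytes:     " + body.length);
  }
}
```

      Comparing the decoded size against a limit derived from the wire size (or the reverse) would give the wrong answer here, which is why probing the stream itself is the more reliable check.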

              People

              • Assignee: Sebastian Nagel (snagel)
              • Reporter: Sebastian Nagel (snagel)
              • Votes: 0
              • Watchers: 3
