Nutch / NUTCH-2699

Protocol-okhttp: needless loops to increment requested bytes counter when more content is already buffered

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.15
    • Fix Version/s: 1.16
    • Component/s: protocol
    • Labels: None
    • Patch Info: Patch Available

    Description

      The okhttp library used by the plugin protocol-okhttp buffers content internally and has often already buffered more content than was requested. The plugin should immediately set the requested-bytes counter to the size of the buffered content. Otherwise it loops needlessly once the buffered size comes close to the content limit, because the increment steps are too small:

      2019-03-11 14:56:36,642 DEBUG okhttp.OkHttpResponse - http://localhost/large.pdf - http/1.1 200 OK
      2019-03-11 14:56:36,643 DEBUG okhttp.OkHttpResponse - total bytes requested = 8192, buffered = 16088
      2019-03-11 14:56:36,643 DEBUG okhttp.OkHttpResponse - total bytes requested = 16384, buffered = 24280
      2019-03-11 14:56:36,643 DEBUG okhttp.OkHttpResponse - total bytes requested = 24576, buffered = 32472
      2019-03-11 14:56:36,643 DEBUG okhttp.OkHttpResponse - total bytes requested = 32768, buffered = 40664
      2019-03-11 14:56:36,643 DEBUG okhttp.OkHttpResponse - total bytes requested = 40960, buffered = 48856
      2019-03-11 14:56:36,643 DEBUG okhttp.OkHttpResponse - total bytes requested = 49152, buffered = 57048
      2019-03-11 14:56:36,643 DEBUG okhttp.OkHttpResponse - total bytes requested = 57344, buffered = 65240
      2019-03-11 14:56:36,643 DEBUG okhttp.OkHttpResponse - total bytes requested = 57638, buffered = 65240
      2019-03-11 14:56:36,643 DEBUG okhttp.OkHttpResponse - total bytes requested = 57932, buffered = 65240
      2019-03-11 14:56:36,643 DEBUG okhttp.OkHttpResponse - total bytes requested = 58226, buffered = 65240
      2019-03-11 14:56:36,643 DEBUG okhttp.OkHttpResponse - total bytes requested = 58520, buffered = 65240
      2019-03-11 14:56:36,643 DEBUG okhttp.OkHttpResponse - total bytes requested = 58814, buffered = 65240
      2019-03-11 14:56:36,643 DEBUG okhttp.OkHttpResponse - total bytes requested = 59108, buffered = 65240
      2019-03-11 14:56:36,643 DEBUG okhttp.OkHttpResponse - total bytes requested = 59402, buffered = 65240
      2019-03-11 14:56:36,643 DEBUG okhttp.OkHttpResponse - total bytes requested = 59696, buffered = 65240
      2019-03-11 14:56:36,643 DEBUG okhttp.OkHttpResponse - total bytes requested = 59990, buffered = 65240
      2019-03-11 14:56:36,644 DEBUG okhttp.OkHttpResponse - total bytes requested = 60284, buffered = 65240
      2019-03-11 14:56:36,644 DEBUG okhttp.OkHttpResponse - total bytes requested = 60578, buffered = 65240
      2019-03-11 14:56:36,644 DEBUG okhttp.OkHttpResponse - total bytes requested = 60872, buffered = 65240
      2019-03-11 14:56:36,644 DEBUG okhttp.OkHttpResponse - total bytes requested = 61166, buffered = 65240
      2019-03-11 14:56:36,644 DEBUG okhttp.OkHttpResponse - total bytes requested = 61460, buffered = 65240
      2019-03-11 14:56:36,644 DEBUG okhttp.OkHttpResponse - total bytes requested = 61754, buffered = 65240
      2019-03-11 14:56:36,644 DEBUG okhttp.OkHttpResponse - total bytes requested = 62048, buffered = 65240
      2019-03-11 14:56:36,644 DEBUG okhttp.OkHttpResponse - total bytes requested = 62342, buffered = 65240
      2019-03-11 14:56:36,644 DEBUG okhttp.OkHttpResponse - total bytes requested = 62636, buffered = 65240
      2019-03-11 14:56:36,644 DEBUG okhttp.OkHttpResponse - total bytes requested = 62930, buffered = 65240
      2019-03-11 14:56:36,644 DEBUG okhttp.OkHttpResponse - total bytes requested = 63224, buffered = 65240
      2019-03-11 14:56:36,644 DEBUG okhttp.OkHttpResponse - total bytes requested = 63518, buffered = 65240
      2019-03-11 14:56:36,644 DEBUG okhttp.OkHttpResponse - total bytes requested = 63812, buffered = 65240
      2019-03-11 14:56:36,644 DEBUG okhttp.OkHttpResponse - total bytes requested = 64106, buffered = 65240
      2019-03-11 14:56:36,644 DEBUG okhttp.OkHttpResponse - total bytes requested = 64400, buffered = 65240
      2019-03-11 14:56:36,644 DEBUG okhttp.OkHttpResponse - total bytes requested = 64694, buffered = 65240
      2019-03-11 14:56:36,644 DEBUG okhttp.OkHttpResponse - total bytes requested = 64988, buffered = 65240
      2019-03-11 14:56:36,644 DEBUG okhttp.OkHttpResponse - total bytes requested = 65282, buffered = 73432
      2019-03-11 14:56:36,644 DEBUG okhttp.OkHttpResponse - content limit reached
      2019-03-11 14:56:36,644 DEBUG okhttp.OkHttpResponse - copied 65534 bytes out of 73432 buffered, remaining buffer contains 7898 bytes
      2019-03-11 14:56:36,645 DEBUG okhttp.OkHttpResponse - HTTP content truncated to 65534 bytes (reason: LENGTH)
      2019-03-11 14:56:36,661 INFO parse.ParseSegment - http://localhost/large.pdf skipped. Content of size 366578 was truncated to 65534
      2019-03-11 14:56:36,661 WARN parse.ParserChecker - Content is truncated, parse may fail!
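
The effect visible in the log can be reproduced with a small simulation. This is only a sketch under stated assumptions, not the plugin's actual code: `FakeBufferedSource` and `countLoops` are hypothetical names, the source is assumed to buffer in whole 8192-byte chunks, and the step of 294 bytes is taken from the differences between consecutive "total bytes requested" values in the log. It compares incrementing the counter in fixed small steps with first raising the counter to the amount already buffered.

```java
public class RequestCounterSketch {

  /** Hypothetical stand-in for a buffering source that always fetches whole chunks. */
  static final class FakeBufferedSource {
    private final long total;   // bytes the simulated server can deliver
    private final int chunk;    // bytes fetched per underlying read (assumption)
    private long buffered = 0;  // bytes currently held in the buffer

    FakeBufferedSource(long total, int chunk) {
      this.total = total;
      this.chunk = chunk;
    }

    /** Buffers until at least byteCount bytes are held; returns false if the source ends first. */
    boolean request(long byteCount) {
      while (buffered < byteCount && buffered < total) {
        buffered = Math.min(total, buffered + chunk);
      }
      return buffered >= byteCount;
    }

    long bufferedSize() {
      return buffered;
    }
  }

  /** Counts the request() calls made before the buffer covers the content limit. */
  static int countLoops(long total, int chunk, long limit, int step, boolean jumpToBuffered) {
    FakeBufferedSource source = new FakeBufferedSource(total, chunk);
    long requested = 0;
    int loops = 0;
    while (true) {
      if (jumpToBuffered) {
        // Proposed behaviour: never request less than what is already buffered.
        requested = Math.max(requested, source.bufferedSize());
      }
      requested = Math.min(limit, requested + step);
      loops++;
      boolean satisfied = source.request(requested);
      if (!satisfied || source.bufferedSize() >= limit) {
        break; // source exhausted or content limit reached
      }
    }
    return loops;
  }

  public static void main(String[] args) {
    long total = 366578; // content size reported in the log
    int chunk = 8192;    // assumed buffering granularity
    long limit = 65534;  // truncation size reported in the log
    int step = 294;      // small increment observed in the log
    System.out.println("fixed small steps: " + countLoops(total, chunk, limit, step, false) + " loops");
    System.out.println("jump to buffered:  " + countLoops(total, chunk, limit, step, true) + " loops");
  }
}
```

In this simulation most iterations of the fixed-step variant return immediately because the requested bytes are already buffered, which is the needless looping described above. Raising the counter to the buffered size first makes every iteration fetch new data.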
      

            People

              Assignee: Sebastian Nagel (snagel)
              Reporter: Sebastian Nagel (snagel)