Nutch / NUTCH-207

Bandwidth target for fetcher rather than a thread count

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8
    • Fix Version/s: 1.9
    • Component/s: fetcher
    • Labels: None
    • Patch Info: Patch Available

    Description

      The patch increases or decreases the number of fetcher threads, starting from the initial value (fetcher.threads.fetch) and capped at a maximum (fetcher.threads.maximum), to achieve a target bandwidth (fetcher.threads.bandwidth).

      It appears to stay within 10% of the target bandwidth, even when many fetch errors occur or when several large pages are encountered.

      For more accurate tracking, Nutch should account for protocol overhead in addition to the volume of page data downloaded.
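
      The adjustment described above can be illustrated with a minimal sketch. It is not the code from the attached patches: the class name BandwidthThreadTuner, the method nextThreadCount, the KB/s unit, the floor of one thread, and the proportional scaling rule are all assumptions made for illustration. The idea is to estimate the throughput contributed by each thread over the last interval, then pick the thread count that would reach the target, clamped to the configured maximum.

```java
// Hypothetical helper, not part of Nutch or the attached patches.
final class BandwidthThreadTuner {
    private BandwidthThreadTuner() {}

    /**
     * Suggests a thread count for the next measurement interval.
     *
     * @param currentThreads threads active during the last interval (>= 1)
     * @param maxThreads     upper bound, e.g. the value of fetcher.threads.maximum
     * @param measuredKbps   observed bandwidth over the last interval (assumed KB/s)
     * @param targetKbps     desired bandwidth, e.g. fetcher.threads.bandwidth (assumed KB/s)
     */
    static int nextThreadCount(int currentThreads, int maxThreads,
                               double measuredKbps, double targetKbps) {
        if (currentThreads < 1 || maxThreads < 1 || targetKbps <= 0) {
            throw new IllegalArgumentException("invalid tuning parameters");
        }
        if (measuredKbps <= 0) {
            // Nothing was downloaded (e.g. all errors): probe with one more thread.
            return Math.min(maxThreads, currentThreads + 1);
        }
        // Average throughput each thread contributed in the last interval.
        double perThreadKbps = measuredKbps / currentThreads;
        // Threads needed to hit the target if per-thread throughput stays the same.
        int wanted = (int) Math.round(targetKbps / perThreadKbps);
        // Clamp to [1, maxThreads]; the floor of 1 is an assumption of this sketch.
        return Math.max(1, Math.min(maxThreads, wanted));
    }
}
```

      For example, with 10 threads delivering 500 KB/s against a 1000 KB/s target and a maximum of 50 threads, the sketch suggests 20 threads; with a maximum of 15 it suggests 15. A real fetcher would likely smooth the measurement over several intervals to avoid oscillating when a few large pages skew one sample.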

      Attachments

        1. NUTCH-207.trunk.patch
          7 kB
          Julien Nioche
        2. NUTCH-207.trunk.v2.patch
          7 kB
          Julien Nioche
        3. ratelimit.patch
          8 kB
          Rod Taylor

          People

            Assignee: Julien Nioche (jnioche)
            Reporter: Rod Taylor (rbt)
            Votes: 0
            Watchers: 6
