Nutch / NUTCH-1297

FetchItemQueues should select items from larger queues first

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 1.4
    • Fix Version/s: 1.9
    • Component/s: fetcher
    • Labels:
    • Patch Info:
      Patch Available

      Description

      When a fetch involves multiple hosts whose queues differ in size, large hosts can wait a long time before getFetchItem() in the FetchItemQueues class selects a URL from them, so they should be given higher priority.
      For example, suppose there are 10 URLs from host1 and 1000 URLs from host2, with 5 fetcher threads. If all threads first select from host1, the overall fetch takes longer than if the threads first select from host2 and fall back to host1 only while host2 is busy.
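
      The selection order described above could be sketched roughly as follows. This is a simplified, hypothetical model and not the attached patch or Nutch's actual FetchItemQueues code: the class LargestQueueFirst, the HostQueue type, the busy flag, and the methods add(), next(), and finished() are all illustrative names, and the model assumes at most one in-flight fetch per host.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: hand out URLs from the largest non-busy host queue first. */
final class LargestQueueFirst {

    /** Illustrative stand-in for a per-host queue (not Nutch's FetchItemQueue). */
    static final class HostQueue {
        final Deque<String> pending = new ArrayDeque<>();
        // Assumption: true while a fetch to this host is in flight.
        boolean busy = false;
    }

    private final Map<String, HostQueue> queues = new LinkedHashMap<>();

    /** Appends a URL to the queue of the given host, creating the queue if needed. */
    public synchronized void add(String host, String url) {
        queues.computeIfAbsent(host, h -> new HostQueue()).pending.add(url);
    }

    /**
     * Returns the next URL from the non-busy queue with the most pending URLs
     * and marks that host busy. Returns null if every queue is busy or empty.
     */
    public synchronized String next() {
        HostQueue best = null;
        for (HostQueue q : queues.values()) {
            if (q.busy || q.pending.isEmpty()) {
                continue; // skip hosts that cannot be fetched from right now
            }
            if (best == null || q.pending.size() > best.pending.size()) {
                best = q;
            }
        }
        if (best == null) {
            return null;
        }
        best.busy = true;
        return best.pending.poll();
    }

    /** Marks the host as available again once its fetch has completed. */
    public synchronized void finished(String host) {
        HostQueue q = queues.get(host);
        if (q != null) {
            q.busy = false;
        }
    }
}
```

      With 10 URLs queued for host1 and 1000 for host2, the first call to next() returns a host2 URL, and a second call made while host2 is still busy falls back to host1. The linear scan over all queues is chosen for brevity; a real fetcher with many hosts would probably want a cheaper way to find the largest ready queue.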

      1. NUTCH-1297.patch
        3 kB
        behnam nikbakht

            People

            • Assignee: Unassigned
            • Reporter: behnam nikbakht
            • Votes: 0
            • Watchers: 3
