Nutch / NUTCH-1057

Make fetcher thread time out configurable

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4
    • Component/s: fetcher
    • Labels:
      None
    • Patch Info:
      Patch Available

      Description

      The fetcher sets its thread timeout to half the mapred.task.timeout value. That is not appropriate in all cases. Add an option (fetcher.thread.timeout.divisor) to configure the divisor, defaulting to two.
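
The calculation described above can be sketched as follows. This is a minimal illustration, not the committed patch: the class name `FetcherTimeoutSketch`, the method `threadTimeoutMs`, and the use of `java.util.Properties` in place of a Hadoop `Configuration` are assumptions made to keep the example self-contained. Only the two property names, the default divisor of two, and the 600s default task timeout come from this issue.

```java
import java.util.Properties;

// Hypothetical stand-in for the fetcher's timeout logic; not Nutch source code.
final class FetcherTimeoutSketch {
    // Default task timeout of 600 s mentioned in this issue, in milliseconds.
    static final long DEFAULT_TASK_TIMEOUT_MS = 600_000L;
    // Default divisor proposed in this issue.
    static final int DEFAULT_DIVISOR = 2;

    // Returns the fetcher thread timeout: task timeout divided by the divisor.
    static long threadTimeoutMs(Properties conf) {
        long taskTimeoutMs = Long.parseLong(
            conf.getProperty("mapred.task.timeout",
                             String.valueOf(DEFAULT_TASK_TIMEOUT_MS)));
        int divisor = Integer.parseInt(
            conf.getProperty("fetcher.thread.timeout.divisor",
                             String.valueOf(DEFAULT_DIVISOR)));
        if (divisor <= 0) {
            // Guard added for this sketch; a zero divisor would otherwise throw.
            throw new IllegalArgumentException("divisor must be positive");
        }
        return taskTimeoutMs / divisor;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("mapred.task.timeout", "1200000"); // 20 minutes

        // Default divisor of 2: hanging threads are killed after 10 minutes.
        System.out.println(threadTimeoutMs(conf)); // prints 600000

        // Divisor of 4: hanging threads are killed after 5 minutes.
        conf.setProperty("fetcher.thread.timeout.divisor", "4");
        System.out.println(threadTimeoutMs(conf)); // prints 300000
    }
}
```

With a larger divisor, a raised task timeout no longer forces the fetcher to wait as long before giving up on hanging threads.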

        Activity

        Markus Jelsma added a comment -

        Patch for 1.4. There's also a diff for NUTCH-1037 in the config file which hasn't been committed yet.

        Markus Jelsma added a comment -

        Any comments or objections? Any better methods?

        Julien Nioche added a comment -

        Apart from the part related to NUTCH-1037, which will need removing unless it is committed, your patch looks OK.
        Just to make sure I understand what the issue is: a Hadoop fetch task can fail because the timeout for fetch threads is too long. Is that right?

        Markus Jelsma added a comment -

        No. This is a tuning option for users who experience very long pauses in the merge phase after a map finishes. The merge takes a long time because there are many GBs of map output and/or slow IO.

        To prevent the task tracker from killing the merge (default timeout of 600s), users need to raise the mapred.task.timeout value above the actual duration of the merge phase.

        Fetcher threads have a timeout that is set to half the task tracker timeout value. This means that with a high (e.g. 20m) task timeout, the fetcher will wait 10m before killing hanging threads. This is a waste of time. In a large crawl there are always a few threads unable to finish properly. Killing them sooner makes the merge begin earlier.

        Sorry if I was unclear before.

        Markus Jelsma added a comment -

        Committed for 1.4 in rev. 1148406.

        Markus Jelsma added a comment -

        I'd like to commit this issue this Friday unless there are objections or other comments.

        Julien Nioche added a comment -

        Haven't you committed it already? Or do you mean for trunk?

        Markus Jelsma added a comment -

        Sorry, wrong issue!

        Markus Jelsma added a comment -

        Resolved for 1.4, see NUTCH-1104 for 2.0

        Markus Jelsma added a comment -

        Bulk close of resolved issues of 1.4. bulkclose-1.4-20111220


          People

          • Assignee:
            Markus Jelsma
            Reporter:
            Markus Jelsma
          • Votes:
            0
            Watchers:
            0
