  1. Nutch
  2. NUTCH-1352

Improve regex urlfilters/normalizers synchronization

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: nutchgora, 1.6
    • Component/s: None
    • Labels:
      None
    • Patch Info:
      Patch Available

      Description

      I noticed that during fetching, the fetcher threads spend much of their time blocked on a monitor because of outlink normalizing and filtering. The cause: some of the regex plugins synchronize on a single lock.

      This patch improves throughput by removing the synchronization locks and replacing them with thread-locals where needed.
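
      To illustrate the general idea (this is a minimal sketch, not code from the attached patch), the example below contrasts a regex helper guarded by one shared lock with a variant that keeps a per-thread Matcher. The class name, the pattern, and both method names are hypothetical, and ThreadLocal.withInitial assumes Java 8 or later.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; names and pattern are illustrative, not from Nutch.
final class RegexNormalizerSketch {

    // A compiled Pattern is immutable and safe to share between threads.
    // This one matches runs of two or more slashes not preceded by ':'.
    private static final Pattern EXTRA_SLASHES = Pattern.compile("(?<!:)/{2,}");

    // "Before": a single mutable Matcher shared by every thread,
    // so each call has to hold the same lock.
    private static final Matcher SHARED_MATCHER = EXTRA_SLASHES.matcher("");

    static synchronized String normalizeLocked(String url) {
        return SHARED_MATCHER.reset(url).replaceAll("/");
    }

    // "After": a Matcher is not thread-safe, so each thread lazily
    // creates its own instance and no lock is needed.
    private static final ThreadLocal<Matcher> LOCAL_MATCHER =
        ThreadLocal.withInitial(() -> EXTRA_SLASHES.matcher(""));

    static String normalize(String url) {
        return LOCAL_MATCHER.get().reset(url).replaceAll("/");
    }
}
```

      Both methods return the same result for the same input; the difference is that calls to normalize from different threads never wait on each other, at the cost of one Matcher instance per thread.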

      It has been extensively tested in production. I will commit this later today if there are no objections.

        Attachments

        1. NUTCH-1352-1.6-1.patch
          15 kB
          Markus Jelsma
        2. NUTCH-1352.patch
          15 kB
          Ferdy

            People

            • Assignee: Unassigned
            • Reporter: Ferdy (ferdy.g)
            • Votes: 0
            • Watchers: 3
