  1. Nutch
  2. NUTCH-1352

Improve regex urlfilters/normalizers synchronization

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: nutchgora, 1.6
    • Component/s: None
    • Labels: None
    • Patch Info: Patch Available

    Description

      I noticed that during fetching, the fetcher threads spend much of their time blocked on a monitor because of outlink normalizing/filtering. The cause: some of the regex plugins synchronize on a single lock.

      This patch improves throughput by removing the synchronization locks and replacing them with thread-locals where needed.

      It has been extensively tested in production. I will commit it later today if there are no objections.
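
      To illustrate the idea (this is not the code from the attached patch), here is a minimal sketch of swapping a single-lock synchronized method for a thread-local. The class name, the pattern, and the normalize method are hypothetical and chosen only for this example; it assumes Java 8+ and java.util.regex, where a compiled Pattern can be shared between threads but a Matcher cannot.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class, not part of Nutch.
public class ThreadLocalNormalizerSketch {

  // A compiled Pattern is immutable and safe to share between threads.
  private static final Pattern SESSION_ID =
      Pattern.compile(";jsessionid=[0-9A-Za-z]+");

  // A Matcher is stateful and not thread-safe, so each thread keeps its own
  // instance instead of all threads queueing on one lock.
  private static final ThreadLocal<Matcher> MATCHER =
      ThreadLocal.withInitial(() -> SESSION_ID.matcher(""));

  // Before: "public static synchronized String normalize(String url)" with a
  // single shared Matcher, which serializes all fetcher threads.
  public static String normalize(String url) {
    Matcher m = MATCHER.get(); // per-thread instance, no monitor involved
    m.reset(url);              // point the reused matcher at the new input
    return m.replaceAll("");   // strip the session id
  }

  public static void main(String[] args) throws InterruptedException {
    Thread[] workers = new Thread[4];
    for (int i = 0; i < workers.length; i++) {
      final int id = i;
      workers[i] = new Thread(() -> {
        String url = "http://example.com/page" + id + ";jsessionid=ABC" + id;
        System.out.println(normalize(url));
      });
      workers[i].start();
    }
    for (Thread t : workers) {
      t.join();
    }
  }
}
```

      The trade-off is one extra Matcher per thread in exchange for no lock contention, which is usually cheap compared to threads waiting on a monitor.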

      Attachments

        1. NUTCH-1352-1.6-1.patch
          15 kB
          Markus Jelsma
        2. NUTCH-1352.patch
          15 kB
          Ferdy

          People

            Assignee: Unassigned
            Reporter: Ferdy (ferdy.g)
            Votes: 0
            Watchers: 3