Nutch / NUTCH-468

Scoring filter should distribute score to all outlinks at once

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.0.0
    • Component/s: None
    • Labels: None

    Description

      Currently, ScoringFilter.distributeScoreToOutlink, as its name implies, takes a single outlink and works on that alone. I suggest changing it to distributeScoreToOutlinks, so that it takes all the outlinks of a page at once. This has several advantages:

      1) A ScoringFilter plugin returns a single adjust datum to set the page's score, instead of returning several.
      2) A ScoringFilter plugin can change the score of the original page (via the adjust datum) even if there are no outlinks. This is useful for a ScoringFilter plugin that, say, scores pages based on content instead of outlinks.
      3) Since the ScoringFilter plugin receives all outlinks at once, it can make better decisions about how to distribute the score. For example, when the internal/external score factors are not 1, it is currently not possible to write a plugin that always distributes exactly a page's 'cash' to its outlinks (that is, if a page has score 5, it always distributes exactly 5 points to its outlinks, whatever the internal/external factors are).
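
      To illustrate point 3, here is a minimal, self-contained sketch of how a plugin that sees every outlink at once could hand out exactly the page's cash. It is not Nutch's ScoringFilter API: the class OutlinkScoreSketch, the Outlink type, and the distribute method are hypothetical names made up for this example, and the internal/external factors are treated as plain relative weights.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OutlinkScoreSketch {

    /** Hypothetical outlink: a target URL and whether it stays on the same host. */
    public static final class Outlink {
        final String url;
        final boolean internal;

        public Outlink(String url, boolean internal) {
            this.url = url;
            this.internal = internal;
        }
    }

    /**
     * Splits exactly `cash` among all outlinks (assumption: the factors act as
     * relative weights). Because the total weight is known up front, the shares
     * always add up to `cash`, whatever the factor values are.
     */
    public static Map<String, Double> distribute(double cash, List<Outlink> outlinks,
                                                 double internalFactor, double externalFactor) {
        Map<String, Double> shares = new LinkedHashMap<>();

        // First pass: total weight over every outlink of the page.
        double totalWeight = 0.0;
        for (Outlink o : outlinks) {
            totalWeight += o.internal ? internalFactor : externalFactor;
        }
        if (totalWeight == 0.0) {
            return shares; // no outlinks (or all weights zero): nothing is handed out
        }

        // Second pass: each outlink gets its weighted fraction of the cash.
        for (Outlink o : outlinks) {
            double weight = o.internal ? internalFactor : externalFactor;
            // Repeated URLs are merged by summing their shares.
            shares.merge(o.url, cash * weight / totalWeight, Double::sum);
        }
        return shares;
    }

    public static void main(String[] args) {
        List<Outlink> links = List.of(
                new Outlink("http://a.example/page", true),
                new Outlink("http://b.example/", false),
                new Outlink("http://c.example/", false));
        Map<String, Double> shares = distribute(5.0, links, 0.5, 2.0);
        shares.forEach((url, share) -> System.out.println(url + " -> " + share));
    }
}
```

      A per-outlink method cannot do this normalization, because it never learns how many other outlinks the page has or what their weights are.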

      Attachments

        1. scoring-v2.patch
          10 kB
          Dogacan Guney
        2. scoring.patch
          9 kB
          Dogacan Guney

        Activity

          People

            Assignee: Dogacan Guney (dogacan)
            Reporter: Dogacan Guney (dogacan)
            Votes: 1
            Watchers: 1

            Dates

              Created:
              Updated:
              Resolved: