Nutch / NUTCH-468

Scoring filter should distribute score to all outlinks at once


    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.0.0
    • Component/s: None
    • Labels:
      None

      Description

      Currently, ScoringFilter.distributeScoreToOutlink, as its name implies, takes only a single outlink and works on that. I suggest changing it to distributeScoreToOutlinks so that it takes all the outlinks of a page at once. This has several advantages:

      1) A ScoringFilter plugin returns a single adjust datum to set the original page's score instead of returning several.
      2) A ScoringFilter plugin can change the score of the original page (via adjust datum) even if there are no outlinks. This is useful if you have a ScoringFilter plugin that, say, scores pages based on content instead of outlinks.
      3) Since the ScoringFilter plugin receives all outlinks at once, it can make better decisions on how to distribute the score. For example, when the internal / external score factors are not 1, it is currently not possible to write a plugin that always distributes exactly a page's 'cash' to its outlinks (that is, if a page has score 5, exactly 5 points go to its outlinks regardless of the internal / external factors).
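
      The third point can be illustrated with a minimal sketch. It does not use the real Nutch classes or the signature from the attached patches: the Outlink type, the method parameters, and the factor values are all hypothetical stand-ins chosen for illustration. The idea is that, once all outlinks are visible in one call, the internal / external factors can be treated as relative weights and normalized so that the shares add up to the page's score.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: simplified stand-ins, not the Nutch ScoringFilter API.
class ExactCashDistribution {

    // Hypothetical outlink: a target URL plus whether it stays on the same host.
    static final class Outlink {
        final String url;
        final boolean internal;

        Outlink(String url, boolean internal) {
            this.url = url;
            this.internal = internal;
        }
    }

    // Splits pageScore across all outlinks at once. The factors act as
    // relative weights, so the shares sum to pageScore (up to floating-point
    // rounding) whenever at least one outlink has a positive weight.
    static Map<String, Double> distributeScoreToOutlinks(
            double pageScore, List<Outlink> outlinks,
            double internalFactor, double externalFactor) {
        Map<String, Double> shares = new LinkedHashMap<>();

        // First pass: total weight of all outlinks. This is the step that
        // needs every outlink in a single call.
        double totalWeight = 0.0;
        for (Outlink link : outlinks) {
            totalWeight += link.internal ? internalFactor : externalFactor;
        }
        if (totalWeight <= 0.0) {
            return shares; // no outlinks (or zero weights): nothing to distribute
        }

        // Second pass: each outlink gets its normalized share.
        for (Outlink link : outlinks) {
            double weight = link.internal ? internalFactor : externalFactor;
            shares.merge(link.url, pageScore * weight / totalWeight, Double::sum);
        }
        return shares;
    }

    public static void main(String[] args) {
        List<Outlink> outlinks = List.of(
                new Outlink("http://example.com/a", true),
                new Outlink("http://example.com/b", true),
                new Outlink("http://example.org/c", false));

        // Assumed factor values, picked only to show that they differ from 1.
        Map<String, Double> shares =
                distributeScoreToOutlinks(5.0, outlinks, 0.5, 2.0);

        double total = 0.0;
        for (double share : shares.values()) {
            total += share;
        }
        System.out.println(shares);
        System.out.println("total distributed: " + total);
    }
}
```

      With a per-outlink method, the first pass over all weights has nowhere to live, which is why the normalization above is not expressible through distributeScoreToOutlink alone.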

        Attachments

        1. scoring-v2.patch
          10 kB
          Dogacan Guney
        2. scoring.patch
          9 kB
          Dogacan Guney

            People

            • Assignee:
              dogacan Dogacan Guney
              Reporter:
              dogacan Dogacan Guney
