Description
Configuration properties db.score.link.external and db.score.link.internal are ignored.
For message board sites, or sites that repeat a large navigation menu on every page, giving internal links a lower weight in scoring makes a lot of sense.
This is also a serious problem for web spam: spammers can now set up a single domain with dynamically generated pages and thereby heavily manipulate the Nutch scores.
I therefore also suggest giving db.score.link.internal a default value of something like 0.25.
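To make the intended behaviour concrete, here is a minimal sketch of how the two factors could be applied when a page passes its score on to its outlinks. It is not Nutch's actual scoring code: the class LinkScoreSketch, the methods isInternal and distribute, the same-host test for "internal", and the external factor of 1.0 are all assumptions made for illustration. Only the 0.25 internal factor comes from the suggestion above.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; hypothetical names, not Nutch's real implementation. */
public class LinkScoreSketch {

    // Stand-ins for db.score.link.internal and db.score.link.external.
    static final float INTERNAL_FACTOR = 0.25f; // default value suggested in this issue
    static final float EXTERNAL_FACTOR = 1.0f;  // assumed value for illustration

    /** Treats a link as internal when both URLs share the same host (an assumption). */
    static boolean isInternal(String fromUrl, String toUrl) {
        String fromHost = URI.create(fromUrl).getHost();
        String toHost = URI.create(toUrl).getHost();
        return fromHost != null && fromHost.equalsIgnoreCase(toHost);
    }

    /** Splits the page score evenly over the outlinks, then scales each share by link type. */
    static Map<String, Float> distribute(String fromUrl, float pageScore, List<String> outlinks) {
        Map<String, Float> contributions = new LinkedHashMap<>();
        if (outlinks.isEmpty()) {
            return contributions;
        }
        float share = pageScore / outlinks.size();
        for (String target : outlinks) {
            float factor = isInternal(fromUrl, target) ? INTERNAL_FACTOR : EXTERNAL_FACTOR;
            contributions.put(target, share * factor);
        }
        return contributions;
    }
}
```

With these assumed values, a page on one host that links to both a page on the same host and a page on another host would pass only a quarter as much score along the internal link, which is the damping effect requested here for navigation-heavy or spam-generated sites.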
Attachments
Issue Links
- is duplicated by NUTCH-276 db.score.link.internal problem (Closed)