1. Nutch



Issues: Unresolved

Type Key Summary Due Date
Bug NUTCH-874 Make sure all plugins in src/plugin are compatible with Nutch 2.0 and Gora
Task NUTCH-840 Port tests from parse-html to parse-tika
Wish NUTCH-887 Delegate parsing of feeds to Tika

Issues: Updated recently

Type Key Summary Updated
Improvement NUTCH-2056 Move the Mahout and Lucene dependencies to the plugin from the main ivy.xml for the Naive Bayes Parse Filter (NUTCH-2038)
Improvement NUTCH-2069 Ignore external links based on domain
Bug NUTCH-2071 A parser failure on a single document may fail crawling job

Versions: Unreleased

Status Name Release date
Unreleased 2.4
Unreleased 1.11
Unreleased 2.3.1