1. Nutch




Generates a subset of a crawldb to fetch

Issues: Unresolved

Type Key Summary Due Date
Bug NUTCH-578 URL fetched with 403 is generated over and over again
Bug NUTCH-800 Generator builds a URL list that is not encoded
Improvement NUTCH-1269 Generate main problems

Issues: Updated recently

Type Key Summary Updated
Bug NUTCH-1746 OutOfMemoryError in Mappers
New Feature NUTCH-1741 Support of Sitemaps in Nutch 2.x
Bug NUTCH-1738 Expose number of URLs generated per batch in GeneratorJob

Versions: Unreleased

Status Name Release date
Unreleased 2.3
Unreleased 2.4
Unreleased 1.9