Nutch

Takes a flat file of URLs and adds them to the crawldb as pages to be crawled.
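
The description above can be illustrated with a minimal sketch. It is not Nutch's implementation (the real injector runs as a MapReduce job against the crawldb); the function name `inject`, the in-memory dictionary standing in for the crawldb, and the `"unfetched"` status label are all assumptions made for illustration. The seed format assumed here is one URL per line, with optional tab-separated `key=value` metadata and `#` comment lines.

```python
from typing import Dict, Iterable


def inject(seed_lines: Iterable[str], crawldb: Dict[str, dict]) -> int:
    """Add each URL from a flat seed list to an in-memory stand-in for the crawldb.

    Hypothetical helper for illustration only; returns the number of new URLs.
    """
    added = 0
    for raw in seed_lines:
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        # The first tab-separated field is the URL; the rest are key=value metadata.
        url, *fields = line.split("\t")
        url = url.strip()
        if not url:
            continue
        metadata = dict(f.split("=", 1) for f in fields if "=" in f)
        if url not in crawldb:
            # Existing entries are left untouched; new ones are marked as not yet fetched.
            crawldb[url] = {"status": "unfetched", "metadata": metadata}
            added += 1
    return added


seeds = [
    "# seed list",
    "http://nutch.apache.org/",
    "",
    "http://example.com/\tnutch.score=2.0",
    "http://nutch.apache.org/",  # duplicate, ignored
]
crawldb: Dict[str, dict] = {}
print(inject(seeds, crawldb))  # prints 2
```

Keeping already-known URLs untouched is a design choice of this sketch, so that repeated injections do not reset existing entries.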

Issues: Unresolved

Type         Key         Summary (Due Date: none listed)
Bug          NUTCH-1472  InvalidRequestException(why:(String didn't validate.) [webpage][f][ts] failed validation)
Improvement  NUTCH-1712  Use MultipleInputs in Injector to make it a single mapreduce job
Bug          NUTCH-1746  OutOfMemoryError in Mappers


Issues: Updated recently

Type  Key         Summary (Updated: none listed)
Bug   NUTCH-2170  When i am crawling the URL it is crawling url like this com.aossama.www.http/
Bug   NUTCH-2114  kkk
Bug   NUTCH-2080  Eclipse compilation issue


Versions: Unreleased

Status      Name   Release date
Unreleased  2.4
Unreleased  1.11
Unreleased  2.3.1
Unreleased  1.12
Unreleased  2.4.1