Nutch

injector

Description

Takes a flat file of URLs and adds them to the crawldb as pages to be crawled.
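The injection step described above can be sketched in a few lines. This is only an illustration, not Nutch's actual implementation: the crawldb is modelled as a plain in-memory dict, the function names `parse_seed_line` and `inject` are hypothetical, and the seed format (one URL per line, optional tab-separated key=value metadata, `#` for comments) is an assumption made for the example.

```python
def parse_seed_line(line):
    """Split one seed line into (url, metadata); return None for blanks and comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    # Assumed format: URL, then optional tab-separated key=value fields.
    url, *fields = line.split("\t")
    metadata = dict(f.split("=", 1) for f in fields if "=" in f)
    return url.strip(), metadata


def inject(seed_lines, crawldb):
    """Add each new URL to crawldb (a plain dict standing in for the real store).

    Existing entries are left untouched. Returns the number of URLs added.
    """
    added = 0
    for line in seed_lines:
        parsed = parse_seed_line(line)
        if parsed is None:
            continue
        url, metadata = parsed
        if url not in crawldb:
            # "unfetched" is a hypothetical status label marking the page as still to be crawled.
            crawldb[url] = {"status": "unfetched", "metadata": metadata}
            added += 1
    return added
```

Calling `inject(open("seeds.txt"), crawldb)` with a hypothetical `seeds.txt` would populate the dict. The real injector runs as a MapReduce job over the seed file, as the issue titles on this page indicate.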

Issues: Unresolved

Type         Key         Summary                                                                                          Due Date
Bug          NUTCH-1472  InvalidRequestException(why:(String didn't validate.) [webpage][f][ts] failed validation)
Improvement  NUTCH-1712  Use MultipleInputs in Injector to make it a single mapreduce job
Bug          NUTCH-1746  OutOfMemoryError in Mappers

Issues: Updated recently

Type         Key         Summary                                                                                                        Updated
Improvement  NUTCH-1712  Use MultipleInputs in Injector to make it a single mapreduce job
Bug          NUTCH-2170  When i am crawling the URL http://www.aossama.com/. it is crawling url like this com.aossama.www.http/
Bug          NUTCH-2114  kkk

Versions: Unreleased

Status      Name   Release date
Unreleased  2.4
Unreleased  1.12
Unreleased  2.4.1
Unreleased  1.13