- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 1.0.0
- Fix Version/s: 1.0.0
- Component/s: None
- Labels: None
- Patch Info: Patch Available
This is a patch so that Nutch can be used with Hadoop 0.17.0. The patch is located at http://pastie.org/212001
The patch compiles and passes all current Nutch unit tests.
I have tested that the crawler side of Nutch (i.e. inject, generate, fetch, parse, and merge with the crawldb) definitely works, but I have not tested the Lucene indexing part. It might work, but it might not.
NOTE: the two main bugs that had to be overcome were not caught by any of the unit tests. They only came up during actual testing. The bugs were:
1. Changes to the Hadoop Iterator
2. Addition of Serialization to MapReduce Framework
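
The description does not say what the iterator change was. One plausible reading, and it is only an assumption here, is that Hadoop 0.17.0 reuses the same key/value object on every call to the reduce-side iterator, so code that keeps references to iterated values ends up with copies of the last value. The sketch below illustrates that pitfall in plain Java with no Hadoop dependency. `Value` and `ReusingIterator` are hypothetical stand-ins, not Hadoop or Nutch classes, and the copy-based fix is not necessarily what the patch does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class IteratorReuseDemo {

    /** Hypothetical mutable value type (stand-in for a Writable-like object). */
    static final class Value {
        String text;
        Value(String text) { this.text = text; }
        Value copy() { return new Value(text); }
    }

    /** Hypothetical iterator that returns the SAME Value instance on every call. */
    static final class ReusingIterator implements Iterator<Value> {
        private final Iterator<String> source;
        private final Value shared = new Value(null);

        ReusingIterator(List<String> data) { this.source = data.iterator(); }

        @Override public boolean hasNext() { return source.hasNext(); }

        @Override public Value next() {
            shared.text = source.next(); // overwrite the shared instance in place
            return shared;
        }

        @Override public void remove() { throw new UnsupportedOperationException(); }
    }

    /** Buggy pattern: stores references, so every element aliases one object. */
    public static List<String> collectByReference(List<String> data) {
        List<Value> kept = new ArrayList<>();
        Iterator<Value> it = new ReusingIterator(data);
        while (it.hasNext()) {
            kept.add(it.next());
        }
        return texts(kept);
    }

    /** Safe pattern: copies each value before the iterator overwrites it. */
    public static List<String> collectByCopy(List<String> data) {
        List<Value> kept = new ArrayList<>();
        Iterator<Value> it = new ReusingIterator(data);
        while (it.hasNext()) {
            kept.add(it.next().copy());
        }
        return texts(kept);
    }

    private static List<String> texts(List<Value> values) {
        List<String> out = new ArrayList<>();
        for (Value v : values) {
            out.add(v.text);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "c");
        System.out.println(collectByReference(data)); // [c, c, c]
        System.out.println(collectByCopy(data));      // [a, b, c]
    }
}
```

A bug of this kind would be easy to miss in unit tests that only look at one value per key, which fits the note above that the problems only appeared during actual testing.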