Description
The -list command of SegmentReader fails to read data from segments:
% bin/nutch readseg -list crawl/segments/20180409100315/
Exception in thread "main" java.io.IOException: wrong value class: is not class org.apache.nutch.crawl.CrawlDatum
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2379)
    at org.apache.nutch.segment.SegmentReader.getStats(SegmentReader.java:524)
    at org.apache.nutch.segment.SegmentReader.list(SegmentReader.java:482)
    at org.apache.nutch.segment.SegmentReader.run(SegmentReader.java:670)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.nutch.segment.SegmentReader.main(SegmentReader.java:736)
Issue Links
- is caused by NUTCH-2375 Upgrade the code base from org.apache.hadoop.mapred to org.apache.hadoop.mapreduce (Closed)