Description
Running readdb -stats on a crawldb containing a single record fails with an EOFException on Hadoop 0.20.203.0:
Exception in thread "main" java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:180)
at java.io.DataInputStream.readFully(DataInputStream.java:152)
at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1450)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1428)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
at org.apache.hadoop.mapred.SequenceFileOutputFormat.getReaders(SequenceFileOutputFormat.java:93)
at org.apache.nutch.crawl.CrawlDbReader.processStatJob(CrawlDbReader.java:320)
at org.apache.nutch.crawl.CrawlDbReader.main(CrawlDbReader.java:502)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
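The trace shows `SequenceFileOutputFormat.getReaders` failing while a `SequenceFile.Reader` reads a file header. One plausible cause, suggested by the linked NUTCH-1110 but not confirmed in this report, is a zero-length `_SUCCESS` marker file in the job output directory being opened as if it were a data file. The self-contained sketch below reproduces that failure mode with plain `java.io` and shows how skipping names that start with `_` or `.` avoids it. `MarkerFileSketch`, `readHeader`, and `dataFiles` are hypothetical names used only for illustration. No Hadoop or Nutch API is called, and the 4-byte header is an assumption.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class MarkerFileSketch {

    // Hypothetical stand-in for a reader that expects a fixed-size header.
    // readFully throws EOFException when the file is shorter than `length`.
    static byte[] readHeader(Path file, int length) throws IOException {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            byte[] header = new byte[length];
            in.readFully(header);
            return header;
        }
    }

    // Assumption: marker and hidden files are recognizable by a leading '_' or '.'.
    static boolean isHiddenOrMarker(Path file) {
        String name = file.getFileName().toString();
        return name.startsWith("_") || name.startsWith(".");
    }

    // Lists only the regular files that should be treated as data, in sorted order.
    static List<Path> dataFiles(Path dir) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p) && !isHiddenOrMarker(p)) {
                    result.add(p);
                }
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Simulated job output directory: one data file plus an empty marker file.
        Path dir = Files.createTempDirectory("job-output-sketch");
        Files.write(dir.resolve("part-00000"), new byte[] {'S', 'E', 'Q', 6});
        Path marker = Files.createFile(dir.resolve("_SUCCESS"));

        // Reading the empty marker as data fails, as in the trace above.
        try {
            readHeader(marker, 4);
        } catch (EOFException e) {
            System.out.println("EOFException while reading " + marker.getFileName());
        }

        // Filtering first leaves only the real data file.
        for (Path p : dataFiles(dir)) {
            System.out.println("header read from " + p.getFileName()
                    + " (" + readHeader(p, 4).length + " bytes)");
        }
    }
}
```

If the marker file is indeed the cause, the same filtering idea would apply wherever the readers are opened. Not writing the marker file at all, as the title of NUTCH-1110 suggests, is the other option.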
Attachments
Issue Links
- is related to
  - NUTCH-1069 Readlinkdb broken on Hadoop > 0.20 (Closed)
- relates to
  - NUTCH-1110 Updatedb must not write _SUCCESS file (Closed)