All Projects : Nutch : 0.9.0 (Fix For Version)

Release Date: 02/Apr/07
Description: Nutch 0.9 release

Summary

Progress: 74 of 74 issues have been resolved

Components

(with all issues in each component for this version)
   Type | Key | Resolution | Summary | Priority | Status
   Bug | NUTCH-354 | FIXED | MapWritable, nextEntry is not reset when Entries are recycled | Blocker | Closed
   Task | NUTCH-400 | FIXED | Update & add missing license headers | Blocker | Closed
   Bug | NUTCH-273 | FIXED | When a page is redirected, the original url is NOT updated. | Blocker | Closed
   Bug | NUTCH-332 | FIXED | doubling score causes by page internal anchors. | Blocker | Closed
   Bug | NUTCH-233 | FIXED | wrong regular expression hang reduce process for ever | Blocker | Closed
   Bug | NUTCH-336 | FIXED | Harvested links shouldn't get db.score.injected in addition to inbound contributions | Critical | Closed
   Bug | NUTCH-341 | FIXED | IndexMerger now deletes entire <workingdir> after completing | Critical | Closed
   Bug | NUTCH-105 | FIXED | Network error during robots.txt fetch causes file to be ignored | Critical | Closed
   Improvement | NUTCH-167 | FIXED | Observation of <META NAME="ROBOTS" CONTENT="NOARCHIVE"> directive | Critical | Closed
   Bug | NUTCH-361 | FIXED | generator create fetchlist randomly | Critical | Closed
   Bug | NUTCH-433 | FIXED | java.io.EOFException in newer nightlies in mergesegs or indexing from hadoop.io.DataOutputBuffer | Critical | Closed
   Bug | NUTCH-318 | FIXED | log4j not proper configured, readdb doesnt give any information | Critical | Closed
   Bug | NUTCH-350 | FIXED | urls blocked db.fetch.retry.max * http.max.delays times during fetching are marked as STATUS_DB_GONE | Critical | Closed
   Bug | NUTCH-381 | WON'T FIX | Ignore external link not work as expected | Critical | Closed
   Bug | NUTCH-277 | CANNOT REPRODUCE | Fetcher dies because of "max. redirects" (avoiding infinite loop) | Critical | Closed
   Bug | NUTCH-331 | CANNOT REPRODUCE | Fetcher incorrectly reports task progress to tasktracker resulting in skipped URLs | Critical | Closed
   Bug | NUTCH-258 | CANNOT REPRODUCE | Once Nutch logs a SEVERE log item, Nutch fails forevermore | Critical | Closed
   Bug | NUTCH-417 | FIXED | After upgrade to hadoop-0.9.1, parsing and indexing doesn't work. | Major | Closed
   Bug | NUTCH-340 | FIXED | Bug(s) in 0.8 tutorial | Major | Closed
   Bug | NUTCH-347 | FIXED | Build: plugins' Jars not found | Major | Closed
   Bug | NUTCH-405 | FIXED | Content object is not properly initialized in map method of ParseSegment | Major | Closed
   Improvement | NUTCH-416 | FIXED | CrawlDatum status and CrawlDbReducer refactoring | Major | Closed
   Bug | NUTCH-371 | FIXED | DeleteDuplicates should remove documents with duplicate URLs | Major | Closed
   Bug | NUTCH-367 | FIXED | DistributedSearch thown ClassCastException | Major | Closed
   Bug | NUTCH-322 | FIXED | Fetcher discards ProtocolStatus, doesn't store redirected pages | Major | Closed
   Bug | NUTCH-337 | FIXED | Fetcher ignores the fetcher.parse value configured in config file | Major | Closed
   Bug | NUTCH-344 | FIXED | Fetcher threads blocked on synchronized block in cleanExpiredServerBlocks | Major | Closed
   Bug | NUTCH-404 | FIXED | Fix LinkDB Usage - implementation mismatch | Major | Closed
   Bug | NUTCH-418 | FIXED | Fixes parsing of XHTML (e.g. title) | Major | Closed
   Improvement | NUTCH-365 | FIXED | Flexible URL normalization | Major | Closed
   Bug | NUTCH-415 | FIXED | Generate should mark selected records in crawlDB | Major | Closed
   Bug | NUTCH-401 | FIXED | Hardcoded /tmp directory in SegmentReader | Major | Closed
   Improvement | NUTCH-395 | FIXED | Increase fetching speed | Major | Closed
   Bug | NUTCH-432 | FIXED | JAVA_PLATFORM with spaces (i.e. Mac OS X-ppc-32) breaks bin/nutch script | Major | Closed
   Improvement | NUTCH-403 | FIXED | Make URL filtering optional in Generator | Major | Closed
   Bug | NUTCH-437 | FIXED | MapFile in Hadoop Trunk has changed, must update references | Major | Closed
   Improvement | NUTCH-378 | FIXED | MetaWrapper decorator | Major | Closed
   Bug | NUTCH-406 | FIXED | Metadata tries to write null values | Major | Closed
   New Feature | NUTCH-646 | FIXED | New Indexing Framework for Nutch | Major | Closed
   New Feature | NUTCH-253 | FIXED | Normalize Host during Generate | Major | Closed
   Bug | NUTCH-428 | FIXED | NullPointerException | Major | Closed
   Improvement | NUTCH-614 | FIXED | Order Inlinks by OPIC score of parent page | Major | Closed
   Bug | NUTCH-379 | FIXED | ParseUtil does not pass through the content's URL to the ParserFactory | Major | Closed
   Bug | NUTCH-391 | FIXED | ParseUtil logs file contents to log file when it cannot find parser | Major | Closed
   Bug | NUTCH-384 | FIXED | Protocol-file plugin does not allow the parse plugins framework to operate properly | Major | Closed
   Bug | NUTCH-362 | FIXED | Remove parse-text from unsupported filetypes in parse-plugins.xml | Major | Closed
   Bug | NUTCH-394 | FIXED | Searching via Tomcat / nutch-0.9-dev.war raises exception | Major | Closed
   Task | NUTCH-360 | FIXED | Switch nutch to use java 5 source format | Major | Closed
   Bug | NUTCH-305 | FIXED | Update crawl and url filter lists to exclude jpeg|JPEG|bmp|BMP | Major | Closed
   Improvement | NUTCH-459 | FIXED | Upgrade Nutch to Hadoop 0.12.1 | Major | Closed
  Viewing 50 of 74 Issues.
Issues per component:
   documentation: 2
   fetcher: 21
   generator: 5
   indexer: 7
   linkdb: 1
   searcher: 4
   web gui: 2
   No Component: 36


Version Summary

Resolved: 1 (1%)
Closed: 73 (99%)

Open Issues

By Priority
No issues

By Assignee
No issues