Nutch: 0.9.0 (Fix For Version)
Release Date: 02/Apr/07
Description: Nutch 0.9 release
Progress: 74 of 74 issues have been resolved
Components (with all issues in each component for this version)
NUTCH-354 (FIXED): MapWritable, nextEntry is not reset when Entries are recycled
NUTCH-400 (FIXED): Update & add missing license headers
NUTCH-273 (FIXED): When a page is redirected, the original url is NOT updated.
NUTCH-332 (FIXED): doubling score causes by page internal anchors.
NUTCH-233 (FIXED): wrong regular expression hang reduce process for ever
NUTCH-336 (FIXED): Harvested links shouldn't get db.score.injected in addition to inbound contributions
NUTCH-341 (FIXED): IndexMerger now deletes entire <workingdir> after completing
NUTCH-105 (FIXED): Network error during robots.txt fetch causes file to be ignored
NUTCH-167 (FIXED): Observation of <META NAME="ROBOTS" CONTENT="NOARCHIVE"> directive
NUTCH-361 (FIXED): generator create fetchlist randomly
NUTCH-433 (FIXED): java.io.EOFException in newer nightlies in mergesegs or indexing from hadoop.io.DataOutputBuffer
NUTCH-318 (FIXED): log4j not proper configured, readdb doesnt give any information
NUTCH-350 (FIXED): urls blocked db.fetch.retry.max * http.max.delays times during fetching are marked as STATUS_DB_GONE
NUTCH-381 (WON'T FIX): Ignore external link not work as expected
NUTCH-277 (CANNOT REPRODUCE): Fetcher dies because of "max. redirects" (avoiding infinite loop)
NUTCH-331 (CANNOT REPRODUCE): Fetcher incorrectly reports task progress to tasktracker resulting in skipped URLs
NUTCH-258 (CANNOT REPRODUCE): Once Nutch logs a SEVERE log item, Nutch fails forevermore
NUTCH-417 (FIXED): After upgrade to hadoop-0.9.1, parsing and indexing doesn't work.
NUTCH-340 (FIXED): Bug(s) in 0.8 tutorial
NUTCH-347 (FIXED): Build: plugins' Jars not found
NUTCH-405 (FIXED): Content object is not properly initialized in map method of ParseSegment
NUTCH-416 (FIXED): CrawlDatum status and CrawlDbReducer refactoring
NUTCH-371 (FIXED): DeleteDuplicates should remove documents with duplicate URLs
NUTCH-367 (FIXED): DistributedSearch thown ClassCastException
NUTCH-322 (FIXED): Fetcher discards ProtocolStatus, doesn't store redirected pages
NUTCH-337 (FIXED): Fetcher ignores the fetcher.parse value configured in config file
NUTCH-344 (FIXED): Fetcher threads blocked on synchronized block in cleanExpiredServerBlocks
NUTCH-404 (FIXED): Fix LinkDB Usage - implementation mismatch
NUTCH-418 (FIXED): Fixes parsing of XHTML (e.g. title)
NUTCH-365 (FIXED): Flexible URL normalization
NUTCH-415 (FIXED): Generate should mark selected records in crawlDB
NUTCH-401 (FIXED): Hardcoded /tmp directory in SegmentReader
NUTCH-395 (FIXED): Increase fetching speed
NUTCH-432 (FIXED): JAVA_PLATFORM with spaces (i.e. Mac OS X-ppc-32) breaks bin/nutch script
NUTCH-403 (FIXED): Make URL filtering optional in Generator
NUTCH-437 (FIXED): MapFile in Hadoop Trunk has changed, must update references
NUTCH-378 (FIXED): MetaWrapper decorator
NUTCH-406 (FIXED): Metadata tries to write null values
NUTCH-646 (FIXED): New Indexing Framework for Nutch
NUTCH-253 (FIXED): Normalize Host during Generate
NUTCH-428 (FIXED): NullPointerException
NUTCH-614 (FIXED): Order Inlinks by OPIC score of parent page
NUTCH-379 (FIXED): ParseUtil does not pass through the content's URL to the ParserFactory
NUTCH-391 (FIXED): ParseUtil logs file contents to log file when it cannot find parser
NUTCH-384 (FIXED): Protocol-file plugin does not allow the parse plugins framework to operate properly
NUTCH-362 (FIXED): Remove parse-text from unsupported filetypes in parse-plugins.xml
NUTCH-394 (FIXED): Searching via Tomcat / nutch-0.9-dev.war raises exception
NUTCH-360 (FIXED): Switch nutch to use java 5 source format
NUTCH-305 (FIXED): Update crawl and url filter lists to exclude jpeg|JPEG|bmp|BMP
NUTCH-459 (FIXED): Upgrade Nutch to Hadoop 0.12.1
Viewing 50 of 74 Issues.
documentation: 2
fetcher: 21
generator: 5
indexer: 7
linkdb: 1
searcher: 4
web gui: 2
No Component: 36
Version Summary
Resolved: 1 (1%)
Closed: 73 (99%)
Open Issues
By Priority: No issues
By Assignee: No issues