Summary
Project: Nutch
Resolutions: Unresolved
Priorities: Minor
Sorted by: Key descending
Issue Navigator
Displaying issues 1 to 50 of 83 matching issues.
| Patch Info | Key | Summary | Assignee | Reporter | Status | Resolution | Created | Updated | Due |
|---|---|---|---|---|---|---|---|---|---|
| Patch Available | NUTCH-769 | Fetcher to skip queues for URLS getting repeated exceptions | Unassigned | Julien Nioche | Open | UNRESOLVED | 23/Nov/09 | 23/Nov/09 | |
| | NUTCH-763 | Separate configuration files from resources to be included in the job file | Unassigned | Julien Nioche | Open | UNRESOLVED | 05/Nov/09 | 05/Nov/09 | |
| Patch Available | NUTCH-761 | Avoid cloning CrawlDatum in CrawlDbReducer | Unassigned | Julien Nioche | Open | UNRESOLVED | 03/Nov/09 | 03/Nov/09 | |
| | NUTCH-759 | Removal of deprecated APIs | Unassigned | Stephen Norman | Open | UNRESOLVED | 14/Oct/09 | 14/Oct/09 | |
| Patch Available | NUTCH-750 | HtmlParser plugin - page title extraction | Unassigned | Alexey Torochkov | Open | UNRESOLVED | 29/Aug/09 | 29/Aug/09 | |
| Patch Available | NUTCH-741 | Job file includes multiple copies of nutch config files. | Unassigned | Kirby Bohling | Open | UNRESOLVED | 29/May/09 | 29/May/09 | |
| Patch Available | NUTCH-740 | Configuration option to override default language for fetched pages. | Otis Gospodnetic | Marcin Okraszewski | Open | UNRESOLVED | 28/May/09 | 09/Jun/09 | |
| Patch Available | NUTCH-738 | Close SegmentUpdater when FetchedSegments is closed | Unassigned | Martina Koch | Open | UNRESOLVED | 26/May/09 | 04/Aug/09 | |
| Patch Available | NUTCH-737 | urlnormalizer-unalias plugin | Unassigned | Dmitry Lihachev | Open | UNRESOLVED | 26/May/09 | 26/May/09 | |
| Patch Available | NUTCH-718 | urlfilter-subnets plugin | Unassigned | Dmitry Lihachev | Open | UNRESOLVED | 12/Mar/09 | 13/Mar/09 | |
| Patch Available | NUTCH-713 | Config options for webgraph Scoring not documented | Unassigned | Eric J. Christeson | Open | UNRESOLVED | 09/Mar/09 | 09/Mar/09 | |
| | NUTCH-710 | Support for rel="canonical" attribute | Unassigned | Frank McCown | Open | UNRESOLVED | 03/Mar/09 | 03/Mar/09 | |
| | NUTCH-706 | Url regex normalizer | Unassigned | Meghna Kukreja | Open | UNRESOLVED | 27/Feb/09 | 26/Mar/09 | |
| Patch Available | NUTCH-705 | parse-rtf plugin | Unassigned | Dmitry Lihachev | Open | UNRESOLVED | 27/Feb/09 | 10/Mar/09 | |
| Patch Available | NUTCH-693 | Add configurable option for treating nofollow behaviour. | Otis Gospodnetic | Andrew McCall | Open | UNRESOLVED | 18/Feb/09 | 28/May/09 | |
| | NUTCH-690 | bug in DomContentUtils.shouldThrowAwayLink? | Unassigned | Peter Sparks | Open | UNRESOLVED | 17/Feb/09 | 17/Feb/09 | |
| | NUTCH-673 | Upgrade the Carrot2 plug-in to release 3.0 | Unassigned | Sean Dean | Open | UNRESOLVED | 15/Dec/08 | 06/Feb/09 | |
| | NUTCH-670 | feed plugin does not parse RSS2 enclosures | Unassigned | Todd Lipcon | Open | UNRESOLVED | 10/Dec/08 | 12/Jan/09 | |
| | NUTCH-664 | Possibility to update already stored documents. | Unassigned | Sergey Khilkov | Open | UNRESOLVED | 26/Nov/08 | 21/Jan/09 | |
| Patch Available | NUTCH-655 | Injecting Crawl metadata | Unassigned | Julien Nioche | Open | UNRESOLVED | 01/Oct/08 | 23/Jan/09 | |
| | NUTCH-648 | debian style autocomplete | Unassigned | Jim | Open | UNRESOLVED | 28/Aug/08 | 27/Dec/08 | |
| Patch Available | NUTCH-638 | Launching Distributed Searchers with URI indicating filesystem to use rather than relying on hadoop config files. | Unassigned | Aaron Nall | Open | UNRESOLVED | 28/Jul/08 | 13/Oct/08 | |
| | NUTCH-625 | Non-ascii character broken in dumped content for mixed encoding (utf-8 and multi-byte) | Unassigned | Vinci | Open | UNRESOLVED | 30/Mar/08 | 27/Nov/08 | |
| | NUTCH-609 | Allow Plugins to be Loaded from Jar File(s) | Dennis Kubes | Dennis Kubes | Open | UNRESOLVED | 09/Feb/08 | 17/Feb/09 | 13/Feb/08 |
| | NUTCH-589 | Hierarchical Classloaders | Unassigned | Ryan Levering | Open | UNRESOLVED | 05/Dec/07 | 05/Dec/07 | |
| | NUTCH-585 | [PARSE-HTML plugin] Block certain parts of HTML code from being indexed | Unassigned | Andrea Spinelli | Open | UNRESOLVED | 29/Nov/07 | 29/Oct/09 | |
| | NUTCH-577 | Use explicit tika-config.xml file to enable mime magic detection to be turned on and off | Chris A. Mattmann | Chris A. Mattmann | Open | UNRESOLVED | 17/Nov/07 | 20/Feb/09 | 30/Nov/07 |
| | NUTCH-576 | Different Analyzers Support | Unassigned | Rajasekar Karthik | Open | UNRESOLVED | 14/Nov/07 | 14/Nov/07 | |
| Patch Available | NUTCH-570 | Improvement of URL Ordering in Generator.java | Otis Gospodnetic | Ned Rockson | Open | UNRESOLVED | 26/Oct/07 | 21/May/08 | |
| | NUTCH-569 | Protocol plugins should report progress to the fetcher | Unassigned | Andrzej Bialecki | Open | UNRESOLVED | 23/Oct/07 | 23/Oct/07 | |
| | NUTCH-566 | Sun's URL class has bug in creation of relative query URLs | Unassigned | Doug Cook | Open | UNRESOLVED | 10/Oct/07 | 14/Mar/08 | |
| | NUTCH-564 | External parser supports encoding attribute | Unassigned | Antony Bowesman | Open | UNRESOLVED | 03/Oct/07 | 17/Feb/09 | |
| | NUTCH-542 | Null Pointer Exception on getSummary when segment no longer exists | Unassigned | Jeff V. | Open | UNRESOLVED | 20/Aug/07 | 20/Aug/07 | |
| | NUTCH-490 | Extension point with filters for Neko HTML parser (with patch) | Unassigned | Marcin Okraszewski | Open | UNRESOLVED | 22/May/07 | 27/May/09 | |
| Patch Available | NUTCH-477 | Extend URLFilters to support different filtering chains | Andrzej Bialecki | Andrzej Bialecki | Open | UNRESOLVED | 03/May/07 | 24/Apr/09 | |
| | NUTCH-470 | Adding optional terms to a query | Unassigned | Trond Andersen | Open | UNRESOLVED | 24/Apr/07 | 09/May/07 | |
| | NUTCH-453 | Move stop words to a config file | Unassigned | Steve Severance | Open | UNRESOLVED | 02/Mar/07 | 02/Mar/07 | |
| | NUTCH-449 | Format of junit output should be configurable | Doug Cutting | Nigel Daley | Open | UNRESOLVED | 23/Feb/07 | 23/Feb/07 | |
| | NUTCH-435 | Synonym-Editor that creates OWL for the ontology plugin | Unassigned | Urs Krebs | Open | UNRESOLVED | 26/Jan/07 | 29/Mar/07 | |
| | NUTCH-431 | Move plugin specific properties out of nutch-site.xml and into specific conf files for plugins | Chris A. Mattmann | Chris A. Mattmann | Open | UNRESOLVED | 20/Jan/07 | 26/Jan/07 | |
| | NUTCH-427 | protocol-smb: plugin protocol implementing the CIFS/SMB protocol. This protocol allows Nutch to crawl Microsoft Windows Shares remotely using the CIFS/SMB protocol implementation. | Unassigned | Armel Nene | Open | UNRESOLVED | 05/Jan/07 | 08/Nov/08 | |
| | NUTCH-423 | Add other index-basic fields as query plugins | Unassigned | stack | Open | UNRESOLVED | 29/Dec/06 | 29/Dec/06 | |
| | NUTCH-412 | plugin to parse the feed-url (rss/atom) of a blog | Unassigned | Renaud Richardet | Open | UNRESOLVED | 03/Dec/06 | 14/Sep/07 | |
| | NUTCH-410 | Faster RegexNormalize with more features | Unassigned | Doug Cook | Open | UNRESOLVED | 29/Nov/06 | 29/Nov/06 | |
| | NUTCH-409 | Add "short circuit" notion to filters to speedup mixed site/subsite crawling | Unassigned | Doug Cook | Open | UNRESOLVED | 26/Nov/06 | 26/Nov/06 | |
| | NUTCH-396 | mergesegs sorts URLs, making segments useless for subsequent fetch | Unassigned | Doug Cook | Open | UNRESOLVED | 03/Nov/06 | 03/Nov/06 | |
| | NUTCH-389 | a url tokenizer implementation for tokenizing index fields : url and host | Unassigned | Enis Soztutar | Open | UNRESOLVED | 20/Oct/06 | 07/Nov/06 | |
| | NUTCH-386 | Plugin to index categories by url rules | Unassigned | Ernesto De Santis | Open | UNRESOLVED | 14/Oct/06 | 16/May/09 | |
| | NUTCH-363 | Fetcher normalizes everything at least twice | Unassigned | Doug Cook | Open | UNRESOLVED | 08/Sep/06 | 16/Jan/08 | |
| | NUTCH-355 | The title of query result could like the summary have the highlight?? | Unassigned | King Kong | Open | UNRESOLVED | 20/Aug/06 | 22/Sep/08 | |