Summary
Project: Nutch
Status: Open
Sorted by: Key descending
Issue Navigator
Displaying issues 1 to 50 of 202 matching issues.
| Patch Info | Key | Summary | Assignee | Reporter | Status | Resolution | Created | Updated | Due |
|---|---|---|---|---|---|---|---|---|---|
| | NUTCH-771 | Add WebGraph classes to the bin/nutch script | Dennis Kubes | Dennis Kubes | Open | UNRESOLVED | 24/Nov/09 | 24/Nov/09 | 27/Nov/09 |
| Patch Available | NUTCH-770 | Timebomb for Fetcher | Unassigned | Julien Nioche | Open | UNRESOLVED | 23/Nov/09 | 23/Nov/09 | |
| Patch Available | NUTCH-769 | Fetcher to skip queues for URLS getting repeated exceptions | Unassigned | Julien Nioche | Open | UNRESOLVED | 23/Nov/09 | 23/Nov/09 | |
| | NUTCH-768 | Upgrade Nutch 1.0 to use Hadoop 0.20 | Dennis Kubes | Dennis Kubes | Open | UNRESOLVED | 21/Nov/09 | 24/Nov/09 | 24/Nov/09 |
| Patch Available | NUTCH-767 | Update version of Tika for the MimeType detection | Chris A. Mattmann | Julien Nioche | Open | UNRESOLVED | 18/Nov/09 | 18/Nov/09 | |
| Patch Available | NUTCH-766 | Tika parser | Unassigned | Julien Nioche | Open | UNRESOLVED | 18/Nov/09 | 18/Nov/09 | |
| Patch Available | NUTCH-764 | Add support for vfsfile:// loading of plugins for JBoss | Unassigned | tcurran@approachingpi.com | Open | UNRESOLVED | 10/Nov/09 | 10/Nov/09 | |
| | NUTCH-763 | Separate configuration files from resources to be included in the job file | Unassigned | Julien Nioche | Open | UNRESOLVED | 05/Nov/09 | 05/Nov/09 | |
| Patch Available | NUTCH-762 | Alternative Generator which can generate several segments in one parse of the crawlDB | Unassigned | Julien Nioche | Open | UNRESOLVED | 03/Nov/09 | 03/Nov/09 | |
| Patch Available | NUTCH-761 | Avoid cloningCrawlDatum in CrawlDbReducer | Unassigned | Julien Nioche | Open | UNRESOLVED | 03/Nov/09 | 03/Nov/09 | |
| Patch Available | NUTCH-760 | Allow field mapping from nutch to solr index | Unassigned | David Stuart | Open | UNRESOLVED | 15/Oct/09 | 27/Oct/09 | |
| | NUTCH-759 | Removal of deprecated APIs | Unassigned | Stephen Norman | Open | UNRESOLVED | 14/Oct/09 | 14/Oct/09 | |
| | NUTCH-755 | DomainURLFilter crashes on malformed URL | Unassigned | Mike Baranczak | Open | UNRESOLVED | 17/Sep/09 | 26/Oct/09 | |
| | NUTCH-753 | Prevent new Fetcher to retrieve the robots twice | Unassigned | Julien Nioche | Open | UNRESOLVED | 07/Sep/09 | 07/Sep/09 | |
| | NUTCH-751 | Upgrade version of HttpClient | Unassigned | Julien Nioche | Open | UNRESOLVED | 04/Sep/09 | 09/Sep/09 | |
| Patch Available | NUTCH-750 | HtmlParser plugin - page title extraction | Unassigned | Alexey Torochkov | Open | UNRESOLVED | 29/Aug/09 | 29/Aug/09 | |
| Patch Available | NUTCH-747 | inject&Index metadatas and inherit these metadatas to all matching suburls | Unassigned | Marko Bauhardt | Open | UNRESOLVED | 06/Aug/09 | 06/Aug/09 | |
| Patch Available | NUTCH-746 | NutchBeanConstructor does not close NutchBean upon contextDestroyed, causing resource leak in the container. | Unassigned | Kirby Bohling | Open | UNRESOLVED | 26/Jul/09 | 04/Aug/09 | |
| | NUTCH-745 | MyHtmlParser getParse return not null,so all Analyzer-(zh\|fr) cannot run | Unassigned | jcore_XiaTian | Open | UNRESOLVED | 10/Jul/09 | 10/Jul/09 | |
| Patch Available | NUTCH-741 | Job file includes multiple copies of nutch config files. | Unassigned | Kirby Bohling | Open | UNRESOLVED | 29/May/09 | 29/May/09 | |
| Patch Available | NUTCH-740 | Configuration option to override default language for fetched pages. | Otis Gospodnetic | Marcin Okraszewski | Open | UNRESOLVED | 28/May/09 | 09/Jun/09 | |
| | NUTCH-739 | SolrDeleteDuplications too slow when using hadoop | Unassigned | Dmitry Lihachev | Open | UNRESOLVED | 28/May/09 | 29/May/09 | |
| Patch Available | NUTCH-738 | Close SegmentUpdater when FetchedSegments is closed | Unassigned | Martina Koch | Open | UNRESOLVED | 26/May/09 | 04/Aug/09 | |
| Patch Available | NUTCH-737 | urlnormalizer-unalias plugin | Unassigned | Dmitry Lihachev | Open | UNRESOLVED | 26/May/09 | 26/May/09 | |
| | NUTCH-734 | option to filter "a" tag text | Unassigned | ron | Open | UNRESOLVED | 02/May/09 | 02/May/09 | |
| Patch Available | NUTCH-733 | plain text view of cached files ignores HTML encoding | Unassigned | Ilguiz Latypov | Open | UNRESOLVED | 30/Apr/09 | 07/Jun/09 | |
| | NUTCH-732 | Subcollection plugin not working on Nutch-1.0 | Unassigned | Filipe Antunes | Open | UNRESOLVED | 07/Apr/09 | 07/Apr/09 | |
| | NUTCH-729 | NPE in FieldIndexer when BasicFields url doesn't exist | Dennis Kubes | Dennis Kubes | Open | UNRESOLVED | 25/Mar/09 | 23/Jun/09 | 26/Mar/09 |
| | NUTCH-728 | Improve nutch release packaging | Unassigned | Sami Siren | Open | UNRESOLVED | 19/Mar/09 | 20/Mar/09 | |
| | NUTCH-719 | fetchQueues.totalSize incorrect in Fetcher2 | Unassigned | Julien Nioche | Open | UNRESOLVED | 12/Mar/09 | 13/Jul/09 | |
| Patch Available | NUTCH-718 | urlfilter-subnets plugin | Unassigned | Dmitry Lihachev | Open | UNRESOLVED | 12/Mar/09 | 13/Mar/09 | |
| | NUTCH-717 | Make Nutch Solr integration easier | Unassigned | Sami Siren | Open | UNRESOLVED | 10/Mar/09 | 09/Jul/09 | |
| Patch Available | NUTCH-716 | Make subcollection index filed multivalued | Unassigned | Dmitry Lihachev | Open | UNRESOLVED | 10/Mar/09 | 22/May/09 | |
| | NUTCH-714 | Need a SFTP and SCP Protocol Handler | Chris A. Mattmann | Sanjoy Ghosh | Open | UNRESOLVED | 10/Mar/09 | 24/Mar/09 | |
| Patch Available | NUTCH-713 | Config options for webgraph Scoring not documented | Unassigned | Eric J. Christeson | Open | UNRESOLVED | 09/Mar/09 | 09/Mar/09 | |
| Patch Available | NUTCH-712 | ParseOutputFormat should catch java.net.MalformedURLException coming from normalizers | Unassigned | Julien Nioche | Open | UNRESOLVED | 06/Mar/09 | 06/Mar/09 | |
| | NUTCH-710 | Support for rel="canonical" attribute | Unassigned | Frank McCown | Open | UNRESOLVED | 03/Mar/09 | 03/Mar/09 | |
| | NUTCH-709 | JSParseFilter gets into an infinate loop and ets all the stack | Unassigned | Tim Hawkins | Open | UNRESOLVED | 03/Mar/09 | 07/Jun/09 | |
| | NUTCH-708 | NutchBean: OOM due to searcher.max.hits and dedup. | Unassigned | Aaron Binns | Open | UNRESOLVED | 01/Mar/09 | 01/Mar/09 | |
| | NUTCH-706 | Url regex normalizer | Unassigned | Meghna Kukreja | Open | UNRESOLVED | 27/Feb/09 | 26/Mar/09 | |
| Patch Available | NUTCH-705 | parse-rtf plugin | Unassigned | Dmitry Lihachev | Open | UNRESOLVED | 27/Feb/09 | 10/Mar/09 | |
| Patch Available | NUTCH-697 | Generate log output for solr indexer and dedup | Unassigned | Dmitry Lihachev | Open | UNRESOLVED | 20/Feb/09 | 20/Feb/09 | |
| Patch Available | NUTCH-693 | Add configurable option for treating nofollow behaviour. | Otis Gospodnetic | Andrew McCall | Open | UNRESOLVED | 18/Feb/09 | 28/May/09 | |
| | NUTCH-692 | AlreadyBeingCreatedException with Hadoop 0.19 | Unassigned | Julien Nioche | Open | UNRESOLVED | 18/Feb/09 | 16/Sep/09 | |
| | NUTCH-690 | bug in DomContentUtils.shouldThrowAwayLink? | Unassigned | Peter Sparks | Open | UNRESOLVED | 17/Feb/09 | 17/Feb/09 | |
| | NUTCH-689 | Swf parser doesn't seem to handle relative links | Unassigned | Peter Sparks | Open | UNRESOLVED | 17/Feb/09 | 18/Feb/09 | |
| | NUTCH-685 | Content-level redirect status lost in ParseSegment | Andrzej Bialecki | Andrzej Bialecki | Open | UNRESOLVED | 06/Feb/09 | 06/Feb/09 | |
| Patch Available | NUTCH-677 | Segment merge filering based on segment content | Unassigned | Marcin Okraszewski | Open | UNRESOLVED | 08/Jan/09 | 08/Oct/09 | |
| | NUTCH-674 | NutchBean doesn't check for searcher.dir existance. | Unassigned | Kuba Kończyk | Open | UNRESOLVED | 18/Dec/08 | 18/Dec/08 | |
| | NUTCH-673 | Upgrade the Carrot2 plug-in to release 3.0 | Unassigned | Sean Dean | Open | UNRESOLVED | 15/Dec/08 | 06/Feb/09 | |