Summary
Project: Nutch
Resolutions: Unresolved
Priorities: Major
Sorted by: Key descending
Issue Navigator
Displaying issues 1 to 50 of 101 matching issues.
| Patch Info | Key | Summary | Assignee | Reporter | Status | Res | Created | Updated | Due |
| | NUTCH-771 | Add WebGraph classes to the bin/nutch script | Dennis Kubes | Dennis Kubes | Open | UNRESOLVED | 24/Nov/09 | 24/Nov/09 | 27/Nov/09 |
| Patch Available | NUTCH-770 | Timebomb for Fetcher | Unassigned | Julien Nioche | Open | UNRESOLVED | 23/Nov/09 | 23/Nov/09 | |
| | NUTCH-768 | Upgrade Nutch 1.0 to use Hadoop 0.20 | Dennis Kubes | Dennis Kubes | Open | UNRESOLVED | 21/Nov/09 | 24/Nov/09 | 24/Nov/09 |
| Patch Available | NUTCH-767 | Update version of Tika for the MimeType detection | Chris A. Mattmann | Julien Nioche | Open | UNRESOLVED | 18/Nov/09 | 18/Nov/09 | |
| Patch Available | NUTCH-766 | Tika parser | Unassigned | Julien Nioche | Open | UNRESOLVED | 18/Nov/09 | 18/Nov/09 | |
| Patch Available | NUTCH-762 | Alternative Generator which can generate several segments in one parse of the crawlDB | Unassigned | Julien Nioche | Open | UNRESOLVED | 03/Nov/09 | 03/Nov/09 | |
| Patch Available | NUTCH-760 | Allow field mapping from nutch to solr index | Unassigned | David Stuart | Open | UNRESOLVED | 15/Oct/09 | 27/Oct/09 | |
| | NUTCH-755 | DomainURLFilter crashes on malformed URL | Unassigned | Mike Baranczak | Open | UNRESOLVED | 17/Sep/09 | 26/Oct/09 | |
| | NUTCH-753 | Prevent new Fetcher to retrieve the robots twice | Unassigned | Julien Nioche | Open | UNRESOLVED | 07/Sep/09 | 07/Sep/09 | |
| | NUTCH-751 | Upgrade version of HttpClient | Unassigned | Julien Nioche | Open | UNRESOLVED | 04/Sep/09 | 09/Sep/09 | |
| Patch Available | NUTCH-747 | inject&Index metadatas and inherit these metadatas to all matching suburls | Unassigned | Marko Bauhardt | Open | UNRESOLVED | 06/Aug/09 | 06/Aug/09 | |
| Patch Available | NUTCH-746 | NutchBeanConstructor does not close NutchBean upon contextDestroyed, causing resource leak in the container. | Unassigned | Kirby Bohling | Open | UNRESOLVED | 26/Jul/09 | 04/Aug/09 | |
| | NUTCH-745 | MyHtmlParser getParse return not null,so all Analyzer-(zh|fr) cannot run | Unassigned | jcore_XiaTian | Open | UNRESOLVED | 10/Jul/09 | 10/Jul/09 | |
| | NUTCH-739 | SolrDeleteDuplications too slow when using hadoop | Unassigned | Dmitry Lihachev | Open | UNRESOLVED | 28/May/09 | 29/May/09 | |
| | NUTCH-734 | option to filter "a" tag text | Unassigned | ron | Open | UNRESOLVED | 02/May/09 | 02/May/09 | |
| Patch Available | NUTCH-733 | plain text view of cached files ignores HTML encoding | Unassigned | Ilguiz Latypov | Open | UNRESOLVED | 30/Apr/09 | 07/Jun/09 | |
| | NUTCH-729 | NPE in FieldIndexer when BasicFields url doesn't exist | Dennis Kubes | Dennis Kubes | Open | UNRESOLVED | 25/Mar/09 | 23/Jun/09 | 26/Mar/09 |
| | NUTCH-728 | Improve nutch release packaging | Unassigned | Sami Siren | Open | UNRESOLVED | 19/Mar/09 | 20/Mar/09 | |
| | NUTCH-719 | fetchQueues.totalSize incorrect in Fetcher2 | Unassigned | Julien Nioche | Open | UNRESOLVED | 12/Mar/09 | 13/Jul/09 | |
| | NUTCH-717 | Make Nutch Solr integration easier | Unassigned | Sami Siren | Open | UNRESOLVED | 10/Mar/09 | 09/Jul/09 | |
| Patch Available | NUTCH-716 | Make subcollection index filed multivalued | Unassigned | Dmitry Lihachev | Open | UNRESOLVED | 10/Mar/09 | 22/May/09 | |
| | NUTCH-714 | Need a SFTP and SCP Protocol Handler | Chris A. Mattmann | Sanjoy Ghosh | Open | UNRESOLVED | 10/Mar/09 | 24/Mar/09 | |
| Patch Available | NUTCH-712 | ParseOutputFormat should catch java.net.MalformedURLException coming from normalizers | Unassigned | Julien Nioche | Open | UNRESOLVED | 06/Mar/09 | 06/Mar/09 | |
| | NUTCH-709 | JSParseFilter gets into an infinate loop and ets all the stack | Unassigned | Tim Hawkins | Open | UNRESOLVED | 03/Mar/09 | 07/Jun/09 | |
| | NUTCH-708 | NutchBean: OOM due to searcher.max.hits and dedup. | Unassigned | Aaron Binns | Open | UNRESOLVED | 01/Mar/09 | 01/Mar/09 | |
| | NUTCH-692 | AlreadyBeingCreatedException with Hadoop 0.19 | Unassigned | Julien Nioche | Open | UNRESOLVED | 18/Feb/09 | 16/Sep/09 | |
| | NUTCH-689 | Swf parser doesn't seem to handle relative links | Unassigned | Peter Sparks | Open | UNRESOLVED | 17/Feb/09 | 18/Feb/09 | |
| | NUTCH-685 | Content-level redirect status lost in ParseSegment | Andrzej Bialecki | Andrzej Bialecki | Open | UNRESOLVED | 06/Feb/09 | 06/Feb/09 | |
| Patch Available | NUTCH-677 | Segment merge filering based on segment content | Unassigned | Marcin Okraszewski | Open | UNRESOLVED | 08/Jan/09 | 08/Oct/09 | |
| | NUTCH-674 | NutchBean doesn't check for searcher.dir existance. | Unassigned | Kuba Kończyk | Open | UNRESOLVED | 18/Dec/08 | 18/Dec/08 | |
| | NUTCH-666 | Analysis plugins for multiple language and new Language Identifier Tool | Dennis Kubes | Dennis Kubes | Open | UNRESOLVED | 26/Nov/08 | 23/Jan/09 | 27/Nov/08 |
| | NUTCH-650 | Hbase Integration | Doğacan Güney | Doğacan Güney | Open | UNRESOLVED | 18/Sep/08 | 16/Aug/09 | |
| | NUTCH-649 | Log list of files found but not crawled. | Unassigned | Jim | Open | UNRESOLVED | 28/Aug/08 | 28/Aug/08 | |
| Patch Available | NUTCH-644 | RTF parser doesn't compile anymore | Unassigned | Guillaume Smet | Open | UNRESOLVED | 08/Aug/08 | 27/Feb/09 | |
| Patch Available | NUTCH-629 | Detect slow and timeout servers and drop their URLs | Otis Gospodnetic | Otis Gospodnetic | Open | UNRESOLVED | 12/Apr/08 | 21/May/08 | |
| | NUTCH-628 | Host database to keep track of host-level information | Unassigned | Otis Gospodnetic | Open | UNRESOLVED | 12/Apr/08 | 28/Jan/09 | |
| | NUTCH-622 | Support for application/x-suggestions+json | Unassigned | Bobby Hubbard | Open | UNRESOLVED | 26/Mar/08 | 26/Mar/08 | |
| | NUTCH-619 | Another Language Identifier Plugin using Unicode code point range | Unassigned | Vinci | Open | UNRESOLVED | 15/Mar/08 | 15/Mar/08 | |
| | NUTCH-595 | "Target file:/.... already exists" | Unassigned | Andrzej Bialecki | Open | UNRESOLVED | 27/Dec/07 | 20/Jan/08 | |
| | NUTCH-583 | FeedParser empty links for items | Enis Soztutar | Enis Soztutar | Open | UNRESOLVED | 27/Nov/07 | 18/Feb/09 | |
| Patch Available | NUTCH-578 | URL fetched with 403 is generated over and over again | Dennis Kubes | Nathaniel Powell | In Progress | UNRESOLVED | 20/Nov/07 | 31/Mar/09 | |
| Patch Available | NUTCH-573 | Multiple Domains - Query Search | Enis Soztutar | Rajasekar Karthik | Open | UNRESOLVED | 07/Nov/07 | 11/Nov/09 | |
| | NUTCH-568 | Indexer does not update the Lucene "TITLE" field | Unassigned | smorales | Open | UNRESOLVED | 19/Oct/07 | 22/Oct/07 | |
| | NUTCH-558 | Need tool to retrieve domain statistics | Chris Schneider | Chris Schneider | In Progress | UNRESOLVED | 19/Sep/07 | 03/Feb/09 | |
| | NUTCH-541 | Index url field untokenized | Enis Soztutar | Enis Soztutar | Open | UNRESOLVED | 09/Aug/07 | 20/Feb/09 | |
| | NUTCH-540 | some problem about the Nutch cache | Unassigned | crossany | Open | UNRESOLVED | 09/Aug/07 | 17/Feb/09 | |
| | NUTCH-537 | TestMP3Parser.java, TestRTFParser.java, TestMSWordParser.java compile | Unassigned | Hasan Diwan | Open | UNRESOLVED | 07/Aug/07 | 08/Aug/07 | |
| | NUTCH-523 | web2 searchform problems with patch | Unassigned | Hal Finkel | Open | UNRESOLVED | 21/Jul/07 | 21/Jul/07 | |
| | NUTCH-521 | Modified injector to allow newly injected CrawlDatum to overwrite original | Unassigned | Rob Young | Open | UNRESOLVED | 19/Jul/07 | 19/Jul/07 | |
| | NUTCH-519 | prased incorrectly | Unassigned | Chris Hane | Open | UNRESOLVED | 18/Jul/07 | 18/Jul/07 | |