Issue Details

Key: NUTCH-279
Type: Improvement
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Andrzej Bialecki
Reporter: Stefan Neufeind
Votes: 3
Watchers: 0
Nutch

Additions for regex-normalize

Created: 22/May/06 08:09 PM   Updated: 10/Apr/09 12:29 PM
Component/s: None
Affects Version/s: 0.8
Fix Version/s: 1.0.0

Time Tracking:
Not Specified

File Attachments:
regex-normalize.patch (text file, 4 kB, licensed for inclusion in ASF works), 2006-05-22 08:12 PM, Stefan Neufeind
regex-normalize2.patch (text file, 4 kB, licensed for inclusion in ASF works), 2006-07-09 10:32 PM, Stefan Neufeind
Issue Links:
Incorporates
 

Resolution Date: 03/Feb/09 03:16 PM


Description
IMHO, two things are needed:
1) Extend the normalize rules to cover commonly used session IDs etc.
2) Ship a checker so that rules can easily be tested by hand.
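
To illustrate both points, here is a minimal, self-contained sketch of regex-based session-ID stripping together with a tiny stdin checker. It is not the code from the attached patches and does not use the Nutch API. The class name `SessionIdNormalizerSketch`, the `Rule` helper, and the three patterns (including the parameter names `jsessionid`, `phpsessid`, `sid`, `sessionid`) are assumptions chosen for illustration only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch, not Nutch code: applies ordered regex rules to a URL.
public class SessionIdNormalizerSketch {

    // One rule: a compiled pattern and the text that replaces each match.
    private static final class Rule {
        final Pattern pattern;
        final String substitution;

        Rule(String regex, String substitution) {
            this.pattern = Pattern.compile(regex);
            this.substitution = substitution;
        }
    }

    // Illustrative rules only; real rule sets would live in a config file.
    private static final List<Rule> RULES = new ArrayList<>();
    static {
        // Remove a ";jsessionid=..." path parameter (case-insensitive).
        RULES.add(new Rule("(?i);jsessionid=[^?#&]*", ""));
        // Remove assumed session-ID query parameters and a following '&'.
        RULES.add(new Rule("(?i)(?<=[?&])(?:phpsessid|sid|sessionid)=[^&#]*&?", ""));
        // Remove a '?' or '&' left dangling at the end of the query.
        RULES.add(new Rule("[?&]+(?=#|$)", ""));
    }

    // Applies every rule in order and returns the normalized URL.
    public static String normalize(String url) {
        String result = url;
        for (Rule rule : RULES) {
            result = rule.pattern.matcher(result).replaceAll(rule.substitution);
        }
        return result;
    }

    // Checker: reads one URL per line from stdin and prints the result.
    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String line;
        while ((line = in.readLine()) != null) {
            System.out.println(normalize(line.trim()));
        }
    }
}
```

Piping `http://example.com/?x=1&sid=42` into this checker prints `http://example.com/?x=1`. Keeping the rules in an ordered list means a later cleanup rule can tidy up what an earlier rule leaves behind, which is why the dangling-separator rule comes last.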

Subversion Commits
Repository: ASF
Revision: #740318
Date: Tue Feb 03 15:12:48 UTC 2009
User: ab
Message: NUTCH-279 Additions to urlnormalizer-regex (modified).
Files Changed
ADD /lucene/nutch/trunk/src/java/org/apache/nutch/net/URLNormalizerChecker.java
MODIFY /lucene/nutch/trunk/CHANGES.txt