Key: NUTCH-338
Type: Improvement
Status: Closed
Resolution: Fixed
Priority: Trivial
Assignee: Chris A. Mattmann
Reporter: Chris A. Mattmann
Votes: 0
Watchers: 0

Nutch

Remove the text parser as an option for parsing PDF files in parse-plugins.xml

Created: 03/Aug/06 03:32 PM   Updated: 24/Sep/06 03:30 PM
Component/s: fetcher
Affects Version/s: 0.8
Fix Version/s: 0.8.1, 0.9.0

Time Tracking:
Not Specified

File Attachments:
NUTCH-338.Mattmann.patch.txt (text file, 0.4 kB), attached 2006-08-03 03:34 PM by Chris A. Mattmann. Licensed for inclusion in ASF works.
Environment: MacBook Pro, dual-core Intel 2.1 GHz, although the improvement is independent of environment
Resolution Date: 18/Aug/06 03:11 PM


Description
After some discussion on the mailing list, it was decided that parse-text should not be an option for parsing PDF content. This issue therefore includes a trivial patch that removes the parse-text plugin from the PDF content mapping in parse-plugins.xml.
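
The attached patch is not reproduced on this page, so the following is only a sketch of what the change likely amounts to. It assumes the PDF entry used the usual mimeType/plugin layout of parse-plugins.xml, with `application/pdf` as the MIME type and `parse-pdf` as the id of the PDF parser plugin; the exact contents of the original entry are an assumption.

```xml
<!-- Before (assumed shape of the entry; the actual patch is not shown in this issue) -->
<mimeType name="application/pdf">
  <plugin id="parse-pdf" />
  <plugin id="parse-text" />
</mimeType>

<!-- After: only the PDF parser remains mapped to PDF content -->
<mimeType name="application/pdf">
  <plugin id="parse-pdf" />
</mimeType>
```

With an entry like the second one, PDF content would no longer be handed to the plain-text parser when the PDF parser is unavailable or fails.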

Subversion Commits
Repository Revision Date User Message
ASF #432615 Fri Aug 18 15:12:12 UTC 2006 siren NUTCH-338 - Remove the text parser as an option for parsing PDF files in parse-plugins.xml (Chris A. Mattmann)
Files Changed
MODIFY /lucene/nutch/trunk/conf/parse-plugins.xml
MODIFY /lucene/nutch/trunk/CHANGES.txt

Repository Revision Date User Message
ASF #432794 Sat Aug 19 04:37:29 UTC 2006 siren NUTCH-338 - Remove the text parser as an option for parsing PDF files in parse-plugins.xml (Chris A. Mattmann)
Files Changed
MODIFY /lucene/nutch/branches/branch-0.8/conf/parse-plugins.xml
MODIFY /lucene/nutch/branches/branch-0.8/CHANGES.txt