Chris A. Mattmann made changes - 29/Sep/07 04:36 AM
Chris A. Mattmann made changes - 29/Sep/07 04:30 PM
Chris A. Mattmann added a comment - 29/Sep/07 04:30 PM
This allows the XML parser to be called correctly, because the appropriate MIME type is now detected.
Initial patch for comments:
1. This patch removes the MimeType system, along with its associated Java source files, config files, and unit tests, from Nutch. That functionality now lives in Tika, and the Nutch code is replaced by its Tika counterparts.
If someone gets a chance, please run a small crawl with this patch in the next few days and let us know how it works. Otherwise, I'll do the same myself in a couple of days and, if there are no objections, I'd like to commit it then.
Chris A. Mattmann made changes - 07/Oct/07 03:32 PM
Tika 0.1 unreleased jar file – drop this in $NUTCH_SRC_HOME/lib
Chris A. Mattmann made changes - 07/Oct/07 03:33 PM
Chris A. Mattmann made changes - 09/Oct/07 12:24 AM
Integrated in Nutch-Nightly #231 (See http://lucene.zones.apache.org:8080/hudson/job/Nutch-Nightly/231/)