Description
I recently came across this rather stagnant codebase[0], which is licensed under ASL v2.0 and appears to have been used successfully to parse sitemaps, as per the discussion here[1].
[0] http://sourceforge.net/projects/sitemap-parser/
[1] http://lucene.472066.n3.nabble.com/Support-for-Sitemap-Protocol-and-Canonical-URLs-td630060.html
Attachments
Issue Links
- is related to
  - NUTCH-1622 Create Outlinks with metadata (Closed)
  - NUTCH-1741 Support of Sitemaps in Nutch 2.x (Closed)
- links to