Description
The HtmlParser and LinkExtractor do not honor the base element in HTML, so relative links on pages that declare one can resolve to the wrong URLs. This makes some sites impossible to crawl. Both LinkExtractor and HtmlParser should accept an element/attribute pair that tells them where to look for a base URI.
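To illustrate the requested behavior, here is a minimal sketch in plain Java. It is not the Droids implementation: the class BaseUriSketch and the methods findBase and resolveLink are hypothetical names, and the regex lookup is a simplification that assumes a quoted attribute value (a real parser would take the value from its element callbacks). The point is only that a configurable element/attribute pair supplies the base URI, and links are resolved against it instead of the page URL.

```java
import java.net.URI;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; none of these names come from Droids.
public class BaseUriSketch {

    // Finds the first <element ... attribute="value"> in the markup and
    // returns the attribute value, if any. Assumes the value is quoted.
    public static Optional<String> findBase(String html, String element, String attribute) {
        Pattern p = Pattern.compile(
                "<" + Pattern.quote(element) + "\\b[^>]*?\\b" + Pattern.quote(attribute)
                        + "\\s*=\\s*[\"']([^\"']*)[\"']",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(html);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // Resolves a link against the declared base URI when one is present,
    // falling back to the URI the page was fetched from.
    public static URI resolveLink(URI pageUri, String html,
                                  String element, String attribute, String link) {
        URI base = findBase(html, element, attribute)
                .map(href -> pageUri.resolve(href)) // the base itself may be relative
                .orElse(pageUri);
        return base.resolve(link);
    }

    public static void main(String[] args) {
        URI page = URI.create("http://example.org/index.html");
        String withBase = "<html><head><base href=\"http://example.org/docs/\"></head></html>";
        String withoutBase = "<html><head></head></html>";

        // prints http://example.org/docs/page.html
        System.out.println(resolveLink(page, withBase, "base", "href", "page.html"));
        // prints http://example.org/page.html
        System.out.println(resolveLink(page, withoutBase, "base", "href", "page.html"));
    }
}
```

Passing the element and attribute names as parameters, rather than hard-coding base and href, mirrors the request above that the pair be configurable.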
Attachments
Issue Links
- blocks: DROIDS-73 Doesn't honor base element (Closed)