Description
For Nutch we need to extract URL's but need the rel attribute to check for the nofollow value. I've patched the code to return this information in the Link object. It's been tested and i can read the rel in Nutch now.
Thoughts?
Attachments
Attachments
Issue Links
- is depended upon by
-
NUTCH-1233 Rely on Tika for outlink extraction
- Closed