Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- 2.3, 1.11
- None
- None
Description
Parsechecker and indexchecker fail to fetch valid URLs containing percent-encoded characters. The percent-encoding is broken by escaping % again:
% bin/nutch parsechecker 'https://de.wikipedia.org/wiki/%C3%84sop'
fetching: https://de.wikipedia.org/wiki/%25C3%2584sop
Fetch failed with protocol status: gone(11), lastModified=0: https://de.wikipedia.org/wiki/%25C3%2584sop
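The symptom above (`%C3%84` turning into `%25C3%2584`) is what happens when an already percent-encoded string is escaped a second time. The following is a minimal sketch of that effect using only `java.net.URI`. It is not taken from the Nutch code base and is not claimed to be the actual cause of this bug; the class name `DoubleEncodingDemo` and its helper methods are hypothetical and exist only for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class, not part of Nutch.
public class DoubleEncodingDemo {

    // The multi-argument URI constructors always quote '%', so a path that
    // is already percent-encoded gets escaped a second time (% becomes %25).
    static String reencoded(String scheme, String host, String encodedPath)
            throws URISyntaxException {
        return new URI(scheme, host, encodedPath, null).toASCIIString();
    }

    // The single-argument constructor parses the string as given and keeps
    // existing percent escapes untouched.
    static String preserved(String url) throws URISyntaxException {
        return new URI(url).toASCIIString();
    }

    public static void main(String[] args) throws URISyntaxException {
        // Doubly escaped form, as seen in the "fetching:" line of the report.
        System.out.println(reencoded("https", "de.wikipedia.org", "/wiki/%C3%84sop"));
        // Original escapes preserved.
        System.out.println(preserved("https://de.wikipedia.org/wiki/%C3%84sop"));
    }
}
```

A check along these lines can help confirm whether a URL passed on the command line reaches the fetcher unchanged: the URL printed after `fetching:` should equal the input, with no `%25` introduced.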
Issue Links
- is part of NUTCH-2012 Merge parsechecker and indexchecker (Closed)