Description
When I crawl the pages below with the microformats-reltag plugin activated, my dump of the webdb contains a lot of garbage in the Rel-Tag metadata.
http://www.amazon.com/Degree-Antiperspirant-Deodorant-Extreme-Blast/dp/B001ET769Y
http://www.amazon.com/Cisco-WAP4410N-Wireless-N-Access-Point/dp/B001IYCMNA
metadata Rel-Tag : �^A^@^B^@^@^@^Pget_range_slices^@^@^@^B^O^@^@^L^@^@^A^W^K^@^A^@^@^@(com.amazon.www:http/review/RZJZBDJMTYN4Y^O^@^B^L^@^@^@^B^L^@^B^K^@^A^@^@^@^Bil^O^@^B^L^@^@^@^B^K^@^A^@^@^@Jhttp://www.amazon.com/Cisco-WAP4410N-Wireless-N-Access-Point/dp/B001IYCMNA^K^@^B^@^@^@(Horrible Device, Two Years of Experience ^@^C^@^D�]����^@^K^@^A^@^@^@Qhttp://www.amazon.com/Degree-Antiperspirant-Deodorant-Extreme-Blast/dp/B001ET769Y^K^@^B^@^@^@(Horrible Device, Two Years of Experience ^@^C^@^D�]����^@^@^@^L^@^B^K^@^A^@^@^@^Bmk^O^@^B^L^@^@^@^A^K^@^A^@^@^@^Ddist^K^@^B^@^@^@^A1
Issue Links
- relates to NUTCH-1420 Get rid of the dreaded � (Closed)