Nutch / NUTCH-124

protocol-httpclient does not follow redirects when fetching robots.txt


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.7.2, 0.8
    • Fix Version/s: 0.8
    • Component/s: fetcher
    • Labels: None

    Description

      If a request for a site's robots.txt returns a redirect, protocol-httpclient does not follow it. The robots.txt is therefore never fetched, and its rules are effectively ignored for that site. See http://www.webmasterworld.com/forum11/3008.htm.
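      The behavior the description asks for can be illustrated with a minimal sketch. This is not the fix that was committed to Nutch, and it does not use the Commons HttpClient API. The class `RobotsRedirectSketch`, the `Response` type, the `fetchRobots` method, and the `MAX_REDIRECTS` cap are all hypothetical names introduced for illustration. The HTTP call is passed in as a function so the redirect logic can be read and tested without a network.

```java
import java.net.URI;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch: not Nutch code and not the actual NUTCH-124 patch.
class RobotsRedirectSketch {

    /** Minimal stand-in for an HTTP response: status code, Location header, body. */
    public static final class Response {
        public final int status;
        public final String location; // null unless the response is a redirect
        public final String body;     // null unless the response carries content

        public Response(int status, String location, String body) {
            this.status = status;
            this.location = location;
            this.body = body;
        }
    }

    /** Arbitrary cap chosen for this sketch so redirect loops terminate. */
    public static final int MAX_REDIRECTS = 5;

    /**
     * Requests /robots.txt on the given site and follows redirects.
     * Returns the body of the first 200 response, or empty if the chain
     * ends in a non-redirect error, lacks a Location header, or exceeds
     * MAX_REDIRECTS.
     */
    public static Optional<String> fetchRobots(String siteBase,
                                               Function<String, Response> fetch) {
        URI current = URI.create(siteBase).resolve("/robots.txt");
        for (int hop = 0; hop <= MAX_REDIRECTS; hop++) {
            Response r = fetch.apply(current.toString());
            if (r == null) {
                return Optional.empty();
            }
            if (r.status == 200) {
                return Optional.ofNullable(r.body);
            }
            boolean isRedirect = r.status == 301 || r.status == 302
                    || r.status == 303 || r.status == 307 || r.status == 308;
            if (!isRedirect || r.location == null) {
                return Optional.empty();
            }
            // Location may be relative, so resolve it against the current URL.
            current = current.resolve(r.location);
        }
        return Optional.empty(); // too many redirects
    }
}
```

      In this sketch a caller would supply a function that performs one HTTP GET without automatic redirect handling. A site whose http://example.com/robots.txt answers 301 with a Location of http://www.example.com/robots.txt would then have its rules read from the redirect target instead of being treated as having no robots.txt.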

          People

            Assignee: Unassigned
            Reporter: Doug Cutting (cutting)
            Votes: 0
            Watchers: 0

            Dates

              Created:
              Updated:
              Resolved: