Uploaded image for project: 'Nutch'
  1. Nutch
  2. NUTCH-2398

Fetcher saving redirected robots.txt under redirect target URL

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Minor
    • Resolution: Fixed
    • 1.13
    • 1.14
    • fetcher
    • None
    • Patch Available

    Description

      NUTCH-2300 lets the Fetcher store optionally the robots.txt response (content and HTTP status). If the '.../robots.txt' is redirected, the redirected content is also stored but with the redirect source URL as key. It should use the redirect target URL instead. Otherwise one of the responses is overwritten in the segments map file.

      Attachments

        Activity

          People

            snagel Sebastian Nagel
            snagel Sebastian Nagel
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: