Nutch / NUTCH-2398

Fetcher saving redirected robots.txt under redirect target URL

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.13
    • Fix Version/s: 1.14
    • Component/s: fetcher
    • Labels:
      None
    • Patch Info:
      Patch Available

      Description

      NUTCH-2300 lets the Fetcher optionally store the robots.txt response (content and HTTP status). If the request for '.../robots.txt' is redirected, the content fetched from the redirect target is stored as well, but under the redirect source URL as key. It should be stored under the redirect target URL instead. Otherwise both responses share the same key and one of them is overwritten in the segment's map file.
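
      The overwrite can be illustrated with a minimal sketch. This is not Nutch code: the class name RobotsKeyDemo, the method store, the example URLs, and the use of a plain Java Map as a stand-in for the segment's map file are all assumptions made for illustration. It only shows why two responses written under the same URL key cannot both survive.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Nutch.
public class RobotsKeyDemo {

    // A plain Map stands in for the segment's map file: one value per URL key.
    static Map<String, String> store(boolean keyByRedirectTarget) {
        String source = "http://example.com/robots.txt";      // assumed redirect source
        String target = "http://www.example.com/robots.txt";  // assumed redirect target
        Map<String, String> segment = new LinkedHashMap<>();

        // First response: the redirect itself, stored under the source URL.
        segment.put(source, "301 -> " + target);

        // Second response: the content fetched after following the redirect.
        String key = keyByRedirectTarget ? target : source;
        segment.put(key, "200 User-agent: *");

        return segment;
    }

    public static void main(String[] args) {
        // Keyed by redirect source: the second put replaces the first.
        System.out.println(store(false).size()); // prints 1
        // Keyed by redirect target: both responses are kept.
        System.out.println(store(true).size());  // prints 2
    }
}
```

      With the source URL as key for both writes, only the last response remains. Keying the second write by the redirect target keeps the redirect response and the redirected content as separate entries.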

            People

            • Assignee:
              wastl-nagel Sebastian Nagel
              Reporter:
              wastl-nagel Sebastian Nagel