Nutch / NUTCH-2509

Inconsistent behavior in SitemapProcessor

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.14
    • Fix Version/s: 1.15
    • Component/s: sitemap
    • Labels: None
    • Patch Info: Patch Available

      Description

      There are two inconsistent behaviors in SitemapProcessor:

      1. The member variable maxRedir is meant to limit the number of redirects followed for a sitemap URL, and it is initialized from the config property sitemap.redir.max. It is never used, however: the relevant method defines a local variable with the same name, which shadows the member and is always set to 3.
      2. A sitemap URL that is reached through a redirect is filtered and normalized, but a sitemap URL that comes from a sitemapindex is not. This seems inconsistent, since in both cases the URL comes from an outside source.
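
      The two points above can be illustrated with a minimal Java sketch. It is not the Nutch source: the class SitemapSketch, its methods, the default of 3, and the normalize-then-filter order are assumptions made for illustration. Only the names maxRedir and sitemap.redir.max come from the report.

      ```java
      import java.util.Map;
      import java.util.Optional;
      import java.util.function.Predicate;
      import java.util.function.UnaryOperator;

      // Hypothetical stand-in for illustration; this is not the Nutch SitemapProcessor source.
      public class SitemapSketch {
          // Member initialized from configuration, analogous to maxRedir in the report.
          private final int maxRedir;
          private final UnaryOperator<String> normalizer;
          private final Predicate<String> filter;

          public SitemapSketch(Map<String, String> conf,
                               UnaryOperator<String> normalizer,
                               Predicate<String> filter) {
              // The fallback value of 3 is an assumption, not taken from the report.
              this.maxRedir = Integer.parseInt(conf.getOrDefault("sitemap.redir.max", "3"));
              this.normalizer = normalizer;
              this.filter = filter;
          }

          // Shape of the bug: the local variable shadows the member,
          // so the configured value is never read.
          public int redirectLimitShadowed() {
              int maxRedir = 3;
              return maxRedir;
          }

          // Shape of a fix: no local declaration, so the member is used.
          public int redirectLimitFromField() {
              return maxRedir;
          }

          // One entry point for every externally supplied sitemap URL, whether it
          // comes from a redirect or from a sitemapindex. Normalizing before
          // filtering is an assumption of this sketch.
          public Optional<String> filterNormalize(String url) {
              String normalized = normalizer.apply(url);
              return filter.test(normalized) ? Optional.of(normalized) : Optional.empty();
          }

          public static void main(String[] args) {
              SitemapSketch s = new SitemapSketch(
                      Map.of("sitemap.redir.max", "5"),
                      String::trim,
                      u -> u.startsWith("http"));
              System.out.println(s.redirectLimitShadowed());  // prints 3, config ignored
              System.out.println(s.redirectLimitFromField()); // prints 5, config honored
              System.out.println(s.filterNormalize(" https://example.com/sitemap.xml "));
          }
      }
      ```

      Routing both kinds of URL through a single helper such as filterNormalize keeps the treatment of outside input in one place, so the two code paths are less likely to drift apart again.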


              People

              • Assignee: Unassigned
              • Reporter: Yossi Tamari