Nutch / NUTCH-3041

Address confusing logging in o.a.n.net.URLExemptionFilters

Details

    • Type: Task
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.19, 1.20
    • Fix Version/s: 1.21
    • Component/s: net
    • Labels: None

    Description

      URLExemptionFilter implementations allow exemptions for external domain resources by overriding the db.ignore.external.links configuration setting. This is useful when the crawl is focused on a single domain but resources such as images are hosted on a CDN.
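
      To make the exemption idea concrete, here is a minimal, self-contained sketch. It does not use the Nutch API: ExemptionSketch, keepOutlink, IMAGE_FILTER and the example URLs are hypothetical names chosen for illustration, and the rule that every configured filter must agree before a link is exempted is an assumption of this sketch.

```java
import java.net.URI;
import java.util.List;
import java.util.function.BiPredicate;
import java.util.regex.Pattern;

public class ExemptionSketch {

  // Hypothetical exemption rule: external image URLs are allowed through.
  static final Pattern IMAGE_URL =
      Pattern.compile(".*\\.(?:png|jpe?g|gif)$", Pattern.CASE_INSENSITIVE);

  // Stand-in for a URLExemptionFilter: returns true when the link is exempt.
  static final BiPredicate<String, String> IMAGE_FILTER =
      (fromUrl, toUrl) -> IMAGE_URL.matcher(toUrl).matches();

  /** Decides whether an outlink is kept, mimicking db.ignore.external.links. */
  static boolean keepOutlink(String fromUrl, String toUrl,
      boolean ignoreExternalLinks,
      List<BiPredicate<String, String>> exemptionFilters) {
    String fromHost = URI.create(fromUrl).getHost();
    String toHost = URI.create(toUrl).getHost();
    boolean external = fromHost == null || !fromHost.equalsIgnoreCase(toHost);
    if (!ignoreExternalLinks || !external) {
      return true; // internal links, or external links are not being ignored
    }
    if (exemptionFilters.isEmpty()) {
      return false; // nothing configured, so no exemption is possible
    }
    // Assumption of this sketch: all filters must agree to exempt the link.
    return exemptionFilters.stream().allMatch(f -> f.test(fromUrl, toUrl));
  }

  public static void main(String[] args) {
    List<BiPredicate<String, String>> filters = List.of(IMAGE_FILTER);
    System.out.println(keepOutlink("https://example.org/page",
        "https://cdn.example.net/logo.png", true, filters));
    System.out.println(keepOutlink("https://example.org/page",
        "https://other.example.net/page.html", true, filters));
  }
}
```

      With the image filter in place, the CDN-hosted image is kept while the other external page is still dropped.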

      Currently URLExemptionFilters provides the following logging:

      INFO o.a.n.n.URLExemptionFilters LocalJobRunner Map Task Executor #0 Found 0 extensions at point:'org.apache.nutch.net.URLExemptionFilter'

      I find this confusing, because the message is logged even when no exemption filter is in use. It would be better to log only if a URLExemptionFilter implementation is actually configured to be used at runtime.

      I will provide a patch for this.
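
      The proposed behavior could look roughly like the sketch below. This is not the actual patch: ExemptionLoggingSketch and startupMessage are hypothetical names, and the message is returned as a string rather than written through the Nutch logger so that the example stays self-contained. Only the extension point ID and the message wording are taken from the log line quoted above.

```java
import java.util.Optional;

public class ExemptionLoggingSketch {

  // Extension point ID as it appears in the log line quoted in this issue.
  static final String X_POINT_ID = "org.apache.nutch.net.URLExemptionFilter";

  /**
   * Builds the startup message only when at least one filter is configured.
   * An empty result means nothing should be logged.
   */
  static Optional<String> startupMessage(int filterCount) {
    if (filterCount < 1) {
      return Optional.empty(); // stay silent instead of "Found 0 extensions"
    }
    return Optional.of(String.format(
        "Found %d extensions at point:'%s'", filterCount, X_POINT_ID));
  }

  public static void main(String[] args) {
    // No filters configured: no output at all.
    startupMessage(0).ifPresent(System.out::println);
    // One filter configured: the message is emitted.
    startupMessage(1).ifPresent(System.out::println);
  }
}
```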

            People

              Assignee: Lewis John McGibbney (lewismc)
              Reporter: Lewis John McGibbney (lewismc)
              Votes: 0
              Watchers: 3
