Nutch / NUTCH-2413

Parsing fetcher to respect property "parse.filter.urls"

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.13
    • Fix Version/s: 1.14
    • Component/s: fetcher, parser
    • Labels:
      None
    • Environment:

      Apache Nutch release 1.13.

      Description

      We want to:
      (1) Run fetching and parsing together ("fetcher.parse" set to "true").
      (2) Skip the URL filters during this phase.

      When parsing runs as a separate step, condition (2) can be met by setting "parse.filter.urls" to "false".
      However, "parse.filter.urls" is ignored when the fetch and parse phases run together, so the URL filters are applied regardless of its value.
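
      As an illustration, the two settings from the description could be combined in a site configuration override. This is a minimal sketch that assumes the overrides are placed in conf/nutch-site.xml (the file location is an assumption, not stated in this issue); only the two property names and values come from the description.

```xml
<configuration>
  <!-- Parse documents inside the fetcher instead of in a separate parse step -->
  <property>
    <name>fetcher.parse</name>
    <value>true</value>
  </property>
  <!-- Skip the URL filters during parsing; per this issue, release 1.13
       ignores this value when fetcher.parse is true -->
  <property>
    <name>parse.filter.urls</name>
    <value>false</value>
  </property>
</configuration>
```

      With a configuration like this, the expected behavior is that no URL filtering happens during the combined fetch and parse phase. The issue is marked as fixed for version 1.14.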

            People

            • Assignee:
              wastl-nagel Sebastian Nagel
              Reporter:
              maborec Marcos Bori
            • Votes:
              0
              Watchers:
              4