  1. Nutch
  2. NUTCH-1421

RegexURLNormalizer to only skip rules with invalid patterns

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: nutchgora, 1.6
    • Fix Version/s: 1.6, 2.2
    • Component/s: None
    • Labels: None
    • Patch Info: Patch Available

    Description

      If a regex-normalize.xml file contains even one rule with a syntactically invalid regular expression pattern, all rules are discarded and no normalization is done.

      Instead, RegexURLNormalizer should skip only the invalid rule, log a detailed error message for it, and still apply all other (valid) rules.
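
The requested behaviour can be illustrated with a minimal, standalone sketch. This is not the attached patch and not the actual RegexURLNormalizer code: the class `RuleLoaderSketch`, the `Rule` holder, and the methods `compileRules` and `normalize` are hypothetical names chosen for illustration, and the rules are passed in as plain string pairs rather than read from regex-normalize.xml. The only point shown is that each pattern is compiled inside its own try/catch, so one `PatternSyntaxException` drops one rule instead of the whole list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class RuleLoaderSketch {

  /** Hypothetical holder for one compiled rule: a pattern and its substitution. */
  static final class Rule {
    final Pattern pattern;
    final String substitution;

    Rule(Pattern pattern, String substitution) {
      this.pattern = pattern;
      this.substitution = substitution;
    }
  }

  /**
   * Compiles {pattern, substitution} pairs one at a time.
   * A pair whose pattern does not compile is reported and skipped;
   * all other pairs are kept in their original order.
   */
  static List<Rule> compileRules(String[][] rawRules) {
    List<Rule> rules = new ArrayList<>();
    for (String[] raw : rawRules) {
      try {
        rules.add(new Rule(Pattern.compile(raw[0]), raw[1]));
      } catch (PatternSyntaxException e) {
        // Detailed message: which pattern failed and why (assumed log format).
        System.err.println("Skipping rule with invalid pattern '" + raw[0]
            + "': " + e.getDescription() + " at index " + e.getIndex());
      }
    }
    return rules;
  }

  /** Applies every surviving rule in order to the given URL string. */
  static String normalize(String url, List<Rule> rules) {
    for (Rule rule : rules) {
      url = rule.pattern.matcher(url).replaceAll(rule.substitution);
    }
    return url;
  }

  public static void main(String[] args) {
    String[][] raw = {
        {"&amp;", "&"},    // valid: unescape an HTML-escaped ampersand
        {"(unclosed", ""}, // invalid: unclosed group, should be skipped
        {"#.*$", ""}       // valid: strip the fragment
    };
    List<Rule> rules = compileRules(raw);
    System.out.println(rules.size()); // 2 rules survive
    System.out.println(normalize("http://example.com/?a=1&amp;b=2#frag", rules));
  }
}
```

With the try/catch placed around the whole loop instead of around each `Pattern.compile` call, the first invalid pattern would abort loading and leave no rules at all, which is the behaviour this issue describes.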

      Attachments

        1. NUTCH-1421-1.patch
          1 kB
          Sebastian Nagel

          People

            Assignee: Unassigned
            Reporter: Sebastian Nagel (snagel)
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved: