ManifoldCF / CONNECTORS-1573

Web Crawler exclude from index matches too much?


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: ManifoldCF 2.10
    • Fix Version/s: None
    • Component/s: Web connector
    • Labels: None

      Description

      Hello, 

      I'm not sure whether this is a bug or my misinterpretation of the exclusion rules:

      I want to set up a rule so that it does NOT index a parent page, but does index all child pages of that parent:

      I'm setting up a rule: 

      Inclusions: 

      .*

       

      Exclusions:

      http://www.website.com/nl/

      (I've also tried: http://www.website.com/nl/(\s)* )

      No dice. Looking at the logs, I see that the pages are crawled but not indexed due to job restriction. Is my rule wrong? Or is this a small bug?

       

      Thanks for any advice!
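
A possible explanation for the behaviour described above, sketched in Java. It assumes (this is not confirmed anywhere in this issue) that the Web connector tests each exclusion expression as an unanchored "find anywhere in the URL" match rather than a whole-URL match. Under that assumption, both expressions tried above also match every child URL, while an expression anchored with ^ and $ matches only the parent page. The class name, the helper method and the child URL are hypothetical and only serve the illustration.

```java
import java.util.regex.Pattern;

public class ExclusionDemo {
    // Unanchored check: true if the expression occurs anywhere in the URL.
    // ASSUMPTION: this mimics how the connector evaluates exclusion rules.
    public static boolean matchesAnywhere(String regex, String url) {
        return Pattern.compile(regex).matcher(url).find();
    }

    public static void main(String[] args) {
        String parent = "http://www.website.com/nl/";
        String child = "http://www.website.com/nl/page.html"; // hypothetical child URL

        // The two expressions from the description.
        String loose = "http://www.website.com/nl/";
        String looseWithSpace = "http://www.website.com/nl/(\\s)*";
        // Anchored variant: ^ and $ pin the match to the whole URL,
        // and the dots are escaped so they match literal dots only.
        String anchored = "^http://www\\.website\\.com/nl/$";

        System.out.println(matchesAnywhere(loose, parent));          // true
        System.out.println(matchesAnywhere(loose, child));           // true: child excluded too
        System.out.println(matchesAnywhere(looseWithSpace, child));  // true: (\s)* may match nothing
        System.out.println(matchesAnywhere(anchored, parent));       // true
        System.out.println(matchesAnywhere(anchored, child));        // false: child stays indexable
    }
}
```

If the assumption holds, entering the anchored form as the exclusion would exclude only the parent page and leave the child pages eligible for indexing.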

       

       

       


            People

            • Assignee: Unassigned
            • Reporter: Kornelito Korneel Staelens
