  1. ManifoldCF
  2. CONNECTORS-1573

Web Crawler exclude from index matches too much?


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: ManifoldCF 2.10
    • Fix Version/s: None
    • Component/s: Web connector
    • Labels: None

    Description

      Hello, 

      I'm not sure whether this is a bug or my misinterpretation of the exclusion rules.

      I want to set up a rule so that the crawler does NOT index a parent page, but does index all child pages of that parent.

      I'm setting up the following rule:

      Inclusions: 

      .*

       

      Exclusions:

      http://www.website.com/nl/

      (I've tried also: http://www.website.com/nl/(\s)* )

      No dice. Looking at the logs, I see the pages are crawled, but not indexed due to job restriction. Is my rule wrong, or is this a small bug?

       

      Thanks for advice!
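
      A possible explanation, offered as a sketch rather than a confirmed diagnosis: if each exclusion is treated as a regular expression that may match anywhere in the URL (an assumption, not something stated in this report), then the rule http://www.website.com/nl/ also matches every child URL, and adding (\s)* changes nothing because it can match the empty string. The Java snippet below uses only java.util.regex to compare an unanchored rule with one anchored by ^ and $. The class name ExclusionRuleSketch, the helper isExcluded, and the child URL are hypothetical and are not part of ManifoldCF.

```java
import java.util.List;
import java.util.regex.Pattern;

public class ExclusionRuleSketch {
    // Hypothetical helper (not ManifoldCF code): returns true when the rule
    // is found anywhere in the URL, i.e. an unanchored substring-style match.
    static boolean isExcluded(String rule, String url) {
        return Pattern.compile(rule).matcher(url).find();
    }

    public static void main(String[] args) {
        String parent = "http://www.website.com/nl/";
        // Illustrative child URL, assumed for this example.
        String child = "http://www.website.com/nl/page.html";

        // The rule as written in the report: no anchors.
        String unanchored = "http://www.website.com/nl/";
        // Anchored variant: must match the whole URL, dots escaped.
        String anchored = "^http://www\\.website\\.com/nl/$";

        for (String rule : List.of(unanchored, anchored)) {
            System.out.println(rule
                + "  parent excluded=" + isExcluded(rule, parent)
                + "  child excluded=" + isExcluded(rule, child));
        }
    }
}
```

      Under that assumption, the unanchored rule excludes both the parent and the child, while the anchored rule excludes only the parent page.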

       

       

       


          People

            Assignee: Unassigned
            Reporter: Korneel Staelens (Kornelito)
            Votes: 0
            Watchers: 2
