  1. ManifoldCF
  2. CONNECTORS-104

Make it easier to limit a web crawl to a single site


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: ManifoldCF 0.1
    • Component/s: Web connector
    • Labels: None

    Description

      Unless the user carefully enters an explicit include regex, a web crawl can quickly get out of control and start crawling the entire web, when all the user may really want is to crawl a single web site or a portion of one. It would be preferable if the crawl could be limited to the seed web site(s), either by default or with a simple button.
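
The request amounts to deriving the include rules from the seed URLs instead of asking the user to write them by hand. The following is a minimal sketch of that idea in Python, not the ManifoldCF implementation; the issue does not say how the fix was made. The helper names `seed_include_patterns` and `is_in_scope` are hypothetical, and the sketch assumes that "same site" means "same host name, any port, http or https".

```python
import re
from urllib.parse import urlsplit


def seed_include_patterns(seeds):
    """Build one include regex per distinct seed host.

    Hypothetical helper: assumes a "site" is identified by its host name.
    """
    patterns = []
    seen = set()
    for seed in seeds:
        parts = urlsplit(seed.strip())
        # Skip anything that is not an absolute http(s) URL.
        if parts.scheme not in ("http", "https") or not parts.hostname:
            continue
        host = parts.hostname.lower()
        if host in seen:
            continue
        seen.add(host)
        # Anchor the host so "example.com.evil.org" does not match,
        # allow an optional port, then require a path, query, fragment, or end.
        patterns.append(
            r"^https?://" + re.escape(host) + r"(?::\d+)?(?:[/?#]|$)"
        )
    return patterns


def is_in_scope(url, patterns):
    """Return True if the URL matches any seed-derived include pattern."""
    return any(re.match(p, url, re.IGNORECASE) for p in patterns)


seeds = ["http://example.com/docs/", "https://Example.com/other"]
patterns = seed_include_patterns(seeds)
for candidate in ("http://example.com/docs/page.html", "http://other.org/"):
    print(candidate, is_in_scope(candidate, patterns))
```

Matching on the host alone keeps the crawl on the seed site but still covers the whole site. Restricting it to a portion of the site, as the description also mentions, would mean appending the seed's path prefix to each pattern.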


          People

            Assignee: Karl Wright (kwright@metacarta.com)
            Reporter: Jack Krupansky (jkrupan)
            Votes: 0
            Watchers: 1
