ManifoldCF / CONNECTORS-1392

Add option for Web connector to ignore robots instructions in meta tags

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: ManifoldCF 2.7
    • Component/s: Web connector
    • Labels: None

    Description

      The Web connector already provides an option to ignore robots.txt.

      This ticket adds a second option that lets the connector ignore robots instructions in <meta name="robots ... tags.

      Proposal (to be discussed)

      Add a new option list "Page level robots instructions" to the "Robots" Tab. List entries:

      1. Obey meta robots tags (the default)
      2. Don't look at meta robots tags
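
The two list entries above can be sketched as follows. This is only an illustration of the proposed behaviour, not the code from the attached patches: the class `MetaRobotsSketch`, the enum `MetaRobotsMode`, and the method `decide` are hypothetical names, and the regex-based tag scan is a simplification (a real connector would rely on its HTML parser).

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; none of these names come from ManifoldCF itself.
class MetaRobotsSketch {

    // Mirrors the two proposed list entries.
    enum MetaRobotsMode { OBEY, IGNORE }

    // Result of evaluating a page: may it be indexed, may its links be followed?
    static final class Decision {
        final boolean index;
        final boolean follow;

        Decision(boolean index, boolean follow) {
            this.index = index;
            this.follow = follow;
        }
    }

    // Simplified patterns; assumes quoted attribute values.
    private static final Pattern META_TAG =
        Pattern.compile("<meta\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    private static final Pattern NAME_ROBOTS =
        Pattern.compile("\\bname\\s*=\\s*[\"']robots[\"']", Pattern.CASE_INSENSITIVE);
    private static final Pattern CONTENT =
        Pattern.compile("\\bcontent\\s*=\\s*[\"']([^\"']*)[\"']", Pattern.CASE_INSENSITIVE);

    static Decision decide(String html, MetaRobotsMode mode) {
        // Option 2: do not look at meta robots tags at all.
        if (mode == MetaRobotsMode.IGNORE) {
            return new Decision(true, true);
        }
        // Option 1 (the default): obey noindex / nofollow / none.
        boolean index = true;
        boolean follow = true;
        Matcher tag = META_TAG.matcher(html);
        while (tag.find()) {
            String text = tag.group();
            if (!NAME_ROBOTS.matcher(text).find()) {
                continue; // some other meta tag
            }
            Matcher content = CONTENT.matcher(text);
            if (!content.find()) {
                continue; // robots meta tag without a content attribute
            }
            for (String directive : content.group(1).toLowerCase(Locale.ROOT).split(",")) {
                switch (directive.trim()) {
                    case "noindex":
                        index = false;
                        break;
                    case "nofollow":
                        follow = false;
                        break;
                    case "none": // shorthand for noindex, nofollow
                        index = false;
                        follow = false;
                        break;
                    default:
                        break; // other directives are not modelled here
                }
            }
        }
        return new Decision(index, follow);
    }
}
```

With `MetaRobotsMode.OBEY`, a page carrying `<meta name="robots" content="noindex, nofollow">` would be neither indexed nor used for link extraction; with `MetaRobotsMode.IGNORE`, the same page is treated like any other.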

      The end-user documentation needs to be updated.

      Google resources on robots instructions in HTML pages:
      [0] https://support.google.com/webmasters/answer/79812?hl=en&ctx=cb&src=cb&cbid=tnnsjq5jcodt&cbrank=4
      [1] https://support.google.com/webmasters/answer/96569?hl=en&ctx=cb&src=cb&cbid=-5rmggrfsp2rq&cbrank=3
      [2] https://developers.google.com/webmasters/control-crawl-index/docs/robots_meta_tag?csw=1

      Thread on the mailing list
      [3] https://www.mail-archive.com/user@manifoldcf.apache.org/msg03258.html

      Attachments

        1. CONNECTORS-1392.patch
          9 kB
          Markus Schuch
        2. CONNECTORS-1392-2.patch
          11 kB
          Markus Schuch
        3. CONNECTORS-1392-3.patch
          18 kB
          Markus Schuch
        4. web-configure-robots.PNG
          40 kB
          Markus Schuch

          People

            Assignee: Markus Schuch
            Reporter: Markus Schuch
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved: