  1. ManifoldCF
  2. CONNECTORS-917

SharePoint connector would benefit from site discovery

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: ManifoldCF 1.7
    • Fix Version/s: ManifoldCF next
    • Component/s: SharePoint connector
    • Labels: None

    Description

      The current SharePoint connector can crawl only a single SharePoint site. But a SharePoint installation can host multiple sites, and in some cases there are hundreds of them. Setting up a connection and jobs for each one by hand would be laborious.

      The SharePoint admin site allows you to discover the sites that exist. Using this feature as part of the crawl would allow for a much more automated way of handling large SharePoint installations.

      Some notes:

      • It is not yet clear how "one site" and "many sites" modes should coexist in one connector
      • The form of the document identifier must change
      • Each document identifier must include the site path first
      • Since the subsite path can be just "/", the new form also needs to be resilient against that
      • Something like: <site_path>//<current_subsite_doc_list_item_etc_path>. But "//" will collide with old-style identifiers.
      • If an old-style document identifier must always start with a "/", then we can simply start the new form with (say) a "+" to signal that it is a new-style identifier
      • It is not yet clear whether there is a new form that would let us tell if a document identifier is old-form or not
      • The native authority currently also needs to know which site it is working with
      • Site discovery must therefore also be run in the authority, and tokens for each discovered site must be returned
      • Native tokens must therefore be qualified with a site ID
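
      The identifier and token notes above can be sketched roughly as follows. This is only an illustration, not connector code: the class name, the method names, and the ":" token separator are all hypothetical. It assumes that old-style identifiers always start with "/" and that a site path never contains "//"; neither assumption is confirmed above.

```java
// Hypothetical sketch; none of these names exist in ManifoldCF.
final class SiteQualifiedIds {
    static final char NEW_STYLE_MARKER = '+';   // old-style ids are assumed to start with "/"
    static final String SEPARATOR = "//";

    // Drop the trailing "/" so the root site "/" becomes "" and the
    // separator is always the first "//" in the encoded identifier.
    static String normalizeSitePath(String sitePath) {
        if (sitePath.contains(SEPARATOR)) {
            throw new IllegalArgumentException("site path must not contain //: " + sitePath);
        }
        int end = sitePath.length();
        while (end > 0 && sitePath.charAt(end - 1) == '/') {
            end--;
        }
        return sitePath.substring(0, end);
    }

    // Build "+<site_path>//<old_style_identifier>".
    static String encode(String sitePath, String oldStyleId) {
        if (!oldStyleId.startsWith("/")) {
            throw new IllegalArgumentException("old-style id must start with /: " + oldStyleId);
        }
        return NEW_STYLE_MARKER + normalizeSitePath(sitePath) + SEPARATOR + oldStyleId;
    }

    static boolean isNewStyle(String documentId) {
        return !documentId.isEmpty() && documentId.charAt(0) == NEW_STYLE_MARKER;
    }

    // Returns { sitePath, oldStyleId } for a new-style identifier.
    static String[] decode(String documentId) {
        if (!isNewStyle(documentId)) {
            throw new IllegalArgumentException("not a new-style id: " + documentId);
        }
        String body = documentId.substring(1);
        int idx = body.indexOf(SEPARATOR);
        if (idx < 0) {
            throw new IllegalArgumentException("missing separator: " + documentId);
        }
        String site = body.substring(0, idx);
        if (site.isEmpty()) {
            site = "/";   // root site was normalized to the empty string
        }
        return new String[] { site, body.substring(idx + SEPARATOR.length()) };
    }

    // Qualify a native authority token with the id of the site it came from.
    // The "<siteId>:<token>" layout is an assumption made for illustration.
    static String qualifyToken(String siteId, String nativeToken) {
        return siteId + ":" + nativeToken;
    }
}
```

      Because the normalized site path never ends in "/" and never contains "//", the first "//" after the "+" is always the separator, even when the subsite path is just "/" or contains "//" itself. Existing old-style identifiers pass through untouched, since isNewStyle returns false for anything starting with "/".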

          People

            Assignee: Karl Wright (kwright@metacarta.com)
            Reporter: Karl Wright (kwright@metacarta.com)
            Votes: 0
            Watchers: 2