The current SharePoint connector can crawl only a single SharePoint site. But SharePoint can support multiple sites; indeed, in some cases there are hundreds of them. Setting up a connection and jobs for each one would be tedious and error-prone.
The SharePoint admin site allows you to discover the sites that exist. Using this discovery feature as part of the crawl would make handling large SharePoint installations far more automated.
- Not yet clear how "one site" vs. "many sites" should coexist in one connector
- Form of document identifier must change
- Each document identifier must include the site path first
- Since a subsite path can be just "/", the format also needs to be resilient against that
- Something like: <site_path>//<current_subsite_doc_list_item_etc_path>. But "//" will collide with old-style.
- If an old-style document identifier must always start with a "/", then we can simply start the new one with (say) a "+", to signal that it is a new-style identifier
- Not clear yet if there's a new form that would allow us to know if a doc identifier was old form or not
- The native authority also currently needs to know which site it is working with
- Site discovery therefore must also be run in the authority, and tokens for each discovered site must be returned
- Native tokens must therefore be qualified with a site ID
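
To make the identifier idea above concrete, here is a rough sketch in Python (not the connector's own code). It assumes that old-style identifiers always start with "/", which is still unconfirmed, and that the site path never contains "//". All names here (encode_document_id, decode_document_id, qualify_token) and the ":" token separator are hypothetical, chosen only for illustration. A site path of "/" is stored as an empty string so that it cannot be confused with the "//" separator.

```python
from typing import NamedTuple, Optional

# Assumed marker: old-style identifiers are assumed to always start with "/",
# so a leading "+" can only mean a new-style identifier.
NEW_STYLE_PREFIX = "+"
SEPARATOR = "//"


class DocumentId(NamedTuple):
    site_path: Optional[str]  # None means an old-style (single-site) identifier
    item_path: str            # subsite/library/list/item path within the site


def _normalize_site_path(site_path: str) -> str:
    """Strip trailing slashes so the root site "/" becomes the empty string."""
    stripped = site_path.rstrip("/")
    if SEPARATOR in stripped:
        raise ValueError("site path must not contain '//': %r" % site_path)
    return stripped


def encode_document_id(site_path: str, item_path: str) -> str:
    """Build a new-style identifier: + <site_path> // <item_path>."""
    return NEW_STYLE_PREFIX + _normalize_site_path(site_path) + SEPARATOR + item_path


def decode_document_id(identifier: str) -> DocumentId:
    """Split an identifier back into site path and item path."""
    if not identifier.startswith(NEW_STYLE_PREFIX):
        # Old-style identifier: no site information is available.
        return DocumentId(None, identifier)
    body = identifier[len(NEW_STYLE_PREFIX):]
    # The normalized site path never ends in "/" and never contains "//",
    # so the first "//" in the body is always the separator.
    site, sep, item = body.partition(SEPARATOR)
    if not sep:
        raise ValueError("malformed new-style identifier: %r" % identifier)
    return DocumentId(site or "/", item)


def qualify_token(site_id: str, native_token: str) -> str:
    """Prefix a native access token with the site it came from (assumed format)."""
    return site_id + ":" + native_token
```

With this layout, a root site and a root subsite path still round-trip: encode_document_id("/", "/") and decode_document_id on the result give back ("/", "/"). Whether "+" is really safe depends on confirming that no existing old-style identifier can start with it.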