[CONNECTORS-1386] add user comments to data crawled from confluence - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: ManifoldCF 2.6
Fix Version/s: None
Component/s: Confluence connector
Labels: None

Description

The Confluence crawler skips comments. For a site that uses Confluence as a recorded collaboration platform, the comments are often where the text that needs to be searched lives.

I've found that adding `children.comment.body.view` to the `expand` query-string field returns one level of comments. Deeper levels can be added to the response the same way: `children.comment.children.comment.body.view` adds the second level, and the 3rd, 4th, and 5th levels follow the same pattern, with the 5th being `children.comment.children.comment.children.comment.children.comment.children.comment.body.view`.
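A minimal sketch of building that `expand` value for a chosen depth. The function name `comment_expand` is hypothetical, and the sketch assumes that each level has to be listed separately and that `expand` accepts a comma-separated list; it is not taken from the connector's code.

```python
from urllib.parse import urlencode


def comment_expand(levels: int) -> str:
    """Build an `expand` value requesting `levels` nested levels of comments.

    Hypothetical helper: assumes every level must be listed explicitly,
    separated by commas.
    """
    if levels < 1:
        raise ValueError("levels must be at least 1")
    parts = []
    for depth in range(1, levels + 1):
        # Each extra level adds one more "children.comment" hop
        # in front of "body.view".
        prefix = ".".join(["children.comment"] * depth)
        parts.append(prefix + ".body.view")
    return ",".join(parts)


# Query string for five levels of comments (commas are percent-encoded).
query = urlencode({"expand": comment_expand(5)})
```

Keeping the depth as a parameter would let the 5-level limit be tuned per site rather than hard-coded.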

I realize this doesn't get 100% of the comments, but 5 levels of nesting seems like a reasonable chunk to capture.
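However many levels come back, the comment text could be gathered with a small recursive walk. This is only a sketch: the function name `collect_comment_bodies` is hypothetical, and the response shape (`children.comment.results[]`, each with `body.view.value`) is an assumption that should be checked against an actual response.

```python
from typing import Any, Dict, List


def collect_comment_bodies(content: Dict[str, Any]) -> List[str]:
    """Return rendered comment bodies nested under `content`, depth-first.

    Assumed shape: content["children"]["comment"]["results"] is a list of
    comments, each carrying its text at comment["body"]["view"]["value"].
    """
    bodies: List[str] = []
    comments = content.get("children", {}).get("comment", {}).get("results", [])
    for comment in comments:
        value = comment.get("body", {}).get("view", {}).get("value")
        if value is not None:
            bodies.append(value)
        # Replies are assumed to sit under the comment's own children.comment;
        # levels that were not expanded are simply absent and end the recursion.
        bodies.extend(collect_comment_bodies(comment))
    return bodies
```

The collected strings could then be appended to the page text before indexing, so comments become searchable together with the page.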

An alternative would be to crawl comments separately and set the page-type to 'comment' rather than 'page'. While this also has value, I think fetching the comments along with the page requests offers the biggest bang for the buck.

People

Assignee: Rafa Haro

Reporter: Andrew Shumway

Votes: 0

Watchers: 4

Dates

Created: 24/Feb/17 16:21

Updated: 24/Feb/17 17:04