Key: JCR-257
Type: Improvement
Status: Closed
Resolution: Fixed
Priority: Minor
Assignee: Jukka Zitting
Reporter: Marcel Reutegger
Votes: 0
Watchers: 0

Jackrabbit Content Repository

Use separate index for jcr:system tree

Created: 19/Oct/05 12:42 AM   Updated: 13/Oct/06 04:07 PM
Component/s: None
Affects Version/s: 0.9
Fix Version/s: 1.0

Time Tracking:
Not Specified

Issue Links:
Cloners
 

Resolution Date: 24/Mar/06 11:22 PM


Description
Currently each workspace index also contains index data for repository-wide content (e.g. version nodes under jcr:system). This approach has several drawbacks:

- indexing is duplicated and does not scale when using a lot of workspaces
- workspaces cannot be 'put to sleep' when they are not actively used.

The repository should have an additional index for system data, that is, the versioning data and the node type representation in content: essentially everything under /jcr:system.

Queries issued on a workspace will then use two indexes to execute the query: the workspace index and the system index.

Comments
Marcel Reutegger added a comment - 20/Dec/05 08:13 PM
Separated the index as proposed. There is now one repository-wide system index that contains the /jcr:system tree. In addition to this 'global' index there are still the workspace indexes as before. Queries are executed on both indexes and return results from both.

Separating the indexes also makes it possible to disable indexing of versions: simply do not configure a system index at the repository level.
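
The two-index query behaviour described in this comment can be pictured with a small sketch. This is not Jackrabbit's actual query code: `TwoIndexQuery`, `PathIndex`, and the substring-based `search` are hypothetical stand-ins, used only to show that a workspace query consults the workspace index plus an optional repository-wide system index, and that leaving the system index unconfigured simply makes versions unsearchable.

```java
import java.util.ArrayList;
import java.util.List;

public class TwoIndexQuery {

    /** Hypothetical stand-in for a search index: an in-memory list of node paths. */
    public static final class PathIndex {
        private final List<String> paths = new ArrayList<>();

        public PathIndex add(String path) {
            paths.add(path);
            return this;
        }

        /** Returns all indexed paths that contain the given term. */
        public List<String> search(String term) {
            List<String> hits = new ArrayList<>();
            for (String p : paths) {
                if (p.contains(term)) {
                    hits.add(p);
                }
            }
            return hits;
        }
    }

    /**
     * Runs the query against the workspace index and, if one is configured,
     * against the system index, then concatenates the hits.
     */
    public static List<String> query(PathIndex workspaceIndex, PathIndex systemIndex, String term) {
        List<String> hits = new ArrayList<>(workspaceIndex.search(term));
        if (systemIndex != null) { // null models "no system index configured"
            hits.addAll(systemIndex.search(term));
        }
        return hits;
    }
}
```

Because the system index is shared, every workspace would pass the same `systemIndex` instance, so the /jcr:system tree is indexed once rather than once per workspace.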

Important note: this causes a minor backward compatibility issue. Existing configurations do not have a system search index configured at the repository level and will therefore no longer index versions. That means queries will still return versions of nodes that were checked in before this code change, but not versions created by checkins after it. Apart from that, Jackrabbit will work just fine. If you need to search versions of nodes, follow the migration instructions below.

Migration instructions:
- add a SearchIndex element at the end of the repository configuration. See jackrabbit/src/main/config/repository.xml for an example
- delete index folders in all your workspace directories
- restart jackrabbit (will re-index workspaces and jcr:system tree)
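
The second migration step (deleting the index folders) could be scripted roughly as follows. This is a sketch under assumptions, not part of Jackrabbit: `IndexCleaner` is a hypothetical helper, and it assumes the layout `<workspaces dir>/<workspace name>/index`, which should be checked against the index path configured in your own repository.xml before running anything like this. Run it only while the repository is shut down.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Stream;

public class IndexCleaner {

    /**
     * Deletes the "index" folder of every workspace directory below workspacesDir
     * and returns the folders that were removed.
     * ASSUMPTION: each workspace keeps its index in <workspacesDir>/<name>/index.
     */
    public static List<Path> deleteWorkspaceIndexes(Path workspacesDir) throws IOException {
        List<Path> deleted = new ArrayList<>();
        try (DirectoryStream<Path> workspaces = Files.newDirectoryStream(workspacesDir)) {
            for (Path workspace : workspaces) {
                Path index = workspace.resolve("index");
                if (Files.isDirectory(index)) {
                    deleteRecursively(index);
                    deleted.add(index);
                }
            }
        }
        return deleted;
    }

    /** Removes a directory tree, deleting children before their parents. */
    private static void deleteRecursively(Path root) throws IOException {
        List<Path> paths = new ArrayList<>();
        try (Stream<Path> walk = Files.walk(root)) {
            walk.forEach(paths::add);
        }
        // Reverse order puts deeper paths first, so directories are empty when deleted.
        paths.sort(Comparator.reverseOrder());
        for (Path p : paths) {
            Files.delete(p);
        }
    }
}
```

On the next start the workspaces and the jcr:system tree are re-indexed, as noted in the last step above.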

svn revision: 357961

Marcel Reutegger added a comment - 20/Dec/05 08:21 PM
Also updated repository.xml files in contrib projects.

Przemo Pakulski added a comment - 17/Mar/06 09:03 PM
I checked r386604 and it looks like this does not work as expected.

Even if the 'repository wide system index that contains /jcr:system tree' is not configured, the following nodes are still indexed: versionLabels, versionStorage, versionHistory, version, frozenNode.

Additionally, this index data is duplicated across all workspaces, which leads to a huge index size and to performance and memory problems, especially when many workspaces are used.

Interestingly, if I remove all index folders and restart the repository, all indexes are rebuilt without the nodes mentioned above, and the indexes are then much smaller.

Marcel Reutegger added a comment - 21/Mar/06 12:46 AM
Paths of events that originate from the version storage are wrong and are therefore no longer filtered correctly. This problem was probably introduced when fixing JCR-141.

Test cases should be extended to check paths of version events.
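
To illustrate why a wrong event path defeats the filtering, here is a minimal sketch. It is an assumption about the general mechanism, not the actual Jackrabbit event or index code: `SystemPathFilter` and `targetIndex` are hypothetical names, and the example of a path that has lost its /jcr:system prefix is only one conceivable way a path could be "wrong".

```java
public class SystemPathFilter {

    /** Root of the repository-wide system tree. */
    public static final String SYSTEM_ROOT = "/jcr:system";

    /** True if the path is /jcr:system itself or lies below it. */
    public static boolean isSystemPath(String path) {
        return path.equals(SYSTEM_ROOT) || path.startsWith(SYSTEM_ROOT + "/");
    }

    /**
     * Decides which index should process an event for the given path.
     * Hypothetical routing: system paths go to the system index,
     * everything else goes to the workspace index.
     */
    public static String targetIndex(String eventPath) {
        return isSystemPath(eventPath) ? "system" : "workspace";
    }
}
```

With this kind of routing, an event reported as `/jcr:system/jcr:versionStorage/...` goes to the system index, whereas the same event reported with an incorrect path such as `/jcr:versionStorage/...` would be routed to the workspace index. That would match the symptom reported above, where version nodes end up in every workspace index.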

Marcel Reutegger added a comment - 23/Mar/06 05:48 PM
Committed a preliminary fix.

Most of the changes are actually located in the event system and not the search index itself. It would have probably made more sense to reopen JCR-141, but anyway...

I strongly encourage everyone to re-index any existing repository that was used with a Jackrabbit svn revision between 368026 and 388123 and has since been upgraded to a later revision! Indexing of versions was completely broken between those svn revisions!

I'm currently writing some additional test cases and will resolve this issue once I have committed them.

Marcel Reutegger added a comment - 24/Mar/06 11:22 PM
Fixed in revision: 388559

Also added some test cases that check observation and querying on version storage.

Jukka Zitting added a comment - 29/Mar/06 02:09 AM
Merged to 1.0 in revision 389544.