Details
-
Sub-task
-
Status: Closed
-
Major
-
Resolution: Fixed
-
None
Description
The current namespace handling code does a lot of repetitive work, which shows up in hotspots like XML imports and Sling's namespace mapping code.
Attachments
Attachments
- OAK-924-bench.patch
- 6 kB
- Alex Deparvu
- OAK-924-v0.patch
- 31 kB
- Alex Deparvu
- OAK-924-noinit.patch
- 6 kB
- Alex Deparvu
- OAK-924-nomaps.patch
- 28 kB
- Alex Deparvu
Issue Links
Activity
I ran the benchmark to compare the LoginNSTest test (login + ns map) between existing impls but also to compare with the LoginTest test (just login).
java -Druntime=180 -jar oak-run-*.jar benchmark LoginNSTest LoginTest Oak-Tar Oak-Segment Oak-Default Oak-Mongo Jackrabbit
LoginNSTest | min | 10% | 50% | 90% | max | N |
---|---|---|---|---|---|---|
Jackrabbit | 866 | 884 | 992 | 1169 | 2109 | 170 |
Oak-Default | 1017 | 1047 | 1122 | 1243 | 1528 | 159 |
Oak-Mongo | 1077 | 1195 | 1234 | 1336 | 1508 | 145 |
Oak-Segment | 1672 | 1901 | 2230 | 2368 | 2491 | 83 |
Oak-Tar | 1386 | 1420 | 1604 | 1690 | 2001 | 115 |
LoginTest | min | 10% | 50% | 90% | max | N |
---|---|---|---|---|---|---|
Jackrabbit | 836 | 841 | 852 | 949 | 1328 | 197 |
Oak-Default | 960 | 965 | 977 | 996 | 1493 | 183 |
Oak-Mongo | 971 | 976 | 991 | 1010 | 1090 | 181 |
Oak-Segment | 1365 | 1426 | 1970 | 2174 | 2308 | 97 |
Oak-Tar | 1388 | 1497 | 1532 | 1641 | 1784 | 117 |
Ooh, table! Nice.
In revision 1505898 I added a bit simpler NamespaceTest benchmark that highlights the issue I've seen at least with XML imports:
# NamespaceTest min 10% 50% 90% max N Jackrabbit 6 6 6 7 15 9426 Oak-Default 44 44 45 46 132 1328 Oak-Tar 100 102 102 103 216 584
Much of the observed slowness is caused by getting nodes with anonymous is slow. See OAK-1018. Running the test with the admin user gives about 250% speed up. Caching jcr:system/namespaces across sessions and using a BiMap instead of a plain Map in SessionNameSpaces results in another 250% speed up.
Had a quick chat with Jukka about the state of this issue.
Things I'd like to tackle next are:
- add a commit hook that builds a reverse URI->prefix mapping, this should optimize the access patterns
(this could also keep a list of all registered URIs and prefixes to optimize the #getPrefixes() and #getURIs() calls) - remove the SessionNamespaces#init call
- move the default ns (static) map to the initial content
In rev http://svn.apache.org/r1530299 I added another ns related benchmark (NamespaceRegistryTest).
It seems that there is quite a difference between JR and Oak on this one (90%) JR is at 4ms while Oak is at 400ms.
Just an idea: What about keeping (caching) the whole namespace data fully in memory, in two hash tables or so (one forward mapping, one reverse mapping)? Then, for every write to the namespace storage, update a write counter property at the root of the namespace storage. When reading from the namespace (JCR namespace read calls), check the current write counter, and if it changed, rebuild the cache (re-read the whole namespace data from storage, meaning from the nodes, to the in-memory cache).
This would only work well if there are few writes. But if there are few writes, it should speed up reads a lot, as all reads would only require one lookup (reading the write counter property).
Another limitation would be the size of the namespace data. But as far as I know, there isn't all that much data.
It wouldn't require commit hooks (well, a commit hook could be used to update the write counter). The storage wouldn't need any indexing (redundant reverse mappings).
What about keeping (caching) the whole namespace data fully in memory
The underlying nodes should in most cases be already fully cached in memory, so I'd rather avoid another level of caching (and the related extra complexity) unless it really is needed. The key bottleneck here are the unoptimized data structure and access patterns, not the raw lookup speed as such. For example current the init() method in ReadOnlyNamespaceRegistry has O(n^2) performance.
For comparison, with the node type access we had a similar performance problem caused by lots of unnecessary content accesses. By precompiling the frequently accessed bits into a more efficient content structure I was able to pretty much eliminate the type access overhead from most use cases.
attaching a rough version of the first bits.
I had a lot more trouble moving the default ns into the initial content than expected (hit some encoding issues).
there are some weird failures under oak-jcr, after a point all tests start failing with " OakName0001: Invalid namespace prefix: jcr", but running each one individually doesn't fail so I'm not yet sure what is going on.
included so far:
- removed the hardcoded ns map
- commit hook (the NamespaceValidator which now needs a rename) to update the reverse mapping and dedicated properties for prefixes and uris
todo
- oak-jcr tests don't pass
- remove the init method
feedback needed!
getting a bit ahead of myself: I was looking at the global usage of Namespaces.getNamespaceMap and it looks like there's a lot of calls that come from GlobalNameMapper and LocalNameMapper impls which could benefit from the reverse mapping introduced by the patch.
I would replace the map with a helper class that delegates the calls directly to the namespaces node state.
I've pushed the patch in with some changes, thanks Jukka for the (offline) feedback!
It doesn't look much better, but it's still some work to be done.
I see 2 issues
- acl eval makes the property reads slow (this is why the perf is not there yet)
- the #init keeps a cache of the initial mapping as maps. Unfortunately removing the maps makes the code almost 3.5 times slower.
attaching patch to remove the init call and rely only on repo state directly.
as mentioned before, this is horrible for performance, but maybe there's something I might have overlooked.
atm the code is only commented out (cleanup will follow), but it is useful to get some feedback.
attaching patch to remove the eternal namespaces map passed around in GlobalNameMapper and LocalNameMapper.
the unexpected change is the xml import, so I'd like to have some feedback o this before applying.
passes all the tests!
+1 Looks good.
Do we still need the getReadTree() overrides? I think it should be possible to just pass the Tree instance in the constructor and use it directly without having to look it up repeatedly on each access.
wow, good idea!
I initially ran into some issues trying to remove the NamespaceHelper from the ImportHandler. I ended up using it only for #startPrefixMapping. (ugly-ish I know).
This made me to believe initially that I can't keep a tree instance around because it would get stale somehow.
anyway, code is looking better, running the tests now to see if anything pops up.
ok, the code is in with rev http://svn.apache.org/r1531336
we're now left with one last patch, the #init method removal.
This still needed some extra work, see the changes in revision 1563143.
After those changes we're down to pretty good numbers:
# NamespaceTest C min 10% 50% 90% max N Jackrabbit 1 5 6 6 6 61 9904 Oak-Default 1 5 5 6 6 39 10682 Oak-Tar 1 13 13 13 14 83 4414
Resolving as fixed.
attaching a first take on a benchmark test for the login-registerNS bits that happen in sling on each login call.
I'm not yet sure it's obvious where the slowness comes from, at least it should help get this issue going.