Details

    • Sub-task
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • 0.16
    • core, jcr

    Description

      The current namespace handling code does a lot of repetitive work, which shows up in hotspots like XML imports and Sling's namespace mapping code.

      Attachments

        1. OAK-924-bench.patch
          6 kB
          Alex Deparvu
        2. OAK-924-v0.patch
          31 kB
          Alex Deparvu
        3. OAK-924-noinit.patch
          6 kB
          Alex Deparvu
        4. OAK-924-nomaps.patch
          28 kB
          Alex Deparvu

        Issue Links

          Activity

            stillalex Alex Deparvu added a comment -

            attaching a first take on a benchmark test for the login-registerNS bits that happen in sling on each login call.

            I'm not yet sure it's obvious where the slowness comes from, at least it should help get this issue going.

            stillalex Alex Deparvu added a comment - attaching a first take on a benchmark test for the login-registerNS bits that happen in sling on each login call. I'm not yet sure it's obvious where the slowness comes from, at least it should help get this issue going.
            stillalex Alex Deparvu added a comment -

            I ran the benchmark to compare the LoginNSTest test (login + ns map) between existing impls but also to compare with the LoginTest test (just login).

            java -Druntime=180 -jar oak-run-*.jar benchmark LoginNSTest LoginTest Oak-Tar Oak-Segment Oak-Default Oak-Mongo Jackrabbit
            
            LoginNSTest min 10% 50% 90% max N
            Jackrabbit 866 884 992 1169 2109 170
            Oak-Default 1017 1047 1122 1243 1528 159
            Oak-Mongo 1077 1195 1234 1336 1508 145
            Oak-Segment 1672 1901 2230 2368 2491 83
            Oak-Tar 1386 1420 1604 1690 2001 115
            LoginTest min 10% 50% 90% max N
            Jackrabbit 836 841 852 949 1328 197
            Oak-Default 960 965 977 996 1493 183
            Oak-Mongo 971 976 991 1010 1090 181
            Oak-Segment 1365 1426 1970 2174 2308 97
            Oak-Tar 1388 1497 1532 1641 1784 117
            stillalex Alex Deparvu added a comment - I ran the benchmark to compare the LoginNSTest test (login + ns map) between existing impls but also to compare with the LoginTest test (just login). java -Druntime=180 -jar oak-run-*.jar benchmark LoginNSTest LoginTest Oak-Tar Oak-Segment Oak-Default Oak-Mongo Jackrabbit LoginNSTest min 10% 50% 90% max N Jackrabbit 866 884 992 1169 2109 170 Oak-Default 1017 1047 1122 1243 1528 159 Oak-Mongo 1077 1195 1234 1336 1508 145 Oak-Segment 1672 1901 2230 2368 2491 83 Oak-Tar 1386 1420 1604 1690 2001 115 LoginTest min 10% 50% 90% max N Jackrabbit 836 841 852 949 1328 197 Oak-Default 960 965 977 996 1493 183 Oak-Mongo 971 976 991 1010 1090 181 Oak-Segment 1365 1426 1970 2174 2308 97 Oak-Tar 1388 1497 1532 1641 1784 117
            jukkaz Jukka Zitting added a comment -

            Ooh, table! Nice.

            In revision 1505898 I added a bit simpler NamespaceTest benchmark that highlights the issue I've seen at least with XML imports:

            # NamespaceTest                  min     10%     50%     90%     max       N
            Jackrabbit                         6       6       6       7      15    9426
            Oak-Default                       44      44      45      46     132    1328
            Oak-Tar                          100     102     102     103     216     584
            
            jukkaz Jukka Zitting added a comment - Ooh, table! Nice. In revision 1505898 I added a bit simpler NamespaceTest benchmark that highlights the issue I've seen at least with XML imports: # NamespaceTest min 10% 50% 90% max N Jackrabbit 6 6 6 7 15 9426 Oak-Default 44 44 45 46 132 1328 Oak-Tar 100 102 102 103 216 584
            mduerig Michael DĂĽrig added a comment - - edited

            Much of the observed slowness is caused by getting nodes with anonymous is slow. See OAK-1018. Running the test with the admin user gives about 250% speed up. Caching jcr:system/namespaces across sessions and using a BiMap instead of a plain Map in SessionNameSpaces results in another 250% speed up.

            mduerig Michael DĂĽrig added a comment - - edited Much of the observed slowness is caused by getting nodes with anonymous is slow. See OAK-1018 . Running the test with the admin user gives about 250% speed up. Caching jcr:system/namespaces across sessions and using a BiMap instead of a plain Map in SessionNameSpaces results in another 250% speed up.
            stillalex Alex Deparvu added a comment -

            Had a quick chat with Jukka about the state of this issue.

            Things I'd like to tackle next are:

            • add a commit hook that builds a reverse URI->prefix mapping, this should optimize the access patterns
              (this could also keep a list of all registered URIs and prefixes to optimize the #getPrefixes() and #getURIs() calls)
            • remove the SessionNamespaces#init call
            • move the default ns (static) map to the initial content
            stillalex Alex Deparvu added a comment - Had a quick chat with Jukka about the state of this issue. Things I'd like to tackle next are: add a commit hook that builds a reverse URI->prefix mapping, this should optimize the access patterns (this could also keep a list of all registered URIs and prefixes to optimize the #getPrefixes() and #getURIs() calls) remove the SessionNamespaces#init call move the default ns (static) map to the initial content
            stillalex Alex Deparvu added a comment -

            In rev http://svn.apache.org/r1530299 I added another ns related benchmark (NamespaceRegistryTest).
            It seems that there is quite a difference between JR and Oak on this one (90%) JR is at 4ms while Oak is at 400ms.

            stillalex Alex Deparvu added a comment - In rev http://svn.apache.org/r1530299 I added another ns related benchmark (NamespaceRegistryTest). It seems that there is quite a difference between JR and Oak on this one (90%) JR is at 4ms while Oak is at 400ms.
            thomasm Thomas Mueller added a comment - - edited

            Just an idea: What about keeping (caching) the whole namespace data fully in memory, in two hash tables or so (one forward mapping, one reverse mapping)? Then, for every write to the namespace storage, update a write counter property at the root of the namespace storage. When reading from the namespace (JCR namespace read calls), check the current write counter, and if it changed, rebuild the cache (re-read the whole namespace data from storage, meaning from the nodes, to the in-memory cache).

            This would only work well if there are few writes. But if there are few writes, it should speed up reads a lot, as all reads would only require one lookup (reading the write counter property).

            Another limitation would be the size of the namespace data. But as far as I know, there isn't all that much data.

            It wouldn't require commit hooks (well, a commit hook could be used to update the write counter). The storage wouldn't need any indexing (redundant reverse mappings).

            thomasm Thomas Mueller added a comment - - edited Just an idea: What about keeping (caching) the whole namespace data fully in memory, in two hash tables or so (one forward mapping, one reverse mapping)? Then, for every write to the namespace storage, update a write counter property at the root of the namespace storage. When reading from the namespace (JCR namespace read calls), check the current write counter, and if it changed, rebuild the cache (re-read the whole namespace data from storage, meaning from the nodes, to the in-memory cache). This would only work well if there are few writes. But if there are few writes, it should speed up reads a lot, as all reads would only require one lookup (reading the write counter property). Another limitation would be the size of the namespace data. But as far as I know, there isn't all that much data. It wouldn't require commit hooks (well, a commit hook could be used to update the write counter). The storage wouldn't need any indexing (redundant reverse mappings).
            jukkaz Jukka Zitting added a comment -

            What about keeping (caching) the whole namespace data fully in memory

            The underlying nodes should in most cases be already fully cached in memory, so I'd rather avoid another level of caching (and the related extra complexity) unless it really is needed. The key bottleneck here are the unoptimized data structure and access patterns, not the raw lookup speed as such. For example current the init() method in ReadOnlyNamespaceRegistry has O(n^2) performance.

            For comparison, with the node type access we had a similar performance problem caused by lots of unnecessary content accesses. By precompiling the frequently accessed bits into a more efficient content structure I was able to pretty much eliminate the type access overhead from most use cases.

            jukkaz Jukka Zitting added a comment - What about keeping (caching) the whole namespace data fully in memory The underlying nodes should in most cases be already fully cached in memory, so I'd rather avoid another level of caching (and the related extra complexity) unless it really is needed. The key bottleneck here are the unoptimized data structure and access patterns, not the raw lookup speed as such. For example current the init() method in ReadOnlyNamespaceRegistry has O(n^2) performance. For comparison, with the node type access we had a similar performance problem caused by lots of unnecessary content accesses. By precompiling the frequently accessed bits into a more efficient content structure I was able to pretty much eliminate the type access overhead from most use cases.
            stillalex Alex Deparvu added a comment -

            attaching a rough version of the first bits.
            I had a lot more trouble moving the default ns into the initial content than expected (hit some encoding issues).

            there are some weird failures under oak-jcr, after a point all tests start failing with " OakName0001: Invalid namespace prefix: jcr", but running each one individually doesn't fail so I'm not yet sure what is going on.

            included so far:

            • removed the hardcoded ns map
            • commit hook (the NamespaceValidator which now needs a rename) to update the reverse mapping and dedicated properties for prefixes and uris

            todo

            • oak-jcr tests don't pass
            • remove the init method

            feedback needed!

            stillalex Alex Deparvu added a comment - attaching a rough version of the first bits. I had a lot more trouble moving the default ns into the initial content than expected (hit some encoding issues). there are some weird failures under oak-jcr, after a point all tests start failing with " OakName0001: Invalid namespace prefix: jcr", but running each one individually doesn't fail so I'm not yet sure what is going on. included so far: removed the hardcoded ns map commit hook (the NamespaceValidator which now needs a rename) to update the reverse mapping and dedicated properties for prefixes and uris todo oak-jcr tests don't pass remove the init method feedback needed!
            stillalex Alex Deparvu added a comment -

            getting a bit ahead of myself: I was looking at the global usage of Namespaces.getNamespaceMap and it looks like there's a lot of calls that come from GlobalNameMapper and LocalNameMapper impls which could benefit from the reverse mapping introduced by the patch.
            I would replace the map with a helper class that delegates the calls directly to the namespaces node state.

            stillalex Alex Deparvu added a comment - getting a bit ahead of myself: I was looking at the global usage of Namespaces.getNamespaceMap and it looks like there's a lot of calls that come from GlobalNameMapper and LocalNameMapper impls which could benefit from the reverse mapping introduced by the patch. I would replace the map with a helper class that delegates the calls directly to the namespaces node state.
            stillalex Alex Deparvu added a comment -

            I've pushed the patch in with some changes, thanks Jukka for the (offline) feedback!

            It doesn't look much better, but it's still some work to be done.

            I see 2 issues

            • acl eval makes the property reads slow (this is why the perf is not there yet)
            • the #init keeps a cache of the initial mapping as maps. Unfortunately removing the maps makes the code almost 3.5 times slower.
            stillalex Alex Deparvu added a comment - I've pushed the patch in with some changes, thanks Jukka for the (offline) feedback! It doesn't look much better, but it's still some work to be done. I see 2 issues acl eval makes the property reads slow (this is why the perf is not there yet) the #init keeps a cache of the initial mapping as maps. Unfortunately removing the maps makes the code almost 3.5 times slower.
            stillalex Alex Deparvu added a comment -

            attaching patch to remove the init call and rely only on repo state directly.
            as mentioned before, this is horrible for performance, but maybe there's something I might have overlooked.

            atm the code is only commented out (cleanup will follow), but it is useful to get some feedback.

            stillalex Alex Deparvu added a comment - attaching patch to remove the init call and rely only on repo state directly. as mentioned before, this is horrible for performance, but maybe there's something I might have overlooked. atm the code is only commented out (cleanup will follow), but it is useful to get some feedback.
            stillalex Alex Deparvu added a comment -

            attaching patch to remove the eternal namespaces map passed around in GlobalNameMapper and LocalNameMapper.

            the unexpected change is the xml import, so I'd like to have some feedback o this before applying.

            passes all the tests!

            stillalex Alex Deparvu added a comment - attaching patch to remove the eternal namespaces map passed around in GlobalNameMapper and LocalNameMapper. the unexpected change is the xml import, so I'd like to have some feedback o this before applying. passes all the tests!
            jukkaz Jukka Zitting added a comment -

            +1 Looks good.

            Do we still need the getReadTree() overrides? I think it should be possible to just pass the Tree instance in the constructor and use it directly without having to look it up repeatedly on each access.

            jukkaz Jukka Zitting added a comment - +1 Looks good. Do we still need the getReadTree() overrides? I think it should be possible to just pass the Tree instance in the constructor and use it directly without having to look it up repeatedly on each access.
            stillalex Alex Deparvu added a comment -

            wow, good idea!

            I initially ran into some issues trying to remove the NamespaceHelper from the ImportHandler. I ended up using it only for #startPrefixMapping. (ugly-ish I know).
            This made me to believe initially that I can't keep a tree instance around because it would get stale somehow.

            anyway, code is looking better, running the tests now to see if anything pops up.

            stillalex Alex Deparvu added a comment - wow, good idea! I initially ran into some issues trying to remove the NamespaceHelper from the ImportHandler . I ended up using it only for #startPrefixMapping. (ugly-ish I know). This made me to believe initially that I can't keep a tree instance around because it would get stale somehow. anyway, code is looking better, running the tests now to see if anything pops up.
            stillalex Alex Deparvu added a comment -

            ok, the code is in with rev http://svn.apache.org/r1531336

            we're now left with one last patch, the #init method removal.

            stillalex Alex Deparvu added a comment - ok, the code is in with rev http://svn.apache.org/r1531336 we're now left with one last patch, the #init method removal.
            jukkaz Jukka Zitting added a comment -

            This still needed some extra work, see the changes in revision 1563143.

            After those changes we're down to pretty good numbers:

            # NamespaceTest                    C     min     10%     50%     90%     max       N
            Jackrabbit                         1       5       6       6       6      61    9904
            Oak-Default                        1       5       5       6       6      39   10682
            Oak-Tar                            1      13      13      13      14      83    4414
            

            Resolving as fixed.

            jukkaz Jukka Zitting added a comment - This still needed some extra work, see the changes in revision 1563143. After those changes we're down to pretty good numbers: # NamespaceTest C min 10% 50% 90% max N Jackrabbit 1 5 6 6 6 61 9904 Oak-Default 1 5 5 6 6 39 10682 Oak-Tar 1 13 13 13 14 83 4414 Resolving as fixed.
            stillalex Alex Deparvu added a comment -

            bulk close for the 0.16 release

            stillalex Alex Deparvu added a comment - bulk close for the 0.16 release

            People

              jukkaz Jukka Zitting
              jukkaz Jukka Zitting
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: