We have a lot of pages on our production author instance, and many of them use sling:vanityPath. Every time a page is replicated, a new version is created.
We have roughly 168,000 pages in our instance. Under the /content node, approximately 4,500 pages have vanity URLs. In the version storage, however, more than 120,000 entries have a sling:vanityPath defined.
On startup, the Apache Sling Resource Resolver queries for all existing sling:vanityPath entries. This query fails because of the large number of sling:vanityPath entries in the version storage.
The code does check for the version storage, but only while processing the query results. That check should instead be a filter inside the query itself, so that such a large number of result nodes is never fetched:
I propose adding a "not isdescendantnode" check to the query so that it returns no content from the version storage, which would let the query run within the default query limit of 100,000 nodes again.
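
As a rough illustration of the proposal, the filter could look something like the JCR-SQL2 statement built below. This is only a sketch under assumptions: the class VanityPathQuerySketch, its methods, and the selector name "page" are hypothetical and not taken from the Sling code base, and the actual query in the Resource Resolver may select different columns or use a different query language. The version storage is assumed to be at its standard JCR location, /jcr:system/jcr:versionStorage.

```java
/**
 * Hypothetical helper, not part of Apache Sling. It only builds a query
 * string, so it can be compiled and run without a repository.
 */
final class VanityPathQuerySketch {

    private VanityPathQuerySketch() {
    }

    /** Escapes a value for use inside a single-quoted JCR-SQL2 string literal. */
    static String escapeLiteral(String value) {
        return value.replace("'", "''");
    }

    /**
     * Builds a JCR-SQL2 query for all nodes that define sling:vanityPath,
     * excluding every descendant of excludedRoot. The exclusion is part of
     * the query itself, so excluded nodes never show up in the result set.
     */
    static String buildQuery(String excludedRoot) {
        return "SELECT [sling:vanityPath] FROM [nt:base] AS page"
                + " WHERE page.[sling:vanityPath] IS NOT NULL"
                + " AND NOT ISDESCENDANTNODE(page, '" + escapeLiteral(excludedRoot) + "')";
    }
}
```

Calling VanityPathQuerySketch.buildQuery("/jcr:system/jcr:versionStorage") yields a statement that could be passed to a JCR QueryManager with the JCR-SQL2 language. With a filter like this, the result set in our instance should shrink from 120,000+ nodes to roughly the 4,500 pages under /content.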