After a lengthy discussion here: http://www.nabble.com/NodeTypeRegistry.checkForReferencesInContent%28%29-tf2882955.html
(tried to move the thread to the developer forum, but got rejected for some reason), attached is a proposed implementation.
The patch takes a fairly simplistic "fail-fast" approach. checkForReferencesInContent() executes a query on each workspace, searching for nodes of the given type. If any are found, it doesn't try to fix the type, but rather throws a RepositoryException, forcing the caller to deal with its own data issues.
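To make the fail-fast idea concrete, here is a minimal sketch. It is not the attached patch: RepositoryException below is a local stand-in for javax.jcr.RepositoryException, and WorkspaceQuery is a hypothetical interface standing in for a per-workspace query (in real code this would presumably be a JCR query such as the XPath statement //element(*, my:type), but that is an assumption).

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins only; none of these are Jackrabbit classes.
class FailFastCheckSketch {

    /** Local stand-in for javax.jcr.RepositoryException. */
    static class RepositoryException extends Exception {
        RepositoryException(String message) {
            super(message);
        }
    }

    /** Hypothetical per-workspace query: returns the paths of nodes of a given type. */
    interface WorkspaceQuery {
        List<String> findNodesOfType(String nodeTypeName);
    }

    /** Queries every workspace and throws as soon as one node of the type is found. */
    static void checkForReferencesInContent(String nodeTypeName,
                                            Map<String, WorkspaceQuery> workspaces)
            throws RepositoryException {
        for (Map.Entry<String, WorkspaceQuery> entry : workspaces.entrySet()) {
            List<String> hits = entry.getValue().findNodesOfType(nodeTypeName);
            if (!hits.isEmpty()) {
                // Fail fast: no attempt to repair content, the caller must clean up.
                throw new RepositoryException("Node type " + nodeTypeName
                        + " is still in use in workspace " + entry.getKey()
                        + " at " + hits.get(0));
            }
        }
    }

    public static void main(String[] args) {
        Map<String, WorkspaceQuery> workspaces = Map.of(
                "default",
                type -> type.equals("my:type") ? List.of("/content/a") : List.<String>of());
        try {
            checkForReferencesInContent("my:type", workspaces);
            System.out.println("type is unused");
        } catch (RepositoryException e) {
            System.out.println("refused: " + e.getMessage());
        }
    }
}
```

The caller only learns that the type is still referenced and where the first hit is. Cleaning up the content is left to the caller.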
There are two major issues with this implementation (see thread):
1. Searching is done by utilizing a query manager, which in turn uses indexes that are outside the physical node storage. This could be a problem, but currently nothing else will scale.
2. Concurrency. There are four concurrency-related use cases that I can foresee at this point.
a. Not REALLY a concurrency issue, but it needs to be addressed. A node is added by a call to addNode(), the type is deleted, and the session is saved by a call to ItemImpl.save() (it looks like everything calls ItemImpl.save() at the end of the day).
In this case, I don't think much care needs to be taken. All that will happen is that a NoSuchNodeTypeException will be thrown, which, I think, is fine.
b. Node type is deleted, new node of type we just deleted is added, session is saved.
Same as above. Also not a real concurrency issue, but it has to be brought up. Let it throw.
c. Now here is the interesting case. While checkForReferencesInContent() is running, a node of the given type is being persisted by a call to save().
I saw a lot of discussion on the subject, so I'm pretty sure I'm missing something here, but it seems to me that since we know that no two instances of the RepositoryImpl class are supposed to operate on any given physical repository (see thread), all we really have to do is synchronize ItemImpl.save() and NodeTypeRegistry.checkForReferencesInContent() on the repository object. And that is what the attached patch tries to do.
Other than the concerns I just brought up, one more thing kind of bothers me. In order to check each workspace, I had to get a SystemSession. To do that, I had to expose the RepositoryImpl.getSystemSession() method as public. Do feel free to express your opinions on this.
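
For case c, the locking idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the patch: Repository is a hypothetical stand-in for the single RepositoryImpl instance, save() stands in for ItemImpl.save(), and unregister() stands in for checkForReferencesInContent() followed by the removal of the type. Holding the lock across both the check and the removal is my assumption.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-ins only; none of these are Jackrabbit classes.
class RepositoryLockSketch {

    /** Stand-in for the one RepositoryImpl instance; it doubles as the shared monitor. */
    static class Repository {
        final Set<String> registeredTypes = new HashSet<>();
        final Set<String> typesInContent = new HashSet<>();
    }

    /** Stand-in for ItemImpl.save(): persists one node of the given type. */
    static void save(Repository repo, String nodeTypeName) {
        synchronized (repo) {
            if (!repo.registeredTypes.contains(nodeTypeName)) {
                // Cases a and b: the type is already gone, so the save simply fails.
                throw new IllegalArgumentException("no such node type: " + nodeTypeName);
            }
            repo.typesInContent.add(nodeTypeName);
        }
    }

    /** Stand-in for the content check plus unregistration, done under the same lock. */
    static void unregister(Repository repo, String nodeTypeName) {
        synchronized (repo) {
            if (repo.typesInContent.contains(nodeTypeName)) {
                // Case c: a save got in first, so refuse instead of repairing content.
                throw new IllegalStateException("node type still in use: " + nodeTypeName);
            }
            repo.registeredTypes.remove(nodeTypeName);
        }
    }
}
```

Because both methods lock the same repository object, a save either completes before the check starts (and the check then refuses), or it starts after the type is gone (and the save fails as in cases a and b).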