I think there are two reasons why RAMAccessManager synchronizes on the conglomerate cache instance whenever it accesses it:
1) Because it manually faults in missing items in the cache, and it needs to ensure that no other thread faults the same item in between its calls to findCached() and create().
2) Because conglomCacheUpdateEntry() implements a create-or-replace operation, which is not provided by the CacheManager interface, and it needs to ensure that no other thread adds an item with the same key between findCached() and create().
As mentioned in an earlier comment, I think (1) should be solved by implementing CacheableConglomerate.setIdentity(), so that the cache manager takes care of faulting in the conglomerate.
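To illustrate the difference between the two approaches, here is a minimal sketch. It does not use Derby's classes: FaultInSketch, loadFromDisk(), manualCache and selfLoadingCache are all hypothetical stand-ins, and ConcurrentHashMap.computeIfAbsent() only plays the role of a cache manager that calls setIdentity() on a miss.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-ins for illustration only; not Derby's actual classes.
class FaultInSketch {
    // Counts how often the (pretend) on-disk conglomerate is read.
    static final AtomicInteger loads = new AtomicInteger();

    // Stand-in for reading a conglomerate from disk.
    static String loadFromDisk(long id) {
        loads.incrementAndGet();
        return "conglomerate-" + id;
    }

    // Approach A: the caller faults in the entry itself, so it has to hold
    // one lock across both the lookup and the insert.
    static final Map<Long, String> manualCache = new HashMap<>();

    static String findManually(long id) {
        synchronized (manualCache) {
            String c = manualCache.get(id);   // plays the role of findCached()
            if (c == null) {
                c = loadFromDisk(id);
                manualCache.put(id, c);       // plays the role of create()
            }
            return c;
        }
    }

    // Approach B: the cache faults in missing entries itself, so the caller
    // needs no lock of its own around the lookup.
    static final ConcurrentHashMap<Long, String> selfLoadingCache =
            new ConcurrentHashMap<>();

    static String findViaCache(long id) {
        return selfLoadingCache.computeIfAbsent(id, FaultInSketch::loadFromDisk);
    }
}
```

In approach B the caller never sees the window between "not found" and "inserted", which is the window the synchronization in reason (1) exists to protect.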
(2) might be solved by adding a create-or-replace operation to the CacheManager interface. However, I'm not sure it is needed. The conglomCacheUpdateEntry() method has only one caller, RAMTransaction.addColumnToConglomerate(). That method fetches a Conglomerate instance from the cache, modifies it, and reinserts it into the cache. The instance that's reinserted is the exact same instance that was fetched, so the call to conglomCacheUpdateEntry() doesn't really update the conglomerate cache; it just replaces an existing entry with itself.
It looks to me as if conglomCacheUpdateEntry() can be removed, and that will take care of (2).
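A small sketch of why the reinsert has no effect, using a plain HashMap instead of the real conglomerate cache. ReinsertSketch and Conglom are hypothetical names made up for this example:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; not Derby's actual classes.
class ReinsertSketch {
    // Minimal mutable object playing the role of a cached Conglomerate.
    static class Conglom {
        int columns;
        Conglom(int columns) { this.columns = columns; }
    }

    // Returns true if putting the modified instance back changed anything
    // observable in the cache.
    static boolean reinsertChangesCache() {
        Map<Long, Conglom> cache = new HashMap<>();
        cache.put(1L, new Conglom(3));

        Conglom c = cache.get(1L);   // fetch the instance from the cache
        c.columns++;                 // modify it in place (like adding a column)

        // The cache holds a reference to the same object, so the change is
        // already visible through the cache before any reinsert.
        int seenBeforeReinsert = cache.get(1L).columns;

        Conglom previous = cache.put(1L, c);   // replace the entry with itself

        return previous != c || cache.get(1L).columns != seenBeforeReinsert;
    }
}
```

Here reinsertChangesCache() returns false: the entry that gets replaced is the very object being inserted.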
I created an experimental patch, attached as experimental-v1.diff. It removes conglomCacheUpdateEntry() as suggested. It also makes CacheableConglomerate implement setIdentity() so that conglomCacheFind() doesn't need to fault in conglomerates manually.
The patch is not ready for commit, as it doesn't pass all regression tests. But it could be used for testing, if someone has a test environment where the deadlock can be reliably reproduced.
There was only one failure in the regression tests. store/xaOffline1.sql had a diff in one of the transaction table listings, where a transaction showed up in the ACTIVE state whereas IDLE was expected.
This probably happens because the transaction used in the CacheableConglomerate.setIdentity() method is not necessarily the same as the one previously used by RAMAccessManager.conglomCacheFind().
The current implementation of setIdentity() in the patch just fetches the first transaction it finds on the context stack. That seems to do the trick in most cases, but it doesn't know whether conglomCacheFind() was called with a top-level transaction or a nested transaction, as setIdentity() cannot access conglomCacheFind()'s parameters. Maybe this can be solved by pushing some other context type (with a reference to the correct tx) on the context stack before accessing the conglomerate cache, and letting setIdentity() check that instead?
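Roughly what I have in mind, as a sketch only. None of these names exist in Derby: ContextSketch, Tx, ConglomLookupContext and the ThreadLocal stack are hypothetical stand-ins for the real context machinery, and find() and setIdentity() only mimic the shape of conglomCacheFind() and CacheableConglomerate.setIdentity().

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-ins for illustration only; not Derby's actual classes.
class ContextSketch {
    // Stand-in for a transaction (top-level or nested).
    static class Tx {
        final String name;
        Tx(String name) { this.name = name; }
    }

    // Dedicated context type that carries the transaction the caller wants used.
    static class ConglomLookupContext {
        final Tx tx;
        ConglomLookupContext(Tx tx) { this.tx = tx; }
    }

    // Stand-in for the per-thread context stack; may hold other contexts too.
    static final ThreadLocal<Deque<Object>> STACK =
            ThreadLocal.withInitial(ArrayDeque::new);

    // Stand-in for conglomCacheFind(): push the context, access the cache, pop.
    static Tx find(Tx tx) {
        Deque<Object> stack = STACK.get();
        stack.push(new ConglomLookupContext(tx));
        try {
            return setIdentity();   // in reality invoked by the cache manager
        } finally {
            stack.pop();            // always remove the context we pushed
        }
    }

    // Stand-in for setIdentity(): it has no access to find()'s parameters,
    // so it looks for the nearest lookup context on the stack instead.
    static Tx setIdentity() {
        for (Object ctx : STACK.get()) {   // iterates from the top of the stack
            if (ctx instanceof ConglomLookupContext) {
                return ((ConglomLookupContext) ctx).tx;
            }
        }
        throw new IllegalStateException("no lookup context on the stack");
    }
}
```

With this shape, setIdentity() picks up whichever transaction the caller passed in, nested or top-level, instead of guessing from whatever transaction context happens to be first on the stack.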