The problem here is that the (re)registering defined in
is not encapsulated in a single transaction, and the transactions do not block other cluster nodes. To change the data of the PortletApplication it uses methods of the PersistenceBrokerPortletRegistry, which are wrapped in transactions for removing and creating a portlet application.
Since (re)registering removes data from and inserts data into the database without one enclosing transaction or a write lock, there may be conflicts. An example:
A = cluster node 1
B = cluster node 2
- A removes PA from DB
- B removes PA from DB again (with no effect)
- A inserts PA into DB
- B inserts PA into DB (fails with a duplicate key constraint violation)
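The interleaving above can be reproduced with a small in-memory stand-in for the registry tables (all names here are illustrative, not Jetspeed or OJB API):

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the race: two nodes run the same remove-then-insert
// sequence against shared state with no coordination between them.
public class DuplicateKeyRace {
    static final Set<String> db = new HashSet<>(); // stands in for the registry tables

    static void remove(String pa) {
        db.remove(pa); // removing an already-removed row has no effect (step 2)
    }

    static void insert(String pa) {
        if (!db.add(pa)) { // stands in for the unique key constraint
            throw new IllegalStateException("duplicate key: " + pa);
        }
    }

    public static void main(String[] args) {
        db.add("demo-pa");
        remove("demo-pa");     // A removes PA
        remove("demo-pa");     // B removes PA again (no effect)
        insert("demo-pa");     // A inserts PA
        try {
            insert("demo-pa"); // B inserts PA -> duplicate key violation
        } catch (IllegalStateException e) {
            System.out.println("node B failed: " + e.getMessage());
        }
    }
}
```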
What would be the options:
1.) Make sure only one cluster node can (re)deploy the portlet application at a time.
A first approach could be:
- delete and insert should only be executed, if not executed yet by another cluster node
- to synchronize add a kind of "monitor" to the database (e.g. new table with monitoring "flag" and optimistic locking)
- every cluster node checks the monitor
- if monitor not set, the cluster node sets it and executes the deletion/insert stuff
- if monitor set, the cluster node waits until monitor is "free" and only reloads the registry (with the already written Portlet Application by the other cluster node)
- if both cluster nodes try to update the monitor at the same time, optimistic locking raises an exception on one side; that node should then also wait and reload
- make sure the cluster node retries to (re)deploy the portlet application on exception (see 2.))
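The check-and-set semantics of the monitor could look like the sketch below. The AtomicBoolean is only an in-memory stand-in for the proposed monitor table; in the real database the claim would be an UPDATE guarded by a version column, and a failed compareAndSet corresponds to the optimistic lock exception the losing node would see. All names are hypothetical:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of the "monitor" idea: one flag that only one node can claim at a time.
public class DeployMonitor {
    private final AtomicBoolean busy = new AtomicBoolean(false);

    // Try to claim the monitor; false means another node already holds it
    // (the optimistic-locking failure case in the proposal above).
    public boolean tryAcquire() {
        return busy.compareAndSet(false, true);
    }

    public void release() {
        busy.set(false);
    }

    public static void main(String[] args) {
        DeployMonitor monitor = new DeployMonitor();
        if (monitor.tryAcquire()) {
            System.out.println("node A: monitor set, running delete/insert");
            // node B arrives while A holds the monitor:
            if (!monitor.tryAcquire()) {
                System.out.println("node B: monitor busy, waiting, then reloading registry");
            }
            monitor.release();
        }
    }
}
```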
2.) Catch the exception, roll back and keep on trying to (re)deploy the portlet.xml
I am not sure this is a good solution, because multiple transactions on multiple cluster nodes could still produce invalid data in the database tables, or deadlocks. (I am not a clustered-environment database expert.)
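Option 2.) would roughly amount to a bounded retry loop like the following sketch; deployOnce here is a hypothetical stand-in for the transactional remove/register call in the PersistenceBrokerPortletRegistry, not actual Jetspeed API:

```java
// Sketch of option 2: retry the (re)deploy a bounded number of times,
// assuming each failed attempt is rolled back by the registry transaction.
public class RetryingDeployer {
    interface DeployAttempt { void run() throws Exception; }

    // Returns the number of attempts it took; rethrows after maxRetries failures.
    static int deployWithRetry(DeployAttempt attempt, int maxRetries) throws Exception {
        for (int i = 1; i <= maxRetries; i++) {
            try {
                attempt.run();   // on failure the transaction has been rolled back
                return i;
            } catch (Exception e) {
                if (i == maxRetries) throw e;
                // back off briefly so the other cluster node can finish its transaction
                Thread.sleep(100L * i);
            }
        }
        throw new IllegalStateException("unreachable");
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        int attempts = deployWithRetry(() -> {
            if (++calls[0] < 3) throw new RuntimeException("duplicate key");
        }, 5);
        System.out.println("succeeded after " + attempts + " attempts");
    }
}
```

Note that this only papers over the race: if two nodes keep colliding, both keep retrying, which is why I doubt the robustness of this option.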
3.) Change the (re)deploy process:
- avoid deletion of the portlet application
- step through the object tree and insert/update only where necessary
- combine this with optimistic locking (requires data model change)
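The insert/update-only idea combined with a version column could look like this sketch; the in-memory "table" and all names are illustrative, and the version check stands in for the optimistic locking that would require the data model change:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of option 3: instead of delete + re-insert, write only what changed,
// guarded by a version counter per row (optimistic locking).
public class UpsertRegistry {
    static class Row {
        String value;
        int version;
        Row(String value, int version) { this.value = value; this.version = version; }
    }

    private final Map<String, Row> table = new HashMap<>();

    // Insert if missing; update only when the value differs and the caller
    // still holds the version it read. A stale version means another node
    // got there first, so the caller should reload instead of writing.
    public boolean upsert(String key, String value, int expectedVersion) {
        Row row = table.get(key);
        if (row == null) {
            table.put(key, new Row(value, 1));
            return true;
        }
        if (row.version != expectedVersion) {
            return false; // optimistic lock failure
        }
        if (!row.value.equals(value)) {
            row.value = value;
            row.version++;
        }
        return true;
    }

    public static void main(String[] args) {
        UpsertRegistry r = new UpsertRegistry();
        System.out.println(r.upsert("pa", "v1", 0)); // insert
        System.out.println(r.upsert("pa", "v2", 1)); // update with current version
        System.out.println(r.upsert("pa", "v3", 1)); // stale version -> rejected
    }
}
```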
4.) Another, more elegant solution that makes everything much easier (maybe at the OJB level?)
I would like to synchronize with the core developers before starting to implement a solution. What do you think?
The quickest solution for now, with the least impact on the data model and code base, would be 2.), but I am not sure it is really robust. Please comment.
To avoid problems in clustered environments in general, we may have to change some aspects of the database access via OJB, as stated in :