  1. Karaf
  2. KARAF-5562

Improve cellar groups configuration synchronisation from hazelcast

Details

    Description

      We encountered several issues caused by HazelcastGroupManager. I'm grouping them here because they are all linked and we fixed them in a single refactoring of the class. Overall, this results in better synchronization of the Cellar groups configuration.

      • Hazelcast network splits can badly corrupt the “groups” shared map. This map contains the list of groups and their members, and the system relies on it entirely to know which groups a node belongs to. If multiple nodes update the map while they are not connected to each other (easy to reproduce by starting both nodes at the same time) and then join afterwards, the default merge algorithm is applied and simply overwrites the full map. As a result, groups lose members, even though the configuration file claims that the nodes are still members.
      • When handling the groups configuration, HazelcastGroupManager replicates the felix.fileinstall.filename property, which contains the configuration file path, to each node. This is more or less fine on a cluster where every node is installed at the exact same path. However, with 2 nodes on the same machine at different paths, one node will at some point write to the config file of the other node and never update its own config, which can be quite confusing.
      • The HazelcastGroupManager can start before fileinstall has detected a configuration. It then creates a new config based on the Hazelcast shared config, which overrides the config file once fileinstall detects it. The impact is not huge, but it shuffles the properties file and makes it unreadable.
      • Updates from Hazelcast to the local config trigger an update back to Hazelcast, which then comes back to the local config and sometimes reverts the changes, resulting in no change in the config. Basically, when adding a group, a lot of properties are updated, and for each of them we trigger a configuration update. Each configuration update triggers an event which sends the whole config back to Hazelcast, including properties that are not updated yet, setting them back to their old values. All events (Hazelcast updates and OSGi config updates) are handled asynchronously, so depending on the order of events, some properties can be reverted or never added (the groups property is usually reverted after a group add).
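
      To make two of the points above more concrete, here is a minimal sketch in plain Java. It is not the actual refactoring of HazelcastGroupManager and it does not use the Hazelcast or OSGi APIs. The class name GroupConfigSync and both method names are hypothetical. It assumes that the groups map can be modelled as group name to set of node ids, and that a configuration can be modelled as a flat map of string properties. The first method merges two diverged views of the groups map by union instead of letting one overwrite the other. The second computes only the properties that really changed and never replicates the node-local felix.fileinstall.filename property.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, for illustration only (not part of Cellar or Hazelcast).
final class GroupConfigSync {

    // Property set by Felix FileInstall; its value is a path local to one node.
    static final String FILEINSTALL_FILENAME = "felix.fileinstall.filename";

    private GroupConfigSync() {
    }

    // Merges two views of the groups map (group name -> node ids) by taking
    // the union of the members of each group, so neither side is overwritten.
    static Map<String, Set<String>> mergeGroups(Map<String, Set<String>> first,
                                                Map<String, Set<String>> second) {
        Map<String, Set<String>> merged = new TreeMap<>();
        addAll(merged, first);
        addAll(merged, second);
        return merged;
    }

    private static void addAll(Map<String, Set<String>> target,
                               Map<String, Set<String>> source) {
        for (Map.Entry<String, Set<String>> entry : source.entrySet()) {
            target.computeIfAbsent(entry.getKey(), key -> new TreeSet<>())
                  .addAll(entry.getValue());
        }
    }

    // Returns the entries of 'incoming' whose value differs from 'local',
    // skipping the node-local file path. An empty result means there is
    // nothing to apply, so no further update event needs to be triggered.
    static Map<String, String> changedProperties(Map<String, String> local,
                                                 Map<String, String> incoming) {
        Map<String, String> changes = new TreeMap<>();
        for (Map.Entry<String, String> entry : incoming.entrySet()) {
            if (FILEINSTALL_FILENAME.equals(entry.getKey())) {
                continue; // never copy another node's config file path
            }
            if (!Objects.equals(local.get(entry.getKey()), entry.getValue())) {
                changes.put(entry.getKey(), entry.getValue());
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        // Two nodes registered themselves while they were not connected.
        Map<String, Set<String>> viewA = new HashMap<>();
        viewA.put("default", new TreeSet<>(Set.of("node1")));
        Map<String, Set<String>> viewB = new HashMap<>();
        viewB.put("default", new TreeSet<>(Set.of("node2")));
        System.out.println(mergeGroups(viewA, viewB));

        // Same groups value on both sides, different local file paths.
        Map<String, String> local = new HashMap<>();
        local.put("groups", "default");
        local.put(FILEINSTALL_FILENAME, "file:/opt/nodeA/etc/groups.cfg");
        Map<String, String> incoming = new HashMap<>();
        incoming.put("groups", "default");
        incoming.put(FILEINSTALL_FILENAME, "file:/opt/nodeB/etc/groups.cfg");
        System.out.println(changedProperties(local, incoming).isEmpty());
    }
}
```

      A union merge keeps every member seen on either side of the split, but it cannot tell a deliberate removal made during the split from a missing entry, so a real merge policy may need more information than this sketch uses. The paths in main are made-up example values.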

            People

              jbonofre Jean-Baptiste Onofré
              draier Thomas Draier