Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Discovery Impl 1.0.4
Description
In our setup, we see duplicated instances reported in the topology.
A duplicated instance is reported in two different clusters.
One of the two copies of the duplicated instance contains no properties (when accessed via the Discovery APIs).
This blocks us from relying on the properties announced by the instances.
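The symptom above (one Sling ID listed in two different clusters) can be checked mechanically. The following is a minimal sketch, not code from Sling: the class `DuplicateInstanceCheck` and its method are hypothetical, and it works on a plain map of cluster view ID to member Sling IDs so that it has no dependency on the Discovery API.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical helper, not part of Sling: finds Sling IDs listed in more than one cluster. */
final class DuplicateInstanceCheck {

    private DuplicateInstanceCheck() {}

    /**
     * @param clusters cluster view ID -> Sling IDs reported as members of that cluster
     * @return Sling ID -> cluster view IDs, only for IDs that appear in two or more clusters
     */
    static Map<String, Set<String>> findDuplicates(Map<String, List<String>> clusters) {
        // Invert the mapping: for each Sling ID, collect the clusters that list it.
        Map<String, Set<String>> seenIn = new TreeMap<>();
        for (Map.Entry<String, List<String>> cluster : clusters.entrySet()) {
            for (String slingId : cluster.getValue()) {
                seenIn.computeIfAbsent(slingId, k -> new TreeSet<>()).add(cluster.getKey());
            }
        }
        // Keep only the IDs that belong to more than one cluster.
        seenIn.values().removeIf(clusterIds -> clusterIds.size() < 2);
        return seenIn;
    }
}
```

In a running instance, the input map could presumably be built from `TopologyView.getClusterViews()`, using `ClusterView.getId()` as the key and `InstanceDescription.getSlingId()` of each member as the values; that wiring is an assumption and is not shown here.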
Our setup is composed of a set of CRX active/passive clusters, as in the diagram below:
              |-> ELB -> CRX active/passive cluster
Dispatcher -> |-> ELB -> CRX active/passive cluster
              |   . . .
              |-> ELB -> CRX active/passive cluster
The discovery service is configured to create a star topology, connecting all instances to a central instance.
All clusters run the same code which embeds org.apache.sling.discovery.impl 1.0.8
The issue may have been introduced in org.apache.sling.discovery.impl 1.0.4 since we did not experience it with previous releases.
In one occurrence of the issue, the duplicated instance identifier was: 10b323d0-b59e-4f87-8370-a15aab1bdc24
The server logs contain the trace [0].
We noticed that all clusters contained the structure [1], which seems to be the cause of the duplicate.
As a workaround, removing [1] from the repository of all instances removed the duplicated instance from the topology.
We checked that all instances in the topology have a unique Sling identifier (looking in sling.id.file).
We also checked that the structure [1] was not created by a mechanism external to the Sling discovery (e.g. content package or initial content).
[0] (IP, path and properties are edited)
21.05.2014 07:43:06.756 *INFO* [192.168.0.1 [1400658186712] POST /some/service.json HTTP/1.1] org.apache.sling.discovery.impl.topology.TopologyViewImpl addInstance: cannot add same instance twice: an InstanceDescription[slindId=10b323d0-b59e-4f87-8370-a15aab1bdc24, isLeader=false, isOwn=false, clusterViewId=e5df113c-03a8-48bb-9fee-63cf2a8a6ab3, properties={ ... }]
[1] /var/discovery/impl/clusterInstances/10b323d0-b59e-4f87-8370-a15aab1bdc24
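To find further occurrences in the server logs, the Sling ID and cluster view ID can be pulled out of lines shaped like trace [0]. This is a rough sketch that assumes the message format shown in [0]; the class `DuplicateLogParser` is hypothetical, and the field name `slindId` is spelled as it appears in the trace.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Sling: extracts IDs from "cannot add same instance twice" lines. */
final class DuplicateLogParser {

    // Assumes the message format of trace [0]; "slindId" is copied verbatim from that trace.
    private static final Pattern DUPLICATE = Pattern.compile(
            "cannot add same instance twice: an InstanceDescription\\[slindId=([0-9a-fA-F-]{36}),"
                    + ".*?clusterViewId=([0-9a-fA-F-]{36})");

    private DuplicateLogParser() {}

    /**
     * @return a two-element array {slingId, clusterViewId}, or empty if the line does not match
     */
    static Optional<String[]> parse(String logLine) {
        Matcher m = DUPLICATE.matcher(logLine);
        return m.find()
                ? Optional.of(new String[] {m.group(1), m.group(2)})
                : Optional.empty();
    }
}
```

Feeding each log line through `DuplicateLogParser.parse` and collecting the results gives the list of affected identifiers, each of which can then be compared against the node names under /var/discovery/impl/clusterInstances.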
Issue Links
- relates to SLING-4139 regression: stale topology announcements possible after crash/reconfig (Closed)