Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
Motivation
If the new logical topology contains both newly joined nodes and nodes that have left the cluster, DistributionZoneManager#scheduleTimers() schedules both saveDataNodesOnScaleUp and saveDataNodesOnScaleDown. These tasks are invoked asynchronously but use the same entry in topologyAugmentationMap: scale up puts an entry keyed by some revision, and then scale down puts an entry with the same revision as the key, overwriting the first one.
The issue is reproduced by DistributionZoneAwaitDataNodesTest#testSeveralScaleUpAndSeveralScaleDownThenScaleUpAndScaleDown.
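To illustrate the race, here is a minimal sketch, assuming topologyAugmentationMap is keyed by the metastorage revision; the surrounding class and method are hypothetical, only topologyAugmentationMap, Augmentation and NodeWithAttributes refer to the actual code:

    import java.util.Set;
    import java.util.concurrent.ConcurrentSkipListMap;

    class AugmentationRaceSketch {
        /** Assumed shape of the map: metastorage revision -> augmentation. */
        static final ConcurrentSkipListMap<Long, Augmentation> topologyAugmentationMap =
                new ConcurrentSkipListMap<>();

        /** Both timers fire for the same topology update and therefore the same revision. */
        static void onTopologyUpdate(long revision,
                                     Set<NodeWithAttributes> joined,
                                     Set<NodeWithAttributes> left) {
            // saveDataNodesOnScaleUp: records the joined nodes under the revision key.
            topologyAugmentationMap.put(revision, new Augmentation(joined, true));

            // saveDataNodesOnScaleDown: runs asynchronously, uses the same revision key
            // and overwrites the scale-up entry, so the joined nodes are lost.
            topologyAugmentationMap.put(revision, new Augmentation(left, false));
        }
    }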
Definition of Done
- The concurrency bug is fixed.
- The test is enabled.
UPD:
In general, the problem is reproducible only in a very rare case: when LogicalTopologyEventListener#onTopologyLeap is received and the new topology contains both added and removed nodes compared with the topology stored in the metastorage.
The solution is to change the representation of DistributionZoneManager.ZoneState#topologyAugmentationMap.
We currently have:

    private static class Augmentation {
        /** Names of the nodes. */
        Set<NodeWithAttributes> nodes;

        /** Flag that indicates whether {@code nodes} should be added or removed. */
        boolean addition;

        Augmentation(Set<NodeWithAttributes> nodes, boolean addition) {
            this.nodes = nodes;
            this.addition = addition;
        }
    }
I suggest storing the addition flag in NodeWithAttributes instead, so that a single revision in DistributionZoneManager.ZoneState#topologyAugmentationMap can hold both added and removed nodes, as sketched below.
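A rough sketch of what that could look like, assuming a boolean flag is added to NodeWithAttributes; the field names are hypothetical and not the actual implementation:

    import java.util.Set;

    class NodeWithAttributes {
        String nodeName;
        // ...existing attributes...

        /** Hypothetical flag: whether this node was added to (true) or removed from (false) the topology. */
        boolean added;
    }

    class Augmentation {
        /** Added and removed nodes for a single revision, each carrying its own added/removed flag. */
        Set<NodeWithAttributes> nodes;

        Augmentation(Set<NodeWithAttributes> nodes) {
            this.nodes = nodes;
        }
    }

With this shape, scale up and scale down for the same revision can merge their nodes into a single entry instead of overwriting each other.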
Issue Links

- is a child of IGNITE-17924 Core distributions zones functionality (Open)