Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
Motivation
DistributionZoneManager performs meta storage invokes during start. Currently they are issued from DistributionZoneManager#createOrRestoreZoneState:
- DistributionZoneManager#initDataNodesAndTriggerKeysInMetaStorage, to initialise the default zone.
- DistributionZoneManager#restoreTimers, for the case where a filter update was handled before the DZM stopped but the data nodes were not updated.
The futures returned by these invokes are ignored, so not all start actions have necessarily completed when the start method returns. This can lead to the following situation:
- Initialisation of the default zone hangs for some reason, even after a full restart of the cluster.
- As a result, none of the data nodes related keys in the meta storage have been initialised.
- For example, if a user adds a new node and the scale up timer is immediate, the data nodes recalculation that should follow immediately will not happen, because the data nodes key has not been initialised.
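The failure mode above can be illustrated with a minimal sketch. It is not Ignite code: SketchMetaStorage, FireAndForgetZoneManager, the key name and the release() gate are all hypothetical stand-ins, chosen only to show how a dropped future lets start() return while the data nodes key is still missing.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the meta storage; not the Ignite API.
class SketchMetaStorage {
    private final Map<String, String> keys = new ConcurrentHashMap<>();

    // Writes are applied only once this gate is completed, which simulates a slow or hung invoke.
    private final CompletableFuture<Void> gate = new CompletableFuture<>();

    CompletableFuture<Boolean> invoke(String key, String value) {
        return gate.thenApply(v -> keys.putIfAbsent(key, value) == null);
    }

    void release() {
        gate.complete(null);
    }

    boolean contains(String key) {
        return keys.containsKey(key);
    }
}

// Hypothetical manager whose start() ignores the invoke future.
class FireAndForgetZoneManager {
    // Assumed key name, for illustration only.
    static final String DATA_NODES_KEY = "zone.default.dataNodes";

    private final SketchMetaStorage ms;

    FireAndForgetZoneManager(SketchMetaStorage ms) {
        this.ms = ms;
    }

    void start() {
        // The returned future is dropped, so start() returns before the key exists.
        ms.invoke(DATA_NODES_KEY, "[]");
    }

    // Returns false when recalculation has to be skipped because the key is missing.
    boolean recalculateDataNodes() {
        return ms.contains(DATA_NODES_KEY);
    }
}
```

In this sketch, a recalculation requested right after start() is skipped until the pending invoke is released, which corresponds to the scale up scenario in the last bullet.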
Possible solutions
Easier
We just need to wait for all asynchronous logic to complete within DistributionZoneManager#start by calling ms.invoke().get().
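A minimal sketch of the easier approach, assuming the two invokes expose their futures to start(). The class and its methods are hypothetical stand-ins rather than the actual DistributionZoneManager code, and the 10 second timeout is an added assumption; the ticket only mentions a plain get().

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical manager that blocks in start() until its asynchronous work is done.
class BlockingStartSketch {
    final AtomicBoolean defaultZoneInitialised = new AtomicBoolean();
    final AtomicBoolean timersRestored = new AtomicBoolean();

    // Stand-in for the invoke that initialises the default zone keys.
    CompletableFuture<Void> initDataNodesAndTriggerKeys() {
        return CompletableFuture.runAsync(() -> defaultZoneInitialised.set(true));
    }

    // Stand-in for the invoke issued while restoring timers.
    CompletableFuture<Void> restoreTimers() {
        return CompletableFuture.runAsync(() -> timersRestored.set(true));
    }

    void start() {
        try {
            // Block on each future instead of ignoring it (timeout value is an assumption).
            initDataNodesAndTriggerKeys().get(10, TimeUnit.SECONDS);
            restoreTimers().get(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while starting", e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException("Failed to complete start actions", e);
        }
    }
}
```

The trade-off is that start() now blocks the starting thread, which is what the harder solution avoids.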
Harder
We can enhance IgniteComponent#start so that it returns a CompletableFuture, and then change the component start flow so that a node is not ready to work until all IgniteComponent#start futures are completed. For example, we can chain the futures in IgniteImpl#recoverComponentsStateOnStart, so that the components' futures are completed before metaStorageMgr.deployWatches().
DistributionZoneManager#start can then return CompletableFuture.allOf over the futures that must be completed during DistributionZoneManager#start.
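A minimal sketch of the harder approach, under the assumption that every component's start() returns a future. AsyncComponentSketch, ZoneManagerSketch and NodeStartSketch are hypothetical names, and the "deployWatches" event only marks the point where metaStorageMgr.deployWatches() would be called; none of this is the actual Ignite implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical component interface whose start() reports completion.
interface AsyncComponentSketch {
    CompletableFuture<Void> start();
}

// Hypothetical zone manager that combines its start-time futures with allOf.
class ZoneManagerSketch implements AsyncComponentSketch {
    // Completed externally here to stand in for the meta storage invokes.
    final CompletableFuture<Void> initDefaultZone = new CompletableFuture<>();
    final CompletableFuture<Void> restoreTimers = new CompletableFuture<>();

    @Override
    public CompletableFuture<Void> start() {
        return CompletableFuture.allOf(initDefaultZone, restoreTimers);
    }
}

// Hypothetical node start flow: watches are deployed only after all components have started.
class NodeStartSketch {
    final List<String> events = new ArrayList<>();

    CompletableFuture<Void> startAll(List<AsyncComponentSketch> components) {
        CompletableFuture<?>[] starts = components.stream()
                .map(AsyncComponentSketch::start)
                .toArray(CompletableFuture[]::new);

        // Placeholder for the point where the watches would be deployed.
        return CompletableFuture.allOf(starts)
                .thenRun(() -> events.add("deployWatches"));
    }
}
```

Here the returned future completes, and the placeholder step runs, only after both of the zone manager's futures are done, without blocking the thread that called startAll.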
Definition of done
All asynchronous logic in the DistributionZoneManager#start is done before a node is ready to work, in particular, ready to interact with zones.
UPD:
We decided to implement the easier way; the harder one will be implemented in a separate ticket: https://issues.apache.org/jira/browse/IGNITE-20477
Issue Links
- is a child of: IGNITE-17924 Core distributions zones functionality. (Open)
- is blocked by: IGNITE-20477 Async component start (Resolved)
- is related to: IGNITE-20317 Meta storage invokes are not completed when events are handled in DZM (Resolved)