Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
In some cases, when a large number of caches is started (over 2,000), the exchange can stall when a new node joins the cluster.
The coordinator node waits to receive a single message from every other node, but the last node (the one joining the cluster) is still busy starting caches:
Stack trace
    at java.lang.Thread.dumpStack(Thread.java:1329)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.startCache(GridCacheProcessor.java:1159)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.prepareCacheStart(GridCacheProcessor.java:1900)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.startCachesOnLocalJoin(GridCacheProcessor.java:1764)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.initCachesOnLocalJoin(GridDhtPartitionsExchangeFuture.java:740)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:622)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2329)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
    at java.lang.Thread.run(Thread.java:745)
This blocks the cluster exchange process until all caches have started on the joining node.
We should either start the caches in parallel threads or move this step out of the exchange init process.
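A minimal sketch of the "parallel threads" option, for illustration only. The class ParallelCacheStartSketch, the method startAll, and the startOne callback are hypothetical names and are not part of the Ignite API; the callback stands in for whatever per-cache start logic currently runs sequentially in startCachesOnLocalJoin. The sketch uses only java.util.concurrent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

/** Hypothetical helper for illustration; not Ignite code. */
final class ParallelCacheStartSketch {
    /**
     * Runs startOne for every cache name on a fixed-size thread pool and
     * returns only after every start task has finished.
     *
     * @param cacheNames Names of the caches to start.
     * @param startOne Assumed stand-in for the per-cache start routine.
     * @param threads Number of worker threads.
     */
    static void startAll(List<String> cacheNames, Consumer<String> startOne, int threads)
        throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);

        try {
            List<Future<?>> futs = new ArrayList<>();

            // Submit one task per cache instead of starting them one by one.
            for (String name : cacheNames)
                futs.add(pool.submit(() -> startOne.accept(name)));

            // Wait for all tasks; a failed start is rethrown here
            // wrapped in ExecutionException.
            for (Future<?> fut : futs)
                fut.get();
        }
        finally {
            pool.shutdown();
        }
    }
}
```

The caller still waits until every cache has started, so the exchange is not unblocked earlier in principle; the intent is only to shorten the wait. Any shared state touched by the callback (for example, cache registration) has to be thread-safe.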
Issue Links
- causes: IGNITE-9729 Ability to start GridQueryProcessor in parallel (Open)
- is blocked by: IGNITE-5795 Binary metadata is not registered during start of cache (Resolved)
- is related to: IGNITE-10228 Start multiple caches in parallel may lead to the fact that some of the caches won't be registered. (Resolved)
- links to