Description
Currently, it is assumed that the number of threads processing demand and supply messages from all nodes never exceeds IgniteConfiguration.rebalanceThreadPoolSize.
The current implementation relies on the ordered messages functionality and creates a number of topics equal to IgniteConfiguration.rebalanceThreadPoolSize.
However, the implementation does not take into account that ordered messages belonging to a particular topic are processed in parallel for different nodes. See the implementation of GridIoManager.processOrderedMessage: for every topic there is a separate GridCommunicationMessageSet for every node.
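The per-topic, per-node bookkeeping described above can be illustrated with a small model. This is only a sketch under stated assumptions, not the actual GridIoManager code: the class OrderedMessageModel, its methods, and the use of plain strings for topics and messages are all hypothetical and exist only to show why the number of independent message sets grows with the number of sender nodes.

```java
import java.util.Map;
import java.util.Queue;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Simplified, hypothetical model of ordered message sets (not Ignite code). */
public class OrderedMessageModel {
    /** topic -> (sender node ID -> pending messages from that node). */
    private final Map<String, Map<UUID, Queue<String>>> sets = new ConcurrentHashMap<>();

    /** Stores a message; a new message set appears for each unseen (topic, node) pair. */
    public void onMessage(String topic, UUID nodeId, String msg) {
        sets.computeIfAbsent(topic, t -> new ConcurrentHashMap<>())
            .computeIfAbsent(nodeId, n -> new ConcurrentLinkedQueue<>())
            .add(msg);
    }

    /** Number of independent message sets, i.e. units that could be unwound in parallel. */
    public int messageSetCount() {
        return sets.values().stream().mapToInt(Map::size).sum();
    }
}
```

In this model, ordering is kept only within one (topic, node) pair, so nothing prevents message sets of the same topic from being handled by different threads at the same time.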
The following execution stack confirms this:
java.lang.RuntimeException: HAPPENED DEMAND
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:378)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:364)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:622)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:320)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$300(GridCacheIoManager.java:81)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1125)
    at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1219)
    at org.apache.ignite.internal.managers.communication.GridIoManager.access$1600(GridIoManager.java:105)
    at org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:2456)
    at org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1179)
    at org.apache.ignite.internal.managers.communication.GridIoManager.access$1900(GridIoManager.java:105)
    at org.apache.ignite.internal.managers.communication.GridIoManager$6.run(GridIoManager.java:1148)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
All this means that the number of threads busy with rebalancing activity is in fact equal to IgniteConfiguration.rebalanceThreadPoolSize x number_of_nodes_participating_in_rebalancing.
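As a worked example of the product above, the following sketch computes the resulting thread count for a given pool size and node count. The class RebalanceThreadEstimate and the method maxBusyThreads are hypothetical names introduced for illustration, and the values 4 and 10 are arbitrary sample inputs, not measurements.

```java
/** Hypothetical helper illustrating the thread count described in this issue. */
public class RebalanceThreadEstimate {
    /**
     * Threads that may be busy at once when each of the
     * rebalanceThreadPoolSize topics has one message set per node.
     */
    public static int maxBusyThreads(int rebalanceThreadPoolSize, int rebalancingNodes) {
        if (rebalanceThreadPoolSize < 1 || rebalancingNodes < 0)
            throw new IllegalArgumentException("pool size must be >= 1 and node count >= 0");

        // One message set per (topic, node) pair, each of which can occupy a thread.
        return Math.multiplyExact(rebalanceThreadPoolSize, rebalancingNodes);
    }

    public static void main(String[] args) {
        // Sample values: pool size 4, 10 nodes participating in rebalancing.
        System.out.println(maxBusyThreads(4, 10)); // prints 40
    }
}
```

With a pool size of 4, a user would expect at most 4 busy threads, while the model gives 40 for 10 participating nodes.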
Issue Links
- causes
  - IGNITE-12252 Unchecked exceptions during rebalancing should be handled (Resolved)
  - IGNITE-12117 Historical rebalance should NOT be processed in striped way (Open)
- fixes
  - IGNITE-11862 Cache stopping on supplier during rebalance causes NPE and supplying failure. (Resolved)
- links to