[IGNITE-3195] Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 2.8
Component/s: cache
Labels:
- iep-16

Release Note:
Merged to the master branch.
Docs Text:
Defaults changed to threads = 4, prefetch = 3
Ignite Flags:

Docs Required

Description

Currently it is assumed that the number of threads processing all demand and supply messages coming from all the nodes never exceeds IgniteConfiguration.rebalanceThreadPoolSize.

The current implementation relies on the ordered messages functionality and creates a number of topics equal to IgniteConfiguration.rebalanceThreadPoolSize.

However, the implementation doesn't take into account that ordered messages belonging to a particular topic are processed in parallel for different nodes. See the implementation of GridIoManager.processOrderedMessage: for every topic there is a separate GridCommunicationMessageSet for every node.
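The effect of keeping one message set per topic and per node can be illustrated with a toy model. This is only a sketch and not Ignite code: the class OrderedMessageSets, its methods, and the topic and node names are hypothetical, and the nested map merely imitates the "one GridCommunicationMessageSet per topic per node" behaviour described above.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Toy model (not Ignite code): one ordered queue per (topic, sender node) pair. */
public class OrderedMessageSets {
    /** topic -> (sender node id -> ordered queue of messages). */
    private final Map<String, Map<String, Queue<String>>> setsByTopic = new HashMap<>();

    /** Appends a message to the queue owned by the given topic and sender node. */
    public void enqueue(String topic, String nodeId, String msg) {
        setsByTopic
            .computeIfAbsent(topic, t -> new HashMap<>())
            .computeIfAbsent(nodeId, n -> new ArrayDeque<>())
            .add(msg);
    }

    /** Number of independent queues; ordering is preserved only within each one. */
    public int independentSets() {
        int cnt = 0;

        for (Map<String, Queue<String>> perNode : setsByTopic.values())
            cnt += perNode.size();

        return cnt;
    }

    public static void main(String[] args) {
        OrderedMessageSets sets = new OrderedMessageSets();

        // 2 topics (hypothetical pool size of 2) and 3 sender nodes.
        for (int topic = 0; topic < 2; topic++) {
            for (int node = 0; node < 3; node++)
                sets.enqueue("rebalance-topic-" + topic, "node-" + node, "demand");
        }

        System.out.println(sets.independentSets()); // prints 6
    }
}
```

In this model, two topics and three sender nodes yield six independent queues, each of which could be drained by its own thread, rather than two.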

The following execution stack confirms this:

java.lang.RuntimeException: HAPPENED DEMAND
	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:378)
	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:364)
	at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:622)
	at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:320)
	at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$300(GridCacheIoManager.java:81)
	at org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1125)
	at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1219)
	at org.apache.ignite.internal.managers.communication.GridIoManager.access$1600(GridIoManager.java:105)
	at org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:2456)
	at org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1179)
	at org.apache.ignite.internal.managers.communication.GridIoManager.access$1900(GridIoManager.java:105)
	at org.apache.ignite.internal.managers.communication.GridIoManager$6.run(GridIoManager.java:1148)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

All this means that the number of threads busy with rebalancing activity can in fact reach IgniteConfiguration.rebalanceThreadPoolSize x number_of_nodes_participated_in_rebalancing.
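As a worked example of this product, the sketch below computes the figure for sample values. The class and method names are hypothetical, and the inputs (a pool size of 4 and 5 participating nodes) are illustrative numbers, not measurements from a real cluster.

```java
/** Worked example of the thread count described in this issue (hypothetical helper). */
public class RebalanceThreadEstimate {
    /**
     * @param rebalanceThreadPoolSize Configured pool size, which equals the number of ordered topics.
     * @param nodesInRebalancing Number of nodes participating in rebalancing.
     * @return Number of threads that may be busy at once: one per (topic, node) pair.
     */
    static int maxBusyThreads(int rebalanceThreadPoolSize, int nodesInRebalancing) {
        if (rebalanceThreadPoolSize < 0 || nodesInRebalancing < 0)
            throw new IllegalArgumentException("Arguments must be non-negative");

        return Math.multiplyExact(rebalanceThreadPoolSize, nodesInRebalancing);
    }

    public static void main(String[] args) {
        // Illustrative values only: pool size 4, 5 nodes participating in rebalancing.
        System.out.println(maxBusyThreads(4, 5)); // prints 20
    }
}
```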

Issue Links

causes
- IGNITE-12252 Unchecked exceptions during rebalancing should be handled (Resolved)
- IGNITE-12117 Historical rebalance should NOT be processed in striped way (Open)

fixes
- IGNITE-11862 Cache stopping on supplier during rebalance causes NPE and supplying failure. (Resolved)

links to
- GitHub Pull Request #6688

People

Assignee: Anton Vinogradov (Obsolete, actual is "av")

Reporter: Denis A. Magda

Votes: 0

Watchers: 12

Dates

Created: 25/May/16 12:34

Updated: 13/Apr/20 18:13

Resolved: 28/Aug/19 10:30
