[IGNITE-12751] callAsync(jobs, rdc) performance degrades quadratically as jobs.size() grows - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 2.7.6
Fix Version/s: None
Component/s: compute
Labels: None

Ignite Flags: Docs Required, Release Notes Required

Description

Please consider the attached reproducer and the linked report.

{{compute.callAsync(jobs, reducer);
Result [res=33, tookMs=81, jobs=5] //warm up
Result [res=99, tookMs=21, jobs=15]
Result [res=330, tookMs=22, jobs=50]
Result [res=990, tookMs=57, jobs=150]
Result [res=3300, tookMs=146, jobs=500]
Result [res=9900, tookMs=231, jobs=1500]
Result [res=33000, tookMs=840, jobs=5000]
Result [res=99000, tookMs=6965, jobs=15000]
Result [res=330000, tookMs=118394, jobs=50000]}}

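As soon as jobs.size() grows past 5000, performance begins to degrade roughly quadratically: tripling the job count from 5000 to 15000 increases the time about 8x (840 ms to 6965 ms).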

I don't expect it to be completely linear, but I would assume it should stay linear-ish until size() reaches at least 100000, given that we see clusters of 100 nodes and it is not unthinkable to run 1000 jobs on each node. 5000 jobs (which still gives OK performance) spread over 100 nodes is just 50 jobs per node, which becomes a limiting factor.

The linked question also mentions an OOM event, which may be caused by intermediate storage of O(N^2) data.
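
As a rough check on the scaling claim, the timings above can be turned into a local growth exponent between consecutive runs: the k for which t2/t1 = (n2/n1)^k. This is only a sketch over the single-run numbers quoted in this report, so it is noisy. The class and method names (ScalingCheck, localExponent) are made up for illustration and are not part of the attached reproducer or of Ignite.

```java
public class ScalingCheck {
    // Timings copied from the report as {jobs, tookMs}. The warm-up run (jobs=5) is omitted.
    static final long[][] RUNS = {
        {15, 21}, {50, 22}, {150, 57}, {500, 146}, {1500, 231},
        {5000, 840}, {15000, 6965}, {50000, 118394}
    };

    /** Exponent k for which t2/t1 = (n2/n1)^k; k near 1 is linear, k near 2 is quadratic. */
    static double localExponent(long n1, long t1, long n2, long t2) {
        return Math.log((double) t2 / t1) / Math.log((double) n2 / n1);
    }

    public static void main(String[] args) {
        // Print the exponent for each pair of consecutive measurements.
        for (int i = 1; i < RUNS.length; i++) {
            long[] a = RUNS[i - 1];
            long[] b = RUNS[i];
            System.out.printf("jobs %d -> %d: exponent %.2f%n",
                a[0], b[0], localExponent(a[0], a[1], b[0], b[1]));
        }
    }
}
```

By this measure the exponent is roughly 1.1 for 1500 -> 5000 jobs, roughly 1.9 for 5000 -> 15000, and roughly 2.4 for 15000 -> 50000, which is consistent with at least quadratic growth past 5000 jobs.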

Attachments

word-count-reproducer.zip
05/Mar/20 11:12
3 kB
Ilya Kasnacheev

Issue Links

links to

StackOverflow question

People

Assignee: Unassigned

Reporter: Ilya Kasnacheev

Votes: 0

Watchers: 1

Dates

Created: 05/Mar/20 11:07

Updated: 05/Mar/20 11:12