[AURORA-1847] Eliminate sequential scan in MemTaskStore.getJobKeys() - ASF JIRA

Details

Type: Story
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.17.0
Component/s: Efficiency, UI
Labels:
- newbie

Epic Link:
Scheduler Performance Improvement

Description

The existing TaskStoreBenchmarks show that DBTaskStore is two to four orders of magnitude faster than MemTaskStore when it comes to getJobKeys() (roughly 500x at 10,000 tasks and over 11,000x at 100,000 tasks):

Benchmark                                       (numTasks)   Mode  Cnt       Score       Error  Units
TaskStoreBenchmarks.DBFetchTasksBenchmark.run        10000  thrpt    5  320271.082 ± 30842.727  ops/s
TaskStoreBenchmarks.DBFetchTasksBenchmark.run        50000  thrpt    5  334805.551 ± 20435.139  ops/s
TaskStoreBenchmarks.DBFetchTasksBenchmark.run       100000  thrpt    5  317395.890 ± 45302.180  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run       10000  thrpt    5     624.944 ±    54.038  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run       50000  thrpt    5      91.335 ±     9.241  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run      100000  thrpt    5      27.712 ±     8.128  ops/s

If the scheduler is configured to run with MemTaskStore, every hit on the scheduler page (/scheduler) triggers a call to MemTaskStore.getJobKeys().

The current implementation of this method is very inefficient: it performs a sequential scan of the task store and then maps each task to its job key, so its cost grows with the number of tasks. Both the scan and the mapping can be eliminated by simply returning the key set of the existing secondary index on job key.
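
To make the difference concrete, here is a minimal sketch of the two approaches. It is not Aurora's actual MemTaskStore code: the class JobKeyIndexSketch, its methods, and the use of plain strings for task IDs and job keys are all hypothetical simplifications chosen for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified stand-in for an in-memory task store (not Aurora's classes). */
final class JobKeyIndexSketch {
  // Primary storage: task ID -> job key. A real task record holds far more than this.
  private final Map<String, String> jobKeyByTaskId = new HashMap<>();
  // Secondary index: job key -> IDs of the tasks that belong to that job.
  private final Map<String, Set<String>> taskIdsByJobKey = new HashMap<>();

  void saveTask(String taskId, String jobKey) {
    String previous = jobKeyByTaskId.put(taskId, jobKey);
    if (previous != null) {
      removeFromIndex(taskId, previous);
    }
    taskIdsByJobKey.computeIfAbsent(jobKey, k -> new HashSet<>()).add(taskId);
  }

  void deleteTask(String taskId) {
    String jobKey = jobKeyByTaskId.remove(taskId);
    if (jobKey != null) {
      removeFromIndex(taskId, jobKey);
    }
  }

  private void removeFromIndex(String taskId, String jobKey) {
    Set<String> ids = taskIdsByJobKey.get(jobKey);
    ids.remove(taskId);
    if (ids.isEmpty()) {
      // Drop empty entries so the index key set never reports a job with no tasks.
      taskIdsByJobKey.remove(jobKey);
    }
  }

  /** Slow path: visits every stored task, so the cost grows with the number of tasks. */
  Set<String> getJobKeysByScan() {
    return new HashSet<>(jobKeyByTaskId.values());
  }

  /** Fast path: copies the index key set, so the cost grows with the number of jobs. */
  Set<String> getJobKeysFromIndex() {
    return new HashSet<>(taskIdsByJobKey.keySet());
  }

  public static void main(String[] args) {
    JobKeyIndexSketch store = new JobKeyIndexSketch();
    store.saveTask("task-1", "role/prod/web");
    store.saveTask("task-2", "role/prod/web");
    store.saveTask("task-3", "role/prod/db");
    // Both paths report the same set of job keys.
    System.out.println(store.getJobKeysByScan().equals(store.getJobKeysFromIndex()));
  }
}
```

One caveat this sketch surfaces: the key set matches the scan result only if index entries are removed once their last task is deleted. Whether the real secondary index already guarantees that is something to confirm before relying on it.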

People

Assignee: Mehrdad Nurolahzade

Reporter: Mehrdad Nurolahzade

Votes: 0

Watchers: 3

Dates

Created: 06/Dec/16 22:31

Updated: 11/Jan/17 22:22

Resolved: 11/Jan/17 22:22