Details
Type: Story
Status: Resolved
Priority: Minor
Resolution: Fixed
Description
The existing TaskStoreBenchmarks show that DBTaskStore is at least two orders of magnitude faster than MemTaskStore when it comes to getJobKeys() (roughly 500x at 10,000 tasks, widening to more than 10,000x at 100,000 tasks):
Benchmark                                       (numTasks)   Mode  Cnt       Score       Error  Units
TaskStoreBenchmarks.DBFetchTasksBenchmark.run        10000  thrpt    5  320271.082 ± 30842.727  ops/s
TaskStoreBenchmarks.DBFetchTasksBenchmark.run        50000  thrpt    5  334805.551 ± 20435.139  ops/s
TaskStoreBenchmarks.DBFetchTasksBenchmark.run       100000  thrpt    5  317395.890 ± 45302.180  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run       10000  thrpt    5     624.944 ±    54.038  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run       50000  thrpt    5      91.335 ±     9.241  ops/s
TaskStoreBenchmarks.MemFetchTasksBenchmark.run      100000  thrpt    5      27.712 ±     8.128  ops/s
If the scheduler is configured to run with the MemTaskStore, every hit on the scheduler page (/scheduler) causes a call to MemTaskStore.getJobKeys().
The implementation of this method is currently very inefficient: it performs a sequential scan of the task store and then maps each task to its job key. Both the scan and the mapping can be eliminated by simply returning the key set of the existing secondary index on job.
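The proposed change can be illustrated with a minimal sketch. This is not the actual MemTaskStore code: the class JobKeyIndexSketch, its fields, and its method names are hypothetical, and tasks and job keys are reduced to plain strings. It only contrasts a scan over all tasks with reading the key set of a secondary index that is maintained on every write.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-in for a memory-backed task store.
public class JobKeyIndexSketch {
    // Primary map: task ID -> job key (a real store would hold full task objects).
    private final Map<String, String> tasks = new HashMap<>();
    // Secondary index: job key -> IDs of the tasks that belong to that job.
    private final Map<String, Set<String>> tasksByJob = new HashMap<>();

    public void save(String taskId, String jobKey) {
        String previous = tasks.put(taskId, jobKey);
        if (previous != null) {
            removeFromIndex(taskId, previous);
        }
        tasksByJob.computeIfAbsent(jobKey, k -> new HashSet<>()).add(taskId);
    }

    public void delete(String taskId) {
        String jobKey = tasks.remove(taskId);
        if (jobKey != null) {
            removeFromIndex(taskId, jobKey);
        }
    }

    // Drop empty index entries so the index key set only lists jobs that still have tasks.
    private void removeFromIndex(String taskId, String jobKey) {
        Set<String> ids = tasksByJob.get(jobKey);
        if (ids != null) {
            ids.remove(taskId);
            if (ids.isEmpty()) {
                tasksByJob.remove(jobKey);
            }
        }
    }

    // Slow variant: visits every task and maps it to its job key, O(number of tasks).
    public Set<String> getJobKeysByScan() {
        return new HashSet<>(tasks.values());
    }

    // Fast variant: copies the index key set, O(number of jobs).
    public Set<String> getJobKeysFromIndex() {
        return new HashSet<>(tasksByJob.keySet());
    }
}
```

The fast variant is only correct if empty index entries are removed when the last task of a job is deleted, which is why the sketch prunes them in removeFromIndex. Whether the real secondary index behaves this way is an assumption that would need to be checked against the actual store.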