Details
New Feature
Status: Closed
Major
Resolution: Fixed
None
None
Description
Story
As a developer, I want to do performance testing on the Path algorithm so that I can understand and communicate scale effects to users.
The proposed matrix for the first set of tests is:
1) overall data size, i.e., number of rows in data sets = 1M, 10M, 100M
2) number of partitions = 1k, 10k, 100k
3) number of matches per partition = 1k, 10k, 100k
The proposed matrix for the second set of tests is:
4) match "thickness", i.e., number of rows per match = 1, 1k, 10k
5) number of symbols = 5, 15, 25
Acceptance
1) Plot performance curves. To keep the size of the test matrix reasonable, not all permutations need to be run.
For example, when plotting the effect of the number of partitions (#2 above), the data size can be fixed at, say, 10M rows and the number of matches per partition at, say, 1k.
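One way to keep the matrix small is to vary one factor at a time while holding the others at a baseline. The sketch below is illustrative only: the factor names, the helper functions, and the baseline values for partitions, rows per match, and symbols are assumptions, not part of this ticket. Only the 10M-row and 1k-matches-per-partition baselines come from the example above.

```python
from typing import Dict, List

# Factor levels copied from the two proposed test matrices.
# The key names are hypothetical labels chosen for this sketch.
LEVELS: Dict[str, List[int]] = {
    "rows": [1_000_000, 10_000_000, 100_000_000],
    "partitions": [1_000, 10_000, 100_000],
    "matches_per_partition": [1_000, 10_000, 100_000],
    "rows_per_match": [1, 1_000, 10_000],
    "symbols": [5, 15, 25],
}

# Values held fixed while another factor is swept.
BASELINE: Dict[str, int] = {
    "rows": 10_000_000,              # from the example in the ticket
    "matches_per_partition": 1_000,  # from the example in the ticket
    "partitions": 1_000,             # assumption
    "rows_per_match": 1,             # assumption
    "symbols": 5,                    # assumption
}


def one_factor_sweep(factor: str) -> List[Dict[str, int]]:
    """Return one run configuration per level of `factor`, others at baseline."""
    return [{**BASELINE, factor: level} for level in LEVELS[factor]]


def reduced_matrix() -> List[Dict[str, int]]:
    """Union of all one-factor sweeps, with duplicate configurations removed."""
    seen = set()
    runs: List[Dict[str, int]] = []
    for factor in LEVELS:
        for config in one_factor_sweep(factor):
            key = tuple(sorted(config.items()))
            if key not in seen:
                seen.add(key)
                runs.append(config)
    return runs


if __name__ == "__main__":
    for config in reduced_matrix():
        print(config)
```

With these assumed baselines the reduced matrix has 11 runs instead of the 243 of the full factorial (3^5), and each sweep yields the three points needed for one performance curve. Some combinations may not be physically consistent (for example, partitions times matches per partition times rows per match exceeding the row count), so the generated list would still need a sanity check before data fabrication.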
Other
1) The attached data set can be used as a baseline for duplication/fabrication.
2) Another useful data set is at
http://csr.lanl.gov/data/auth/