Attached is 'multiColumnBenchmark', an enhanced version of the GroupByClient proposal
which can now generate a richer variety of GROUP BY statements.
It also only executes a single statement per run, since I agree with the
observation that it is hard to interpret the results of running a mixture
of statements in the same run.
I put a lot of comments into the GroupByClient header which should explain
how to invoke the benchmark to run a richer set of statements.
I gave getLoadOpt package visibility so that the GroupByClient could
interrogate the -load_opts settings in a more convenient fashion.
Continued suggestions and comments would be greatly appreciated.
Soon, I hope to find the time to run this benchmark against the current trunk,
as well as against the DERBY-3002 patch proposal, to get a first set of
numbers to explore the overall performance characteristics in a coarse fashion.
I'm hoping it will be sufficient to perform, say, 5 different GROUP BY statements
against each version of the code, at scales 10 thousand, 100 thousand, and
250 thousand rows. That will give us 15 numbers (5 statements x 3 scales) for
each branch, and maybe we can see some trends in that data.
I should be able to post those runs as a "script" of 18 perf.clients.Runner statements
to be run in sequence against each code branch.
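
To make the measurement plan concrete, here is a minimal sketch that enumerates the 5 x 3 matrix of statements and table sizes described above. The class name RunMatrix and the method plan are hypothetical names made up for illustration; they are not part of the attached patch, and the sketch does not try to reproduce the actual perf.clients.Runner command lines.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the attached benchmark.
public class RunMatrix {

    // Pairs every statement index (1..statementCount) with every table size.
    static List<String> plan(int statementCount, int[] rowCounts) {
        List<String> runs = new ArrayList<String>();
        for (int i = 0; i < rowCounts.length; i++) {
            for (int stmt = 1; stmt <= statementCount; stmt++) {
                runs.add("statement " + stmt + " at " + rowCounts[i] + " rows");
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        // The three scales proposed above: 10 thousand, 100 thousand, 250 thousand.
        int[] rowCounts = { 10000, 100000, 250000 };
        List<String> runs = plan(5, rowCounts);
        for (int i = 0; i < runs.size(); i++) {
            System.out.println(runs.get(i));
        }
        System.out.println(runs.size() + " measurements per branch");
    }
}
```

Each line of output corresponds to one measurement to collect for a given code branch.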