Author: Joe McDonnell <email@example.com>
Date: Tue Mar 7 12:20:09 2017 -0800
IMPALA-5039: Fix variability in parquet dictionary filtering test
The tests for dictionary filtering look at how many row groups are
processed and how many are filtered by matching text in the profile.
However, the number of row groups processed and filtered by any
individual fragment depends on how the work is split and how many
impalads are running. This causes variability in the test output.
To fix this, the test needs a way to aggregate the results across
fragments. This fix introduces the following syntax for specifying
an expected aggregate value:
  aggregate(function_name, field_name): expected_value
A check written this way searches the runtime profile for lines that
contain 'field_name: number' and aggregates the matched numbers. It
skips the averaged fragment, as its values are derived from all the
other fragments.
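A minimal sketch of that search, in Python. The profile layout here is
an assumption made for illustration: the averaged section is taken to
open with a line starting with "Averaged Fragment" and to close at the
next line starting with "Fragment ", and each counter is taken to be a
bare integer at the end of its line. sum_profile_field and
SAMPLE_PROFILE are hypothetical names, not the code added by this change.

```python
import re


def sum_profile_field(profile_text, field_name):
    """Sum every 'field_name: <integer>' line, skipping the averaged fragment."""
    # Assumed counter format: 'field_name: 123' with nothing after the number.
    field_re = re.compile(r"\b" + re.escape(field_name) + r": (\d+)\s*$")
    total = 0
    in_averaged = False
    for line in profile_text.splitlines():
        stripped = line.strip()
        # Assumed section headers; a real profile may differ.
        if stripped.startswith("Averaged Fragment"):
            in_averaged = True
            continue
        if stripped.startswith("Fragment "):
            in_averaged = False
            continue
        if in_averaged:
            # Averaged values are derived from the other fragments, so skip them.
            continue
        match = field_re.search(line)
        if match:
            total += int(match.group(1))
    return total


# Hypothetical profile excerpt, not real Impala output.
SAMPLE_PROFILE = """\
Averaged Fragment F00:
  HDFS_SCAN_NODE (id=0):
    NumRowGroups: 2
Fragment F00:
  Instance 1:
    HDFS_SCAN_NODE (id=0):
      NumRowGroups: 3
  Instance 2:
    HDFS_SCAN_NODE (id=0):
      NumRowGroups: 1
"""

print(sum_profile_field(SAMPLE_PROFILE, "NumRowGroups"))  # prints 4
```

The total is 4 no matter how the two instances split the row groups
between them, which is the property the tests rely on.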
Currently, only SUM is implemented, and the expected_value is
required to be an integer. It should be easy to implement other
interesting functions like COUNT and MIN/MAX. It would also be
possible to extend it to floats.
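As a rough illustration of these constraints, the sketch below parses
one expectation line and checks it against a list of observed values.
It is written under assumptions: parse_expectation, check_expectation,
and SUPPORTED_FUNCTIONS are hypothetical names, and the regular
expression is one possible reading of the syntax, not the parser added
by this change.

```python
import re

# One possible reading of 'aggregate(function_name, field_name): expected_value',
# restricted to integer expected values.
EXPECTATION_RE = re.compile(
    r"^\s*aggregate\(\s*(\w+)\s*,\s*(\w+)\s*\)\s*:\s*(-?\d+)\s*$")

# Only SUM is supported; other functions would be added to this table.
SUPPORTED_FUNCTIONS = {"SUM": sum}


def parse_expectation(line):
    """Return (function_name, field_name, expected_value) or raise ValueError."""
    match = EXPECTATION_RE.match(line)
    if match is None:
        raise ValueError("not an integer aggregate expectation: %r" % line)
    function_name, field_name, expected = match.groups()
    if function_name not in SUPPORTED_FUNCTIONS:
        raise ValueError("unsupported aggregate function: %s" % function_name)
    return function_name, field_name, int(expected)


def check_expectation(line, observed_values):
    """Apply the named function to observed_values and compare to the expectation."""
    function_name, _field_name, expected = parse_expectation(line)
    return SUPPORTED_FUNCTIONS[function_name](observed_values) == expected


# Illustrative counter name and values.
print(check_expectation("aggregate(SUM, NumRowGroups): 4", [3, 1]))  # prints True
```

In this sketch, supporting MIN or MAX would amount to adding an entry
such as "MIN": min to the table, and floats would need a wider pattern
for the expected value plus a tolerance in the comparison.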
Switching the dictionary filtering tests over to this new syntax
eliminates the variability in the tests.
Tested-by: Impala Public Jenkins
Reviewed-by: Alex Behm <firstname.lastname@example.org>