Details
-
Improvement
-
Status: Open
-
Major
-
Resolution: Unresolved
-
Impala 3.0
-
None
-
None
-
Description
It's already possible to specify an arbitrary list of test modules, test classes, and/or test functions as command-line arguments when running the Impala mini-cluster tests. It's also possible to opt out of running specific tests by applying any of a variety of skipif markers.
What we don't have is a comprehensive way for tests to be opted in to a targeted test run, other than by naming each of them as a command-line argument. This becomes extremely unwieldy beyond a certain number of tests. In fact, we don't have a general concept of targeted test runs at all. The approach to date has been to always run as many tests as possible, except for those tests specifically marked for skipping. This is an OK way to make sure tests don't get overlooked, but it also means that many tests are frequently run in contexts in which they don't necessarily apply, e.g. against S3, or against actual deployed clusters, which can lead to spurious failures.
There are different ways that we could group a disparate array of tests into a targeted run. We could come up with a permanent set of new pytest markers/decorators for opting in to, as opposed to opting out of, a given test run. An initial pass would then need to be made to apply the new decorators as needed to all of the existing tests. One could then invoke something like "impala-pytest -m cluster_tests" as needed.
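A minimal sketch of what the marker approach could look like in a conftest.py. Only cluster_tests comes from the description above. The s3_tests marker, the marker descriptions, and the unmarked_tests() helper are hypothetical, added to illustrate the "initial pass" of finding tests that have not yet been opted in to any run.

```python
# conftest.py (sketch only; not existing Impala code)

# Opt-in markers. "cluster_tests" is the name used in this ticket;
# "s3_tests" and both descriptions are hypothetical.
OPT_IN_MARKERS = {
    "cluster_tests": "safe to run against an actual deployed cluster",
    "s3_tests": "applicable when the filesystem under test is S3",
}


def pytest_configure(config):
    """Register the opt-in markers so pytest recognizes them."""
    for name, description in OPT_IN_MARKERS.items():
        config.addinivalue_line("markers", "{0}: {1}".format(name, description))


def unmarked_tests(items):
    """Return the node ids of collected tests carrying no opt-in marker.

    Hypothetical helper for the initial pass over the existing tests:
    anything listed here has not yet been assigned to a targeted run.
    """
    return [
        item.nodeid
        for item in items
        if not any(item.get_closest_marker(name) for name in OPT_IN_MARKERS)
    ]


# In a test module, a test would opt in like this:
#
#   import pytest
#
#   @pytest.mark.cluster_tests
#   def test_something():
#       ...
```

With the markers applied, pytest's built-in -m option does the selection, so no custom collection hook should be needed for the basic case.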
Another approach might be to define test runs in special files (probably YAML). Each file would include a list of which tests to run, possibly along with other test parameters, e.g. "run this list of tests, but only on parquet, and skip tests that require LZO compression."
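A rough sketch of how such a run definition might be turned into a pytest invocation. Everything here is an assumption made for illustration: the file name, the keys (tests, table_formats, exclude_markers), the requires_lzo marker, the test paths, and the use of a --table_formats option to pass the format restriction through. To stay self-contained, the example starts from the dict that a YAML loader (e.g. PyYAML's yaml.safe_load) would return, rather than reading a file.

```python
# Hypothetical run definition, as it might appear in a file such as
# parquet_no_lzo.yaml (file name, keys, and test paths are illustrative):
#
#   tests:
#     - query_test/test_scanners.py
#     - query_test/test_queries.py::TestQueries
#   table_formats: [parquet/none]
#   exclude_markers: [requires_lzo]
#
# The dict below is what a YAML loader would produce for that file.
RUN_DEFINITION = {
    "tests": [
        "query_test/test_scanners.py",
        "query_test/test_queries.py::TestQueries",
    ],
    "table_formats": ["parquet/none"],
    "exclude_markers": ["requires_lzo"],
}


def build_pytest_args(run_def):
    """Translate a run definition into a list of pytest arguments."""
    tests = list(run_def.get("tests", []))
    if not tests:
        raise ValueError("a run definition must list at least one test")

    args = []

    # Assumed option name for restricting the table formats under test.
    formats = run_def.get("table_formats", [])
    if formats:
        args.append("--table_formats=" + ",".join(formats))

    # Excluded markers become a standard pytest -m expression,
    # e.g. "not requires_lzo" or "not a and not b".
    excluded = run_def.get("exclude_markers", [])
    if excluded:
        args.extend(["-m", " and ".join("not " + m for m in excluded)])

    # Test modules, classes, or functions go last, as positional arguments.
    return args + tests


if __name__ == "__main__":
    print(" ".join(build_pytest_args(RUN_DEFINITION)))
```

One point in favor of this design is that the definition file stays declarative: the list of tests and their parameters can be reviewed and versioned separately from the test code itself.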