[HIVE-19784] Regression test selection framework for ptest - ASF JIRA

Details

Type: Sub-task
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: Testing Infrastructure
Labels: None

Description

Regression test selection (RTS) is a methodology for decreasing the number of tests that are run in regression test suites. The idea is that for a given change, only the tests relevant to that change are run, rather than all the tests.

For example, right now Hive QA runs all the standalone-metastore tests for every patch. However, most of the time this isn't necessary. If a patch only modifies files in ql or common, there is no need to run the standalone-metastore tests, because standalone-metastore does not depend on any other Hive module (except for storage-api).

RTS is commonly used in CI systems. Google has published some interesting info on how they do this:

http://google-engtools.blogspot.com/2011/06/testing-at-speed-and-scale-of-google.html
https://drive.google.com/file/d/0Bx-FLr0Egz9zYXJfMEZ6NERTbkU/view
Bazel seems to provide some functionality to do this: http://code.hootsuite.com/faster-automated-tests-bazel/

There are also other open-source projects that offer different ways of doing this, such as Ekstazi.

A short term solution would be to implement the following:

Before each Hive QA run, parse the Maven dependency graph
Take the specified patch and check which Maven modules it modifies
Run the tests contained in the modified modules and in the modules that depend on them
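
The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not ptest code: the `DEPENDS_ON` graph is hard-coded and simplified (a real version would derive it from Maven, for example from the output of `mvn dependency:tree`), and the function names `modified_modules` and `modules_to_test` are hypothetical. It assumes the patch is a unified diff and that each module lives in a top-level directory of the same name.

```python
from collections import defaultdict, deque

# Hypothetical, simplified graph: module -> modules it depends on.
# Illustrative only; the real edges would come from the Maven POMs.
DEPENDS_ON = {
    "storage-api": [],
    "standalone-metastore": ["storage-api"],
    "common": ["storage-api"],
    "ql": ["common", "standalone-metastore"],
}


def modified_modules(patch_text, modules):
    """Return the modules whose files are touched by a unified diff."""
    touched = set()
    lines = patch_text.splitlines()
    # A file header is a '--- old' line immediately followed by a '+++ new' line.
    for prev, line in zip(lines, lines[1:]):
        if not (prev.startswith("--- ") and line.startswith("+++ ")):
            continue
        for header in (prev, line):
            path = header[4:].split("\t")[0].strip()
            if path == "/dev/null":  # file added or deleted
                continue
            if path.startswith(("a/", "b/")):  # git's default prefixes
                path = path[2:]
            # Longest matching directory prefix wins (handles nested modules).
            matches = [m for m in modules if path.startswith(m + "/")]
            if matches:
                touched.add(max(matches, key=len))
    return touched


def modules_to_test(touched, depends_on):
    """Return the touched modules plus every module that depends on them,
    directly or transitively."""
    # Invert the edges: module -> modules that depend on it.
    dependents = defaultdict(set)
    for module, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(module)

    selected = set(touched)
    queue = deque(touched)
    while queue:  # breadth-first walk over the inverted graph
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in selected:
                selected.add(dependent)
                queue.append(dependent)
    return selected


patch = """\
--- a/ql/src/java/Foo.java
+++ b/ql/src/java/Foo.java
@@ -1 +1 @@
-old
+new
"""
touched = modified_modules(patch, DEPENDS_ON)
print(sorted(modules_to_test(touched, DEPENDS_ON)))
```

With this assumed graph, a patch that only touches ql selects only ql, so the standalone-metastore tests are skipped, while a patch that touches storage-api selects every module.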

People

Assignee: Unassigned

Reporter: Sahil Takiar

Votes: 0

Watchers: 2

Dates

Created: 04/Jun/18 14:43

Updated: 04/Jun/18 16:14