[SPARK-1455] Determine which test suites to run based on code changes - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.2.0
Component/s: Project Infra
Labels: None

Description

Right now we run the entire set of tests for every change, so the tests take a long time. Our pull request builder checks out the merge branch from git, so we could do a diff to figure out which source files were changed and run a more isolated set of tests. The tests we run should reflect the inter-dependencies of the project. For example:

If Spark core is modified, we should run all tests.
If just SQL is modified, we should run only the SQL tests.
If just Streaming is modified, we should run only the Streaming tests.
If just PySpark is modified, we should run only the PySpark tests.

And so on. I think this would reduce the round-trip time (RTT) of the tests a lot, and it should be pretty easy to accomplish with some scripting-fu.
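As a rough illustration of the idea above, here is a minimal Python sketch that maps changed file paths to the test groups that should run. The directory prefixes (`core/`, `sql/`, `streaming/`, `python/`), the group names, the `base_ref` argument, and both function names are assumptions made for this example, not something specified in this issue. The sketch also assumes that any change outside a known directory should conservatively trigger the full test run.

```python
import subprocess
from typing import Iterable, List, Set

# Hypothetical mapping from top-level source directories to test groups.
# The prefixes and group names are illustrative assumptions.
MODULE_PREFIXES = {
    "core/": "core",
    "sql/": "sql",
    "streaming/": "streaming",
    "python/": "pyspark",
}

ALL_TESTS = frozenset(MODULE_PREFIXES.values())


def changed_files(base_ref: str) -> List[str]:
    """List files that differ between base_ref and HEAD.

    base_ref is an assumed name for whatever branch the pull request
    builder merges against.
    """
    result = subprocess.run(
        ["git", "diff", "--name-only", f"{base_ref}...HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def select_test_groups(paths: Iterable[str]) -> Set[str]:
    """Return the test groups to run for the given changed paths."""
    selected: Set[str] = set()
    for path in paths:
        group = next(
            (g for prefix, g in MODULE_PREFIXES.items() if path.startswith(prefix)),
            None,
        )
        # A core change, or a change we cannot classify, runs everything.
        if group is None or group == "core":
            return set(ALL_TESTS)
        selected.add(group)
    return selected
```

A build script could then call `select_test_groups(changed_files(base_ref))` and launch only the returned groups. Falling back to the full run for unclassified paths trades some speed for safety, since build files or shared scripts can affect every module.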

Issue Links

is duplicated by

SPARK-3534 Avoid running MLlib and Streaming tests when testing SQL PRs

Resolved

links to

[Github] Pull Request #2420 (nchammas)

People

Assignee: Unassigned

Reporter: Patrick Wendell

Votes: 2

Watchers: 4

Dates

Created: 09/Apr/14 19:13

Updated: 18/Sep/14 23:49

Resolved: 18/Sep/14 23:45