Details
Improvement
Status: Closed
Critical
Resolution: Fixed
1.6.0
None
None
Description
We've been notified by INFRA that our Travis usage is exceedingly high.
There are various things we could look into, both in the short and the long term:
Short-term
Reduce number of jobs
We currently run 12 jobs for each PR/push.
The first 10 jobs belong to 2 groups, with each group representing one test run of Flink against a specific Hadoop version.
Given that the majority of changes made to Flink do not impact our compatibility with Hadoop, we could drop one of the groups and instead rely on daily cron jobs. This alone would cut our Travis usage by roughly 40%.
Once the migration to FLIP-6 is done we can drop the remaining 2 jobs, increasing the reduction to roughly 60%.
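The percentages above can be checked with a quick calculation. This is only a sketch: it assumes the 10 Hadoop jobs split evenly into 2 groups of 5 and that every job costs about the same amount of build time, neither of which is stated in this issue.

```python
# Assumed job layout: 12 jobs total, 2 Hadoop groups of 5 jobs each,
# plus 2 jobs that can go once the FLIP-6 migration is done.
TOTAL_JOBS = 12
JOBS_PER_HADOOP_GROUP = 5  # assumption: the 10 jobs split evenly
REMAINING_JOBS = 2


def reduction(dropped_jobs, total_jobs=TOTAL_JOBS):
    """Fraction of per-build jobs saved, assuming equal cost per job."""
    return dropped_jobs / total_jobs


after_dropping_group = reduction(JOBS_PER_HADOOP_GROUP)
after_flip6 = reduction(JOBS_PER_HADOOP_GROUP + REMAINING_JOBS)

print(f"drop one Hadoop group: {after_dropping_group:.1%}")
print(f"plus the 2 remaining jobs: {after_flip6:.1%}")
```

Under these assumptions the savings come out slightly above 40% and slightly below 60%, so the figures in the description are approximations.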
Reduce number of builds
Travis is run for every PR, regardless of what change was made, even if it was something as trivial as removing a trailing space in a documentation file. From time to time it also happens that new commits are pushed to a PR solely to trigger a new build in order to get that perfect green build.
Instead we could look into triggering Travis for pull requests manually, for example with a bot.
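One way such a bot could decide whether to start a build is to look for an explicit command in a PR comment. The following is a minimal sketch; the trigger phrase "@ci-bot run travis" and the function name are hypothetical and not taken from any existing bot.

```python
import re

# Hypothetical trigger phrase; this issue does not define one.
# The command must stand on its own line in the comment.
TRIGGER_PATTERN = re.compile(
    r"^\s*@ci-bot\s+run\s+travis\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def should_trigger_build(comment_body):
    """Return True if a PR comment contains the (assumed) trigger command."""
    return TRIGGER_PATTERN.search(comment_body) is not None


print(should_trigger_build("Looks good to me.\n@ci-bot run travis"))
print(should_trigger_build("Removed a trailing space in the docs."))
```

With this approach a trivial documentation fix would not consume a build unless someone explicitly asks for one.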
Long-term
Incremental builds
Making the build dependent on the changes made has been brought up a few times now. This would in particular benefit cases where connectors/libraries are modified, as they generally have few dependents. However, we would still have to run everything if changes are made to the core modules.
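The selection logic could look roughly like the sketch below. It is an illustration only: the set of core modules, the rule that root-level files (such as the parent pom.xml) force a full build, and the function names are all assumptions.

```python
# Assumed set of core modules; a change here forces a full build.
CORE_MODULES = {"flink-core", "flink-runtime"}


def top_level_module(path):
    """Return the first path component, e.g. 'flink-connectors'."""
    return path.split("/", 1)[0]


def modules_to_build(changed_files):
    """Return the set of touched top-level modules, or None for a full build.

    A full build is requested when a root-level file (no directory part)
    or any core module is touched.
    """
    if any("/" not in path for path in changed_files):
        return None  # e.g. the root pom.xml changed
    touched = {top_level_module(path) for path in changed_files}
    if touched & CORE_MODULES:
        return None  # core change: everything depends on it
    return touched


print(modules_to_build(["flink-connectors/kafka/src/Foo.java"]))
print(modules_to_build(["flink-runtime/src/Bar.java"]))
```

The resulting module set could then be passed to Maven via its -pl (projects) option together with -amd (also make dependents), so that modules depending on the changed ones are built and tested as well.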
Repository split
The most painful of them all, but in my opinion also the most promising. With separate repositories for the core Flink modules (flink-runtime etc.), flink-connectors and flink-libraries, we could downright skip the compilation for a large number of modules.
Issue Links
- incorporates: FLINK-13102 Travis build optimization (Closed)
- relates to: FLINK-13978 Switch to Azure Pipelines as a CI tool for Flink (Closed)