Details
-
Improvement
-
Status: Triage Needed
-
Normal
-
Resolution: Unresolved
-
None
-
None
-
All
-
None
Description
About
DataStax is the company associated with Apache’s open-source Cassandra project.
The DataStax team requested a configuration review to do a more modern and general revamp of their pipeline for the 2021 config prepared with the Setup Workflows feature (setup).
GitHub Demo Repo for CircleCI Setup Workflows & Dynamic Configuration scripts here.
Current Configuration & Workflow
The current state of their configuration can be found at the Cassandra open-source GitHub repository.
Because Cassandra works with users contributing to the repo who are not a part of the organization, they expect these users to be able to run tests on CircleCI.
There are three tiers: Medium, Large and X-Large test tiers that users may qualify to run. Right now, the process to create a valid CircleCI configuration to run is complex in that it involves copying the right config to the .circleci/config.yml file based on what you’re attempting to run.
Now: config.yml.LOWRES is the default config. MIDRES and HIGHRES are custom configs for those who have access to premium CircleCI resources.
Overview
Goals and Desired Outcome
The team is looking for ways to modernize their workflow. Ideally, there is one config.yml that will run various workflows based on the parameters provided and contexts allowed to the team member. Updating the config master should be done via GitHub and opening issues on the JIRA Board.
Suggestions
- Currently the process is manual to trigger steps within the build. We'll investigate the best ways to provide a more automotive experience to cut down on developer time.
Solution: setup workflows, dynamic configuration/conditional workflows and restricted contexts
- Easy User Management: The ability for their team to add users through the UI instead of needing to submit a support ticket.
Solution: DevOps Customer Engineering worked with the product team in 2020 to add shared organization management functionality to the UI. On our roadmap for 2021 is to add users via the UI. See
- Tracking Test Results: Tracking test results across runs to get a historical look at build times, flaky builds, etc.
Q&A Pause
- Using the LOWRES config as an example which then can be translated to the other config tiers, jobs with environment variables that could be stored in Contexts as environment variables and restricted or unrestricted based on user team/tier or passed as a pipeline parameters:
- Java 8 JVM Upgrade Distributed Tests (j8_jvm_upgrade_dtests)
- - ANT_HOME: /usr/share/ant
- - JAVA11_HOME: /usr/lib/jvm/java-11-openjdk-amd64
- - JAVA8_HOME: /usr/lib/jvm/java-8-openjdk-amd64
- - LANG: en_US.UTF-8
- - KEEP_TEST_DIR: true
- - DEFAULT_DIR: /home/cassandra/cassandra-dtest
- - PYTHONIOENCODING: utf-8
- - PYTHONUNBUFFERED: true
- - CASS_DRIVER_NO_EXTENSIONS: true
- - CASS_DRIVER_NO_CYTHON: true
- - CASSANDRA_SKIP_SYNC: true
- - DTEST_REPO: git://github.com/apache/cassandra-dtest.git
- - DTEST_BRANCH: trunk
- - CCM_MAX_HEAP_SIZE: 1024M
- - CCM_HEAP_NEWSIZE: 256M
- - REPEATED_UTEST_TARGET: testsome
- - REPEATED_UTEST_CLASS: null
- - REPEATED_UTEST_METHODS: null
- - REPEATED_UTEST_COUNT: 100
- - REPEATED_UTEST_STOP_ON_FAILURE: false
- - REPEATED_DTEST_NAME: null
- - REPEATED_DTEST_VNODES: false
- - REPEATED_DTEST_COUNT: 100
- - REPEATED_DTEST_STOP_ON_FAILURE: false
- - JAVA_HOME: /usr/lib/jvm/java-8-openjdk-amd64
- - JDK_HOME: /usr/lib/jvm/java-8-openjdk-amd64
- Java 8 cqlsh Distributed Tests python 2 with VNodes [j8_cqlsh-dtests-py2-with-vnodes]
- - ANT_HOME: /usr/share/ant
- - JAVA11_HOME: /usr/lib/jvm/java-11-openjdk-amd64
- - JAVA8_HOME: /usr/lib/jvm/java-8-openjdk-amd64
- - LANG: en_US.UTF-8
- - KEEP_TEST_DIR: true
- - DEFAULT_DIR: /home/cassandra/cassandra-dtest
- - PYTHONIOENCODING: utf-8
- - PYTHONUNBUFFERED: true
- - CASS_DRIVER_NO_EXTENSIONS: true
- - CASS_DRIVER_NO_CYTHON: true
- - CASSANDRA_SKIP_SYNC: true
- - DTEST_REPO: git://github.com/apache/cassandra-dtest.git
- - DTEST_BRANCH: trunk
- - CCM_MAX_HEAP_SIZE: 1024M
- - CCM_HEAP_NEWSIZE: 256M
- - REPEATED_UTEST_TARGET: testsome
- - REPEATED_UTEST_CLASS: null
- - REPEATED_UTEST_METHODS: null
- - REPEATED_UTEST_COUNT: 100
- - REPEATED_UTEST_STOP_ON_FAILURE: false
- - REPEATED_DTEST_NAME: null
- - REPEATED_DTEST_VNODES: false
- - REPEATED_DTEST_COUNT: 100
- - REPEATED_DTEST_STOP_ON_FAILURE: false
- - JAVA_HOME: /usr/lib/jvm/java-8-openjdk-amd64
- - JDK_HOME: /usr/lib/jvm/java-8-openjdk-amd64
- Java 11 Unit Tests [j11_unit_tests]
- - ANT_HOME: /usr/share/ant
- - JAVA11_HOME: /usr/lib/jvm/java-11-openjdk-amd64
- - JAVA8_HOME: /usr/lib/jvm/java-8-openjdk-amd64
- - LANG: en_US.UTF-8
- - KEEP_TEST_DIR: true
- - DEFAULT_DIR: /home/cassandra/cassandra-dtest
- - PYTHONIOENCODING: utf-8
- - PYTHONUNBUFFERED: true
- - CASS_DRIVER_NO_EXTENSIONS: true
- - CASS_DRIVER_NO_CYTHON: true
- - CASSANDRA_SKIP_SYNC: true
- - DTEST_REPO: git://github.com/apache/cassandra-dtest.git
- - DTEST_BRANCH: trunk
- - CCM_MAX_HEAP_SIZE: 1024M
- - CCM_HEAP_NEWSIZE: 256M
- - REPEATED_UTEST_TARGET: testsome
- - REPEATED_UTEST_CLASS: null
- - REPEATED_UTEST_METHODS: null
- - REPEATED_UTEST_COUNT: 100
- - REPEATED_UTEST_STOP_ON_FAILURE: false
- - REPEATED_DTEST_NAME: null
- - REPEATED_DTEST_VNODES: false
- - REPEATED_DTEST_COUNT: 100
- - REPEATED_DTEST_STOP_ON_FAILURE: false
- - JAVA_HOME: /usr/lib/jvm/java-11-openjdk-amd64
- - JDK_HOME: /usr/lib/jvm/java-11-openjdk-amd64
- - CASSANDRA_USE_JDK11: true
- Java 8 cqlsh Distributed Tests python 3.8 NO VNodes [j8_cqlsh-dtests-py38-no-vnodes]
- [etc]
Image, Resource Class, and Executor Choice
A number of executors and resource sizes are defined, including default options. Passing these along as part of a parameter allows for extra flexibility.
We recommend using our API to help get insights. A blog post here gives details on how it can be used, and may allow you some suggestions on how to focus your efforts.
Reference for resources available can be found here. __
Parallelism
There are several jobs where parallelism is enabled and split by timings used in line with our best practice recommendations. Consider increasing parallelism as needed.
Caching Strategies
There are no save_cache or restore_cache steps in your configuration. We normally suggest using a three-layer approach to restoring cache if your team decides to implement this feature. Here is an example from our documentation:
- restore_cache:
keys:
- gem-cache-v1-{{ arch }}{{ .Branch }}{{ checksum "Gemfile.lock" }}
- gem-cache-v1-{{ arch }}-{{ .Branch }}
- gem-cache-v1
This means that it attempts to load the cache in this order:
- A cache that matches the architecture, branch, and specific checksum of the .lock file.
- A cache that matches the architecture and branch, but not the specific checksum of the .lock file.
- A cache that simply matches the name defined, being the most generic option.
This allows branches to have their own caching options more efficiently and may improve performance further. In your case, this would require a bit of reworking. For a simpler but still efficient solution, I would recommend something like the following:
- restore_cache:
Keys:
- v3-npm-checksum "yarn.lock"
- v3-npm
This allows at least some caching to be present even if the yarn file, as a general example, has changed.
Adding an option to restore cache from a more “generic” source might improve performance. Otherwise, everything is handled as per our best practices.
Secrets Management
One way to limit access to running certain workflows is to create contexts that are only available to certain teams. See https://circleci.com/docs/2.0/contexts/#restricting-a-context
If a user does not have access to the context used in the workflow, the workflow will not run.
Orbs
The Cassandra project lends itself to setting up custom Orbs nicely given the amount of scripting within the config itself.
You could aim to have every workflow essentially simply call on an appropriate Orb, this is a very effective way of making use of them. Right now your workflows essentially follow the same steps with only a few minor variations that can be resolved with parameters.
Suggestions for orb creation:
- Determine which unit tests to run
- Log environment information
- Run unit tests
Please see if these work for your config:
- Shallow Clone Orb consideration if needed
- Python Orb and creating your own orb if needed for configure virtualenv and python Dependencies
A list of publicly accessible orbs can be found here. An introduction to the differences between private and public orbs, as well as instructions on how to author an orb, can be found at this link.
Reusable Config Elements
I highly suggest looking into Reusable Config Elements and Orbs based on the repetitive nature of the Cassandra project config as it stands today.
More on using reusable configuration elements can be found here.
Insights
Internal DCE Dashboard - Alert prototyping
Tracking Test Results via UI
Please add the step[ store_test_results|https://circleci.com/docs/2.0/language-python/#upload-and-store-test-results] to your config.yml file
Please reference the Insights homepage on the CircleCI Cloud application once this is completed:
https://app.circleci.com/pipelines/github/apache/cassandra
Useful Links
- Project Optimisations: https://circleci.com/docs/2.0/optimizations/#section=projects
- Monitor and optimize your CI/CD pipeline with Insights from CircleCI
- 2020 State of DevOps Report: Presented by Puppet and CircleCI
Summary
2021 Dynamic Configuration for Cassandra
Our recommendation is to move into using the `setup` workflow key: Dynamic config lets you maintain multiple config.yml files in a single code repository, to selectively identify and execute your primary config.yml files. This feature offers a wide range of powerful capabilities to easily specify and execute a variety of dynamic pipeline workloads. Please note: when using dynamic configuration, a “continue” job from the continuation orb must be called at the end of the setup workflow.
Automated Triggering of Builds
See V2 API
Can trigger build through a cron job.
Tracking Test Results
Insights API
DataStax Build Times: 15-20 minutes builds, also one an hour
Goal was moving it from an hour to 15-20 minutes
Resources
Orb: CircleCI Developer Hub - circleci/continuation
Discuss Post: Dynamic config via setup workflows now available to all users
Documentation: Pipeline Variables