  1. Apache Cassandra
  2. CASSANDRA-17930

Ensure CircleCI and ASF Jenkins CI are aligned

Details

    • Task
    • Status: In Progress
    • Normal
    • Resolution: Unresolved
    • 3.0.x, 3.11.x, 4.0.x, 4.1.x, 5.x
    • CI
    • None
    • Quality Assurance
    • Normal
    • All
    • None

    Description

      As discussed in this thread, the Cassandra community wants CircleCI and ASF CI to be aligned: running the same tests, with the same configurations, and covering all tests.

      Exceptions: packaging.

      A few examples of discrepancies we already noticed:

      • utests_system_keyspace_directory run only in CircleCI - CASSANDRA-17145
      • dtest-large run only in Jenkins
      • simulator tests run only in CircleCI
      • From a quick skim, I don't think these run in CircleCI either: dtest-offheap, dtest-large, dtest-large-novnode
      • packaging is also only tested in CircleCI as far as I recall, but that is deterministic and we will rely on Jenkins for that.

      These are only a few examples off the top of my head. I am sure we will find more. We also need to verify that the way we call those tests is correct and matches in both CIs. (I was looking to solve a similar discrepancy in CASSANDRA-17912.)

      Some info on our test suites is here: https://cassandra.apache.org/_/development/testing.html

      The cassandra-builds repo, which I already referred to, is where our test images and the Jenkins build scripts reside.

      CircleCI info can be found in the readme in the in-tree folder dedicated to Circle CI configuration and scripts: https://github.com/apache/cassandra/tree/trunk/.circleci

      EDIT: More findings:

      • burn tests are missing in CircleCI
      • cqlshlib in 3.11
      • on 4.0+ we also need in-jvm JDK8 with 11 and cqlshlib JDK8 with 11

      EDIT: Java distributed tests running with vnodes should be added to Jenkins

       


          Activity

            jmckenzie Josh McKenzie added a comment -

            How would you feel about splitting this up into 2 tasks - 1 being "get Circle to run all necessary tests to validate a release" and the other being "bring ASF CI into parity with Circle"?

            e.dimitrova Ekaterina Dimitrova added a comment - - edited

            I think they overlap, but we can do the work in subtasks (as small or as big as people feel comfortable with, depending on how much time they can dedicate), with this ticket being the final goal. That way, maybe someone else who is interested can join the effort too?

            There is CASSANDRA-17145, for example.

            e.dimitrova Ekaterina Dimitrova added a comment - - edited

            One idea I have is to first review and sanitize the plan, i.e. what is needed. The actual work can then be done in one, two, or more tasks, as dchenbecker sees fit. But I would leave it to him to decide after he gets acquainted with the environment and the scope of work.

            jmckenzie Josh McKenzie added a comment -

            Either way - no need to really make it cumbersome, I just want to optimize for unblocking our release process based on whatever we land on. If we end up going w/circle I'd advocate for us doing that first, if ASF, that, etc.


            dchenbecker Derek Chen-Becker added a comment -

            Splitting into 2 tasks seems reasonable. I'm still coming up to speed on the whole CI system and how it's configured, but it seems like getting CircleCI to run all necessary tests would be the higher priority, right?

            e.dimitrova Ekaterina Dimitrova added a comment -

            Sounds right to me. It would also be good to check whether we run them the right way and the same way in both CIs, and whether the number of tests run is correct.

            As I mentioned in Slack, I am currently looking into some weird differences between the Python upgrade tests run in Circle CI and Jenkins for 3.0 and 3.11.

            We had more than 10 failures there which turned out to be just noise, and we have some mixed config that I am still hunting down in CASSANDRA-17912.

            Probably the easiest approach for you will be to go suite by suite.

            dchenbecker Derek Chen-Becker added a comment -

            OK, I think I'm at least familiar enough with both the Jenkins and Circle build setup scripts to start with some observations:

            • An ideal outcome (albeit longer-term) would be a single test configuration source that could drive generation of both systems
            • The Circle and Jenkins configurations are in separate repositories, which adds a little complexity to that task
            • In the short term, some auditing automation would be helpful
            • Naming appears to differ for the same set of tests between Circle and Jenkins. However, this may just be me misunderstanding how either DSL works. Some examples:
              • j8 (Circle) vs jdk_1.8_latest (Jenkins)
              • with-vnode (Circle) vs nothing in Jenkins for vnode tests as far as I can tell
              • no-vnodes (Circle) vs novnode (Jenkins)

            Is the naming tied to the CI somehow, or could we revise the naming to align across both? That would at least help with auditing, and I can do it as part of adding some of the missing tests to either side. Given prior discussion I can focus on getting the tests missing in Circle if that makes sense.
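Editorial sketch of the "auditing automation" idea from the comment above. This is a minimal illustration, not something taken from either CI repository: the job lists are hypothetical placeholders, and the normalization rules cover only the three naming differences mentioned in the comment (j8 vs jdk_1.8_latest, with-vnode vs no suffix, no-vnodes vs novnode). A real audit would have to read job names from the actual CircleCI and Jenkins configurations.

```python
import re


def normalize(job_name: str) -> str:
    """Map a CI-specific job name onto a shared, comparable form."""
    name = job_name.lower()
    # Jenkins-style JDK label -> Circle-style label
    name = name.replace("jdk_1.8_latest", "j8")
    # "no-vnodes" (Circle) and "novnode" (Jenkins) -> "novnode"
    name = re.sub(r"no-?vnodes?", "novnode", name)
    # Assumption: a job without a vnode marker runs with vnodes,
    # so an explicit "with-vnode" suffix is dropped.
    name = re.sub(r"[_-]with-vnodes?", "", name)
    return name


def audit(circle_jobs, jenkins_jobs):
    """Return (jobs only in Circle, jobs only in Jenkins) after normalization."""
    circle = {normalize(j) for j in circle_jobs}
    jenkins = {normalize(j) for j in jenkins_jobs}
    return sorted(circle - jenkins), sorted(jenkins - circle)


# Hypothetical job names, for illustration only.
CIRCLE_JOBS = ["j8_dtests_no-vnodes", "j8_dtests_with-vnode", "j8_simulator"]
JENKINS_JOBS = [
    "jdk_1.8_latest_dtests_novnode",
    "jdk_1.8_latest_dtests",
    "jdk_1.8_latest_dtest-large",
]

if __name__ == "__main__":
    only_circle, only_jenkins = audit(CIRCLE_JOBS, JENKINS_JOBS)
    print("Only in Circle: ", only_circle)
    print("Only in Jenkins:", only_jenkins)
```

Normalizing only for the comparison leaves the job names in both CIs untouched, so an audit of this kind would not require any renaming.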
            e.dimitrova Ekaterina Dimitrova added a comment - - edited
            • An ideal outcome (albeit longer-term) would be a single test configuration source that could drive generation of both systems

            If you mean one config for both: they are different software with different configs. That said, they already use the same testing images, split tests (CircleCI has its own way, but with a similar outcome), point to the same targets, and run tests with the same parameters. So they are equal under the hood. Why this ticket, then? Because over time not all suites were added everywhere, and sometimes people made changes to one without realizing that the other one needs them too. We need to fill that gap. I don't see how we can easily manage the two from one place; it would also be a source of bugs. Now we at least verify the two against each other, and we can make incremental changes.

            • The Circle and Jenkins configurations are in separate repositories, which adds a little complexity to that task

            Check previous answer  

            • In the short term, some auditing automation would be helpful

            Do you have anything particular in mind?

            Naming appears to differ for the same set of tests between Circle and Jenkins

            I think those not marked as novnode run with vnodes, but I agree it might be confusing or not immediately obvious to someone new.

            I guess we can do something about that, but in a separate ticket. While you may think it is "just" a small name change, speaking from experience, it is never "just" that and leads to a chain of other changes. Small incremental changes are always better when it comes to CI and infra.

            Is the naming tied to the CI somehow, or could we revise the naming to align across both? That would at least help with auditing, and I can do it as part of adding some of the missing tests to either side. Given prior discussion I can focus on getting the tests missing in Circle if that makes sense

            I'd say please check the tests by looking at the ant targets and job names, for example, and do not change the naming at this point.

            I guess now that you know what vnode/no vnode looks like in each CI, and since j8 and j11 are easy to recognize, that should not be a problem. But please do let me know if I am missing something or if something I pointed to is not clear.

            Thanks

            CC mck for awareness  


            dchenbecker Derek Chen-Becker added a comment -

            I think there's material evidence (e.g. this ticket) that the current approach has problems. In the short term, I think we need to just manually turn the crank to fix up what we can find. What I mean by a long-term outcome driven by a single config is that it would be ideal to have a (simple) way to define a test in one place, and be able to generate the configurations for CircleCI, Jenkins, and whatever other CI system we decide to use from that canonical source. I'm not saying that this ticket is where this happens, I'm just saying a system where you rely on good intentions to ensure parity is statistically unlikely to maintain that parity over time.

            As for naming, I'm fine taking small incremental steps to fix the names, but I do think they need to be fixed. I don't see any value in having the names not be the same between the two systems. If changing the name of a test doesn't otherwise change or break its behavior, do you have an objection?
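Editorial sketch of the "define a test in one place" idea from the comment above. It is not a proposal from the thread and does not reflect how either CI is actually configured. `SuiteSpec` and both naming functions are hypothetical, the `jdk_<N>_latest` pattern for JDKs other than 8 is an assumption, and only job names are derived here. A real generator would also have to emit the CircleCI YAML and the Jenkins job definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuiteSpec:
    """Hypothetical canonical description of one test suite variant."""
    name: str     # e.g. "dtest"
    jdk: int      # e.g. 8 or 11
    vnodes: bool  # True if the suite runs with vnodes


def circle_job_name(spec: SuiteSpec) -> str:
    # Circle-style naming as described in the thread: j8, with-vnode / no-vnodes
    vnode_part = "with-vnode" if spec.vnodes else "no-vnodes"
    return f"j{spec.jdk}_{spec.name}_{vnode_part}"


def jenkins_job_name(spec: SuiteSpec) -> str:
    # Jenkins-style naming as described in the thread: jdk_1.8_latest, and
    # "novnode" only when vnodes are off. The label for other JDKs is assumed.
    jdk_part = "jdk_1.8_latest" if spec.jdk == 8 else f"jdk_{spec.jdk}_latest"
    suffix = "" if spec.vnodes else "_novnode"
    return f"{jdk_part}_{spec.name}{suffix}"


# One canonical list drives both naming schemes, so neither side can drift.
SUITES = [
    SuiteSpec("dtest", 8, vnodes=True),
    SuiteSpec("dtest", 8, vnodes=False),
]

if __name__ == "__main__":
    for spec in SUITES:
        print(circle_job_name(spec), "<->", jenkins_job_name(spec))
```

Because both names come from the same `SuiteSpec`, adding a suite to the list would add it to both systems at once, which is the parity property the comment is asking for.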

            e.dimitrova Ekaterina Dimitrova added a comment -

            I just linked CASSANDRA-17671, which I just found in my list; I opened it long ago but never got to it. I will unassign it, as I don't think I will be able to work on it soon, in case you or anyone else has cycles to deal with it. With that said, I will be off until 18th October; I will be happy to get back to these problems when I am back.

            What I mean by a long-term driven by a single config is it would be ideal to have a (simple) way to define a test in one place, and be able to generate the configurations for CircleCI, Jenkins, and whatever other CI system we decide to use from that canonical source. I'm not saying that this ticket is where this happens, I'm just saying a system where you rely on good intentions to ensure parity is statistically unlikely to maintain that parity over time.

            Happy to see a written proposal if you are willing to work on such an implementation after we meet the short-term goals here. I am sure there are areas for improvement and that every setup evolves over time, but it also needs someone with the time to sit down and work hard on it. So, with that said, I am happy to see you are interested in this area.

            If changing the name of a test doesn't otherwise change or break its behavior, do you have an objection?

            I personally do not have any objections to aligning names where it makes sense. My request was just for such an improvement to be in a separate ticket. Thanks for looking into that.
            jmckenzie Josh McKenzie added a comment -

            CASSANDRA-17939 has some follow-up tickets that might intersect w/what we're looking at here: see my last comment.

            May end up needing a CI epic to roll all this up under; seem to be quite a few moving parts.

            e.dimitrova Ekaterina Dimitrova added a comment - - edited

            May end up needing a CI epic to roll all this up under; seem to be quite a few moving parts.

            Might not be a bad idea, so we can have a consolidated view and visibility; we just need people to work on all this after that. Hopefully one epic will get people's attention and more people will join the effort. There is plenty of small and big stuff, with room for everyone to chime in around our CI and infra.


            mck Michael Semb Wever added a comment -

            The Circle and Jenkins configurations are in separate repositories, which adds a little complexity to that task

            It would be great to start to unify the scripts and configurations by a) bringing the cassandra-builds/build-scripts/ files one by one to cassandra/.build/ and b) having CircleCI use them too. There are some challenges here to do with the different environments, but if we start at the lowest level of build scripts I believe there is low-hanging fruit for standardisation.

            Having the build scripts in .build/ helps us with other overhead, like docker images and JDK version support across different branches.
            jmckenzie Josh McKenzie added a comment -

            My plan is once we have the epic + all tickets, then I can start making noise w/the biweekly and we chat about what's going on on dev ML and slack as well, get folks involved.

            Not a bad place for new contributors to get involved either fwiw.


            e.dimitrova Ekaterina Dimitrova added a comment -

            While looking into CASSANDRA-17987, I also noticed we will need a ticket to run the burn tests as well. I will open a ticket before I forget. Also cqlshlib in 3.11. On 4.0+ we also need in-jvm JDK8 with 11 and cqlshlib JDK8 with 11. I will add those to the ticket description.

            e.dimitrova Ekaterina Dimitrova added a comment -

            Not a bad place for new contributors to get involved either fwiw.

            Not at all and it has a huge impact on the project.
            e.dimitrova Ekaterina Dimitrova added a comment - - edited

            Reassigned the ticket after a conversation with dchenbecker. He will be looking into the improvements proposal he is preparing and I will finish the rest of the work here around CircleCI.

            Besides the already linked tickets:
            CASSANDRA-17989, CASSANDRA-17671, CASSANDRA-17987, CASSANDRA-17912 and CASSANDRA-17950, I did a check on what other tests we miss:

            • Burn tests
            • Large Python DTests (with and without vnodes), missing on all branches.
            • CQLSHLIB with JDK11.
            • JVM distributed tests on J8/J11.

            I will open a ticket for that last part in a bit and link it here.

            I also identified a minor issue on our testing webpage.

            The tests missing in Jenkins are the simulator tests and the keyspaces one (there is already a ticket).

            EDIT: CASSANDRA-18001 opened for the rest of the CircleCI tests and  CASSANDRA-18003 was opened for the simulator tests to be added to Jenkins

             


            e.dimitrova Ekaterina Dimitrova added a comment -

            mck and brandon.williams both pointed out that Jenkins does not have jobs for the different Python tests as CircleCI does. I just opened a ticket for that too: CASSANDRA-18101

            We also definitely need to parameterize the config better, but the goal of the current tickets was to bring all missing jobs in and unblock 4.1. Rewriting our strategy (switching to free and paid config and getting rid of the patch files) and improvements should be part of CASSANDRA-18011, CASSANDRA-18012 and CASSANDRA-17600.

            e.dimitrova Ekaterina Dimitrova added a comment -

            Also, I will unassign this ticket to signal that we need someone to take over on Jenkins. I will finish the few things we left for post-4.1 for CircleCI, and then I will have to step away from CI matters for a bit. If anyone has cycles, please feel free to take over Jenkins matters.

            bereng Berenguer Blasi added a comment -

            Adding a note as per e.dimitrova's suggestion to make sure there is parity on 'oa' tests. See CASSANDRA-18301 and CASSANDRA-14227 for context.

            People

              Unassigned
              e.dimitrova Ekaterina Dimitrova
              Votes: 0
              Watchers: 7


                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 3.5h