Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: backlog
    • Fix Version/s: backlog
    • Component/s: build, tests
    • Labels:
      None

      Description

      UPDATE: The Bigtop CI went down. Let's build a new one, and let's make it immutable.

      OLD CONTENTS: Time to clean up the CI!

      • 24 inactive projects
      • 30+ broken projects
      • Several sandbox projects ("test-cluster", "SmokeClusterOld")

      PROPOSAL: should we consider creating a new Jenkins build master server altogether, which is minimal, and keep http://bigtop01.cloudera.org:8080/ around in the interim?

      1) Add a build hook to github.com/apache/bigtop

      2) Create the following jobs:

      • Runs the smoke tests against a working yarn+hdfs cluster
      • Builds rpms and debs and publishes to maven (on a slave)
      • Builds bigpetstore and publishes to maven (on a slave)
      • Spins up a vagrant bigtop instance and runs a simple smoke test on it (calculate pi), again on a slave.
      • Packages BoxGrinder appliances (not sure if we are still maintaining these, or moving to Packer?)
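Since the proposed trigger depends on noticing new commits, the jobs above could be kicked off by a simple poll of the repo head rather than a push hook. A hedged Python sketch of that decision logic; the job names and `jobs_to_trigger` helper are hypothetical stand-ins, not an existing Bigtop script:

```python
# Hedged sketch: decide which CI jobs to kick off when the branch head
# moves. The job names below paraphrase the proposal; the SHA values
# would come from polling the repo (not shown here).

JOBS = [
    "bigtop-smoke-tests",    # smokes against a running yarn+hdfs cluster
    "bigtop-packages",       # rpms and debs, built on a slave
    "bigpetstore-publish",   # bigpetstore artifact, built on a slave
    "bigtop-vagrant-smoke",  # vagrant cluster + pi job, on a slave
]

def jobs_to_trigger(last_seen_sha, current_sha):
    """Return the jobs to kick off when the branch head has moved."""
    if current_sha == last_seen_sha:
        return []  # nothing new; the poll is a no-op
    return list(JOBS)
```
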

        Issue Links

          Activity

          Konstantin Boudnik added a comment -

          Great idea, Jay! A few comments:

          1. the CI is living on Cloudera's donated AWS VMs. In order to change the layout of the systems we need to have full access to the console. This step is the MUST have: if project PMC can't control the donated infrastructure of the project - we'll have to start looking for different options. Unfortunately. Pinging Cloudera's folks to help us with this: guys, could you please help us to clarify the situation with the resource on your side? Who's controlling it/is in charge of the AWS? Please reach out to me privately - if needed - and I will communicate with these people directly.

          Contingent on the above:

          1. github.com/apache/bigtop is a mirror of the ASF git repo. Could you please elaborate on what the role of the hook would be?
          2. smoke test jobs should be there. They might not be obviously named, as Roman Shaposhnik has an interesting opinion about naming things
          3. linux packages cannot be published to Maven, sorry
          4. does bigpetstore produce any java libraries that are used by projects downstream of Bigtop?
          5. if vagrant helps to speed up cluster spin-up - it'd be great!
          6. boxgrinder doesn't seem to be supported well. And looks like it doesn't support Ubuntu. I guess we'd better look for other options. I am not an expert on these sort of things, though.

          BTW, I think the order of the build-packages and run-smoke-tests steps has to be swapped

          Sean Mackrory added a comment -

          Pinging Cloudera's folks to help us with this: guys, could you please help us to clarify the situation with the resource on your side? Who's controlling it/is in charge of the AWS?

          Ping acknowledged. I myself have never used or needed access, but I'll try to find out who would make such a decision and get back to you. I think getting full access to the AWS console is not likely to happen, so if you do consider that a "MUST have", diversifying the donated infrastructure can't be bad for the project.

          Is a new Jenkins machine really needed? It seems to me we really just need a concerted effort to find out which jobs nobody cares about and delete them, and rename the rest so it's clear why they are needed.

          Konstantin Boudnik added a comment -

          access to the AWS console is not likely to happen, so if you do consider that a "MUST have", diversifying the donated infrastructure can't be bad for the project

          Perhaps, I am misusing the AWS terminology. What I meant to say is that we need to have a way of managing these instances somehow. Perhaps, it isn't a full access to the console but rather a user name that has access to only this group of instances?
          Also, I don't think we need a new master for CI - the current one might be fine as well

          Roman Shaposhnik added a comment -

          the CI is living on Cloudera's donated AWS VMs. In order to change the layout of the systems we need to have full access to the console. This step is the MUST have: if project PMC can't control the donated infrastructure of the project - we'll have to start looking for different options. Unfortunately. Pinging Cloudera's folks to help us with this: guys, could you please help us to clarify the situation with the resource on your side? Who's controlling it/is in charge of the AWS? Please reach out to me privately - if needed - and I will communicate with these people directly.

          As a PMC we do have access to AWS credentials assigned to the Bigtop account that Cloudera is paying for (thanks for that by the way – this is really appreciated!). We've got the ability to issue AWS API calls, but we can't log in to the AWS management console (which I never felt was a problem anyway).

          Now, I think what would be a really useful discussion to have is how we should be utilizing these resources and whether having the current approach of running dedicated VMs for build is ideal or whether we should consolidate and use containers to quickly spin up build environments. At any rate, I'd appreciate if we could hash those out on ML, since JIRA could be a bit inconvenient for those types of open-ended discussions.

          jay vyas added a comment -

          1) forget git repo hooks - we can poll.
          2) sure, the smokes may be there. We can copy the job over to a new server if we create one. But remember we're about to redo a lot of the smokes very shortly, starting small with BIGTOP-1221 and maybe even more.
          3) did I say Maven? Sorry, I meant S3.
          4) yes, just a jar file. Right now it's published here: bigpetstore.s3.amazonaws.com/maven/BigPetStorePro. We can move it to the org.apache.bigtop repo just like itest.
          5) the vagrant cluster tests will validate the repo, the vagrant recipes, the puppet recipes, the smoke tests, AND the individual components, all in one shot. They will speed up the release/dev/test cycle in that sense.
          6) packer is the way things are headed, but I'm no expert on VM builders either. I think Sean Mackrory or [~evans ye] might be able to help in that department if they have any spare cycles. Till then, I think replacing the ISOs with vagrant boxes is a pretty good step forward.

          Konstantin Boudnik added a comment -

          As a PMC we do have access to AWS credentials assigned to the Bigtop account that Cloudera is paying for (thanks for that by the way – this is really appreciated!). We've got the ability to issue AWS API calls, but we can't log in to the AWS management console (which I never felt was a problem anyway).

          Good - that's all we need, as far as the control is concerned. Let's move this discussion to the ML - JIRA isn't the best place for this kind of thing. Once we agree on the steps, we'll update or open new tickets.

          jay vyas added a comment -

          Dockerizing a build environment will allow us to remove some jenkins jobs and also to match exactly the build machines to the ones we develop on.

          jay vyas added a comment -

          These three JIRAs are all related to improving basic Bigtop infrastructure, so I'm linking them

          jay vyas added a comment - edited

          Roman Shaposhnik This is a radical (maybe impossible) suggestion, but could we gut the jenkins projects and replace them with 4 top-level jenkins projects? I don't know if that's possible, but it sure would be awesome to see a simple mapping of bigtop source code to the bigtop jenkins server.

           - bigtop-repos
             - centos
               |_ pig
               |_ hive
               ...
           - bigtop-deploy
             - vm
               |_ boxgrinder
               ...
           - bigtop-blueprints
             - bigpetstore
               |_ bps.jar
          

          Then the build output would be way easier to make sense of.
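The one-job-per-top-level-directory idea above amounts to a lookup from a source path to its owning Jenkins project. A hedged Python sketch, with the project names taken from the tree sketched above; the `top_level_job` helper is hypothetical:

```python
# Hedged sketch of the proposed one-job-per-top-level-directory mapping.
# The project names come from the directory tree sketched above.

TOP_LEVEL_JOBS = ("bigtop-repos", "bigtop-deploy", "bigtop-blueprints")

def top_level_job(source_path):
    """Map a path in the source tree to the single Jenkins job owning it."""
    root = source_path.strip("/").split("/", 1)[0]
    if root in TOP_LEVEL_JOBS:
        return root
    raise ValueError(f"no top-level job owns {source_path!r}")
```

With this shape, the Jenkins dashboard would show exactly one entry per top-level directory, which is the readability win jay describes.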

          Konstantin Boudnik added a comment -

          it might work. And if these need to be interconnected, then the jobflow plugin can be used to create more complex flows.

          Roman Shaposhnik added a comment -

          jay vyas sorry, looks like I missed this JIRA. Could you please elaborate on what it is that you're proposing? 4 different top-level Jenkins servers? (like bigtop0[1-4].cloudera.org type of thing?)

          jay vyas added a comment -

          Simple: I'm suggesting that for every top-level directory we have one, and only one, jenkins project.

          That way, when someone sees the bigtop build dashboard, they can easily map source code to the actual projects.

          Right now we have like 5 top-level bigtop projects and 500 jenkins projects, so it's very hard to tell the artifacts apart.

          Roman Shaposhnik added a comment -

          There are quite a few jenkins projects right now (some of them need to be cleaned up, btw), but the reason for that is to make our builds
          as asynchronous as possible. IOW, do you really want to rebuild all of Bigtop's packages when just the Pig packaging changes? I can't quite figure out how you would achieve that level of independence if you condense everything into a single top-level job corresponding to bigtop-packages.

          P.S. Perhaps the easiest way to show what you really mean is to stand up a dummy instance in the cloud and start prototyping?
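The independence concern above could in principle be handled inside a single top-level job by rebuilding only the components whose packaging actually changed. A hedged Python sketch under the assumption that packaging paths look like `bigtop-packages/src/<format>/<component>/...`; the helper name is hypothetical:

```python
# Hedged sketch: derive the set of components to rebuild from a list of
# changed paths, so one bigtop-packages job need not rebuild everything.
# Assumes paths shaped like bigtop-packages/src/<format>/<component>/...

def components_to_rebuild(changed_paths):
    """Return component names whose packaging files were touched."""
    components = set()
    for path in changed_paths:
        parts = path.split("/")
        if len(parts) >= 4 and parts[0] == "bigtop-packages" and parts[1] == "src":
            components.add(parts[3])
    return components
```

So a Pig-only packaging change would map to rebuilding just Pig, which is the level of independence Roman is asking about.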

          jay vyas added a comment - - edited

          Yup! I'm in the process of prototyping this, actually (which is why I've been silent on patches/reviews for a while - I'm redoing my bigtop testing infrastructure). I will let you folks know when I have something useful.

          And re: the pig question: maybe bigtop-packaging can be a top-level project with "smart" subprojects that are lazy. After all, moving to gradle should allow us that sort of flexibility.

          Roman Shaposhnik added a comment -

          jay vyas would love to take a look at the prototype and provide my comments!

          jay vyas added a comment -

          Update: I've done nothing on this, with the exception of a local prototype build-deploy system - I've been quite busy. Should we put our heads together and possibly make this the #1 priority for the 0.8.0 release? I've put an email on the mailing list to that effect.

          jay vyas added a comment -

          I think this is what an immutable CI for bigtop would look like, right? I could be oversimplifying, let me know.

          • build docker images for each distro
          • commit code to BIGTOP-1286 which pulls and builds bigtop on each distro
          • provision a single m1.xlarge
          • run the docker provisioners nightly on the above
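The four steps above could be driven by one nightly script on the single m1.xlarge. A hedged Python sketch that only assembles the per-distro docker commands rather than running them; the image tags, distro list, and in-container build script are hypothetical illustrations, not existing Bigtop artifacts:

```python
# Hedged sketch of the nightly driver for an immutable CI: one throwaway
# container per distro, each pulling and building bigtop from scratch.
# Image tags, distro names, and build-all.sh are hypothetical.

DISTROS = ["centos-6", "fedora-20", "ubuntu-14.04"]

def nightly_commands(repo="https://github.com/apache/bigtop"):
    """Assemble (but do not run) one docker command per distro image."""
    cmds = []
    for distro in DISTROS:
        cmds.append(
            f"docker run --rm bigtop/build:{distro} "
            f"bash -c 'git clone {repo} && cd bigtop && ./build-all.sh'"
        )
    return cmds
```

Because each container is discarded after the run (`--rm`) and rebuilt from a fixed image, no build-host state survives between nights, which is what makes the CI immutable.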

            People

             • Assignee:
               Unassigned
             • Reporter:
               jay vyas
             • Votes:
               0
             • Watchers:
               5
