WHIRR-385

Implement support for using nodeless, masterless Puppet to provision and run scripts

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.7.0
    • Fix Version/s: 0.7.0
    • Component/s: new service
    • Labels: None

      Description

      As a user of Whirr, I'd like to be able to use puppet scripts (manifests, modules) from within Whirr to set up machines and clusters, because there are a lot of OS-neutral capabilities and a large number of actively maintained scripts which I could benefit from.

      1. WHIRR-385.patch
        31 kB
        Chad Metcalf
      2. WHIRR-385.patch
        51 kB
        Adrian Cole
      3. WHIRR-385.patch
        80 kB
        Alex Heneveld
      4. WHIRR-385.patch
        93 kB
        Alex Heneveld
      5. WHIRR-385.patch
        93 kB
        Alex Heneveld
      6. WHIRR-385.patch
        94 kB
        Adrian Cole
      7. WHIRR-385-results-aws-ec2.txt
        13 kB
        Adrian Cole
      8. WHIRR-385-results-cloudservers-us.txt
        12 kB
        Adrian Cole
      9. WHIRR-385.patch
        96 kB
        Adrian Cole
      10. WHIRR-385.patch
        96 kB
        Adrian Cole
      11. WHIRR-385.patch
        100 kB
        Adrian Cole
      12. WHIRR-385.patch
        111 kB
        Adrian Cole


          Activity

          Alex Heneveld added a comment -

          This issue is being split off from WHIRR-255 – which is focussed on with-master puppetry. It is expected work will follow the Chef-Solo approach being taken in WHIRR-49.

          Adrian Cole added a comment -

          Here's an overview of nodeless masterless: http://semicomplete.com/presentations/puppet-at-loggly/puppet-at-loggly.pdf.html
          Alex Heneveld added a comment -

          Code is in progress at https://github.com/ahgittin/whirr/commits/WHIRR-385-puppet-masterless
          So far it just installs puppet, based on WHIRR-49; this seems to work, and the resulting VMs have puppet installed.
          (There are tests in the 385-...-WIP branch, which I'll merge in once they are working!)

          Chad Metcalf added a comment - edited

          Alex, I'll put up a patch later today that I've been working on. I took a pretty direct path from the WHIRR-49 work and came up with three classes. Module, Resource, and Manifest. Unlike Chef there isn't always a single Puppet repository to rule them all. So Module allows a checkout of a given Puppet Module at a specific git url which goes into a directory on the MODULEPATH. You can instantiate whatever modules you need. Resource is a class to puppet apply an arbitrary Puppet resource. Resource is used by Manifest to apply a class in a module (includes support for parameterized classes).

          I think this approach is certainly a flavor of masterless. But it might be better called "puppet for whirr", as it's the whirr way of doing things, just using puppet under the covers instead of bash.

          A puppet masterless approach, like the one described in the puppet@loggly presentation, is different. It wouldn't require a bunch of Statements and Java. Rather, given a ball of puppet modules and a default node definition, it would include functionality based on facts and functions (including perhaps a whirr function like Jordan's has_role). At that point it would be generic. The whirr client instantiates puppet node(s) with a tarball/git url and the rest is done.

          Chad Metcalf added a comment -

          Needs the cc patch in WHIRR-49. The patch has unit tests but needs an integration test.

          Adrian Cole added a comment -

          OK, here are the latest thoughts.

          So, with Chad's patch, we have the right syntax to bootstrap (ex. install modules) and configure (apply manifests). We'll still need to work out firewall rules, but this is a great start.

          Alex and I deliberated for 90 minutes just now and have an idea of how this will flow:

          • user will specify the module or specific class they want applied, encoded into the role-name:
            ex. puppet:ntp+puppet:postgresql::server
          • user can specify properties that correspond to the module source:
            ex.
            puppet:ntp.module-dir=/path/to
            puppet:postgresql.module-dir=/path/to
           • user can specify attribs by prefixing the attribs with the puppet manifest, where the first part (module or class) is stripped off
            ex.
            ntp.server=10.1.1.1
            postgresql::server.user=foo

          create a PuppetClusterActionHandler that does the following:
          – manifests = all services that are prefixed with puppet: and remove the puppet: prefix
          – modules = unique set of manifests trimming ::.*
          – beforeBootstrap install modules along with puppet + deps
          – beforeConfigure apply manifests w/ attribs parsed from properties prefixed by manifest name

          concerns:

          • if a manifest has a dot in it, it breaks syntax for grabbing attribs from properties
          • TODO firewall rules
          • TODO service start order

          nice to haves:

          • way to specify modules can come from classpath, file on disk, http, etc.
          • modules stored in source repository so that we can offline test

          Thoughts on the approach?
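          The parsing rules above (strip the puppet: prefix to get manifests, trim ::.* to get modules) can be sketched as follows. PuppetRoleParser and its method names are hypothetical illustrations, not Whirr API:

          ```java
          import java.util.Arrays;
          import java.util.LinkedHashSet;
          import java.util.List;
          import java.util.Set;
          import java.util.stream.Collectors;

          public class PuppetRoleParser {

              // All roles prefixed with "puppet:", with that prefix removed,
              // e.g. "puppet:postgresql::server" -> "postgresql::server".
              public static List<String> manifests(List<String> roles) {
                  return roles.stream()
                          .filter(r -> r.startsWith("puppet:"))
                          .map(r -> r.substring("puppet:".length()))
                          .collect(Collectors.toList());
              }

              // Unique set of modules: the manifests with "::.*" trimmed,
              // e.g. "postgresql::server" -> "postgresql".
              public static Set<String> modules(List<String> manifests) {
                  return manifests.stream()
                          .map(m -> m.replaceAll("::.*", ""))
                          .collect(Collectors.toCollection(LinkedHashSet::new));
              }
          }
          ```

          So puppet:ntp+puppet:postgresql::server yields manifests [ntp, postgresql::server] and modules {ntp, postgresql}, which is what beforeBootstrap would install and beforeConfigure would apply.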

          Chad Metcalf added a comment - edited

          As it stands this functionality allows whirr developers to build services with the standard approach of composing statements. A few Modules and a Manifest or two and you've got a working service.

          In order to do this with configs I think you've mostly got something that will work.

          Getting the modules onto the box is a challenge. I limited it to the git clone case. If you are going to support other mechanisms you'll have to figure that out. It all has to go into a default location so puppet will pick it up as part of the default modulepath or you'll also have to configure puppet to expand its possible modulepath locations.

          Firewall rules, I think, would have to be their own properties. You aren't going to do it in puppet: you'd have to have a specific module or define and add that to all the modules. That's a pain and limits the usefulness to whirr-aware modules. Better to just have

          puppet:httpd.tcp-ports=80,443
          puppet:foobar.udp-ports=5300,6000-6200
          puppet:foobar.icmp=true
          

          As for service start order: unless you make a single node definition, puppet can't handle service ordering. The current implementation basically includes classes stand-alone.

          If you want full puppet functionality you need to do real masterless puppet.

          In which case you've got a single git repo/tarball/whatever with all the modules you need with all the dependencies specified. And a single node definition which switches on roles.

          node default {
          
            if has_role("ntp") {
              include ntp
            } 
          
            if has_role("postgresql::server") {
              include postgresql::server  
            }
          }
          

          That way the postgresql::server service can specify that it depends on ntp... or whatever the case may be. Note that this is much
          different from the approach outlined in the attached patch, which mirrors the Chef functionality in WHIRR-49.

          In this case has_role would be a custom function that would read a file dropped by Whirr and decide if the node had the role or not.

          Possible /etc/whirr/roles
          ntp
          postgresql::server
          

          That would be easy, and it would be distributed and installed by install_puppet.sh so that it would always be

          1. The right version
          2. Available
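          A real has_role would be a Puppet custom function written in Ruby, but the lookup it performs is just a membership check against the roles file Whirr drops on the node. A minimal illustration of that logic (WhirrRoles and its method are hypothetical names):

          ```java
          import java.io.IOException;
          import java.nio.file.Files;
          import java.nio.file.Path;

          public class WhirrRoles {

              // Illustrative only: checks whether a role appears in the
              // newline-separated roles file (e.g. /etc/whirr/roles)
              // that Whirr would drop onto the node.
              public static boolean hasRole(Path rolesFile, String role) throws IOException {
                  return Files.readAllLines(rolesFile).stream()
                          .map(String::trim)
                          .anyMatch(role::equals);
              }
          }
          ```

          With the example file above, hasRole(rolesFile, "postgresql::server") is true and hasRole(rolesFile, "mysql") is false, which is exactly the switch the node default definition needs.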
          Garrett Honeycutt added a comment -

          With regards to Adrian's concern of "if a manifest has a dot in it, it breaks syntax for grabbing attribs from properties", this should not happen as periods are not allowed in the naming scheme, as per http://projects.puppetlabs.com/projects/puppet/wiki/Allowed_Characters_in_Identifier_Names

          Adrian Cole added a comment -

          if we use the firewall rule syntax/rule from WHIRR-371, we get around the need to replicate logic here.

          Alex Heneveld added a comment -

          Summary

          To recap, and make sure that I'm clear and people are happy with the proposal, we are looking for something that supports:

          (1) The Simple Case. We want to be able to define, easily in Whirr, what we want Puppet to actually do. We've taken to calling these "subroles" and denoting them with the syntax puppet:subrole in the Whirr config properties (if there is a better word or syntax, please suggest). These map on to a specific Puppet class or manifest which we want to be able to specify in the config properties. (And crucially, they don't require a developer to define a new ClusterActionHandler.)

          In the simple case, say running an ntp module's init.pp manifest, we could just say:

          whirr.instance-templates=1 puppet:ntp
          

          (2) Manifests and Attributes. Often it is a specific manifest inside a module we wish to run, and occasionally a manifest not in a module, and often there are attributes we wish to define. To do this, the proposal is to require the subrole to be defined in the config properties using the syntax puppet:subrole=module. For example, the following defines an NTP server and client on a single machine (leaving aside the question of whether this is a useful configuration; it's a useful illustration).

          puppet:LocalNtpServer=ntp::server
          LocalNtpServer.ntp_sources="0.pool.ntp.org 1.pool.ntp.org"
          puppet:ntp=ntp
          ntp.ntpServerList=127.0.0.1
          whirr.instance-templates=1 puppet:LocalNtpServer+puppet:ntp
          

          (The suggestion is that it is required to define puppet:ntp=ntp even though it seems redundant, so that the parser can know that ntp is a defined puppet subrole (without depending on the instance-templates) and will be able to list it e.g. in the CLI tool.)

          (3) Module Paths. The final big question is where should the puppet code come from. A suggestion is that by default, it comes from a property, say puppet.modulepath which defaults to /etc/puppetlabs/puppet/module, and can be overridden. However it may be further desirable to define per-subrole paths. Suggested syntax for this is:

          puppet:LocalNtpServer.modulepath=/path/to/modules:/another/path/to/modules
          
          • Can we assume that for a given module we need only copy that module, or should we copy all modules at a given path (i.e. siblings of the intended module)?
          • Should we support declaring a path to a specific module – e.g. /path/to/modules/ntp (if we have to copy all modules at a path this could be a significant time-saver; and I don't think we want to go down the route of ant file selectors!) ?
          • Presumably URLs would be nice, e.g. http:// as well as file://, and possibly git:// and/or classpath:// (or whirr:// for classpath) – thoughts? If so, would we make ; the path separator char (ouch)? Or allow xxx.modulepath.somepart and xxx.modulepath.another and stitch these together? Or some other strategy?

          High-Level Questions

          Does this sound like the right approach?

          Is there useful functionality that we are ruling out with this approach and syntax? (For example, are we binding too tightly to modules? How would we point to a site.pp file for example?)

          Specific Comments (to previous comments)

          My hunch is that the node-name approach is less relevant in Whirr, since it's typically creating new machines for which hostname/DNS is often ignored. But it would be an option to have a convenience in Whirr for specifying the hostname then using a site.pp/node-centric approach. Thoughts on value & priority of this?

          The has_roles approach is an interesting alternative. Is it more in keeping with what whirr- and puppet- users would expect, and simpler for them? Or is it too constraining?

          Re firewalls, it seems to me it is okay to expect they are either specified in your puppet code, or in your whirr config (a la WHIRR-371), and so there isn't a need for a general syntax for specifying puppet firewall rules inside whirr as part of this task. (If the puppet code looks at attributes to set up firewalls, you could of course still specify those specific attributes.) Is this sufficient? Is there any worry that we'd get painful conflicts where both try to configure them?

          Adrian Cole added a comment -

          -1 on section (2) Manifests and Attributes.

          As it is, we can have ntp::server and ntp::client properties added with no conflicts using straight-up commons properties syntax, based on notes from 14/Sep/11 01:17; this is the same way hadoop properties work today. This was further confirmed by Garrett's comment that there is no case where trimming the module or class from the properties key will conflict (because they cannot have dots in them).

          Currently, there's no requirement that needs symbolic role names and property key indirection outside of being able to enumerate available puppet roles from the cli. Even if we did use this approach, it requires manual configuration in order to set it up the mappings. This is something I perceive as unnecessary complexity, and a case where the cure is worse than the problem.

          If we wish to support multiple instances of the same manifest (ex. ntp::server) with separate properties, let's wait for someone to have this requirement, as it would be a general case (ex. two zookeepers, each with a separate properties bundle), and it would be its own top-level jira.

          -1 on section (3) Module Paths:

          I don't see the reason to add multiple module paths per module at this point. Starting simple and adding advanced functionality when asked is a way to keep our code and complexity from getting out of control. Like above, there's nothing in this issue that requires url-handling per artifact (in this case resource containing our module). If we had this requirement, it would be something all other services would use and need its own top-level jira.

          What we need to do is have a mechanism to push a module path physically to the host, and at beforeBootstrap reconfigure puppet's default module paths to include them. Currently, we have mechanisms used in whirr to push tarballs, and Chad has mentioned a git option (which he presumably uses). Let's investigate re-using these approaches before adding more tangential code to the jira.

          WRT: Chad's comment mentioning we need to setup a switch statement in order to implement nodeless masterless:

          +1. Looks like the default node implementation detail is easy for us to build during beforeBootstrap from the puppet manifests passed in from the user (ex. Statements.appendFile(path, somethingThatBuildsThis)). I'm not sure where the best place to put this file is.

          Chad Metcalf added a comment -

          If you are building a node definition to apply after all the requested puppet resources have been identified I'd put it in /etc/puppet/manifests/site.pp. So to be clear the user specifies:

          puppet:ntp+puppet:postgresql::server
          ntp.server=10.1.1.1
          postgresql::server.user=foo
          puppet:ntp.module-dir=/path/to/ntp.tgz
          puppet:postgresql.module-dir=git@github.com/someone/puppet-postgresql.git
          

          Whirr then copies up the ntp.tgz with whatever pre-existing mechanism and drops it into /etc/puppet/modules/ntp. Whirr instantiates a Module class which does a git clone of the postgres module and drops it into /etc/puppet/modules/postgres.

          Whirr then generates the site.pp which is going to look like this:

          node default {
            class { 'ntp':
              server => '10.1.1.1',
            }
            class { 'postgres::server':
              user => 'foo',
            }   
          }
          

          We don't need the switch statement here since the user is specifying everything.

          Now that the site.pp is on local disk, the modules are in the modulepath, and all is right with the world. The system can puppet apply /etc/puppet/manifests/site.pp and that's it.
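          On the Whirr side, generating that site.pp is a simple string build over the class-to-attributes mapping parsed from the properties. A sketch under assumed names (SitePpGenerator is hypothetical, not part of the patch):

          ```java
          import java.util.LinkedHashMap;
          import java.util.Map;

          public class SitePpGenerator {

              // Hypothetical sketch: render a default-node site.pp from a
              // (class name -> attribute map) parsed out of the Whirr properties,
              // e.g. "ntp" -> {server=10.1.1.1}, "postgres::server" -> {user=foo}.
              public static String render(Map<String, Map<String, String>> classes) {
                  StringBuilder sb = new StringBuilder("node default {\n");
                  classes.forEach((cls, attribs) -> {
                      sb.append("  class { '").append(cls).append("':\n");
                      attribs.forEach((k, v) ->
                              sb.append("    ").append(k).append(" => '").append(v).append("',\n"));
                      sb.append("  }\n");
                  });
                  sb.append("}\n");
                  return sb.toString();
              }
          }
          ```

          A LinkedHashMap keeps the classes in the order the user listed the roles, so the generated site.pp matches the example above.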

          Alex Heneveld added a comment -

          Re (2): I'm cool with skipping the symbolic role names if we don't need to list, or set up with different properties, multiple instances of the same service. They can always be added in without an API change. Relevant example config then looks like:

          ntp::server.ntp_sources="0.pool.ntp.org 1.pool.ntp.org"
          ntp.ntpServerList=127.0.0.1
          whirr.instance-templates=1 puppet:ntp::server+puppet:ntp
          

          I've started looking at supporting the : syntax inside roles; more on this to follow.

          Re (3) again I'm all for simplifying so long as result remains useful (and can be extended). One directory or git, not multiple paths until we need (and we'll deal with the multiple properties or path-sep char/syntax then).

          roles-based node default switch: I like this, the more I understand it. Any objections to going this route? I presume the node def switch statement just lives in manifests/site.pp. I will have a look at how to pass in roles / read them from a file.

          If we go this route would it be reasonable to have a single puppet.modulepath property from the config? Or do you need to be able to specify a different path per-module?

          Alex Heneveld added a comment -

          Mid-air collision. +1 to Chad's approach (custom site.pp and per-module path).

          Adrian Cole added a comment -

          patch including Chad and Alex's work: https://github.com/ahgittin/whirr/commits/WHIRR-385-colon-subroles
          Adrian Cole added a comment -

          Latest patch takes in work from Chad and Alex, and refactors it a bit. Here are the highlights:

          1. new type: ClusterActionHandlerFactory. This creates instances of ClusterActionHandler by role name and is registered via the service loader.
          – this gets rid of the hacks otherwise needed to pass the current role being processed during ClusterActions

          2. PuppetClusterActionHandlerFactory makes PuppetClusterActionHandler instances for roles prefixed by puppet:, stripping the prefix.

          This should be a good foundation to complete the rest of this jira.
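The dispatch described in (1) and (2) boils down to recognizing and stripping the puppet: role prefix. A minimal sketch of that idea (class and method names here are illustrative, not the actual Whirr API):

```java
// Sketch of the role-name dispatch described above: a factory claims any
// role starting with "puppet:" and hands the handler the bare role name.
// Illustrative only; the real factory types live in the Whirr patch.
public class PuppetRolePrefix {
    static final String PREFIX = "puppet:";

    /** True if the puppet factory should build a handler for this role. */
    public static boolean accepts(String role) {
        return role.startsWith(PREFIX);
    }

    /** "puppet:postgresql::server" -> "postgresql::server" */
    public static String stripPrefix(String role) {
        if (!accepts(role)) {
            throw new IllegalArgumentException("not a puppet role: " + role);
        }
        return role.substring(PREFIX.length());
    }

    public static void main(String[] args) {
        System.out.println(stripPrefix("puppet:ntp"));                // ntp
        System.out.println(stripPrefix("puppet:postgresql::server")); // postgresql::server
    }
}
```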

          Alex Heneveld added a comment -

          Slight hitch re syntax. Apache Configuration (like Java properties) doesn't like colons in keys unless they are backslashed: it interprets them as = , i.e. postgresql::server.user=foo sets the key postgresql equal to :server.user=foo .

          Some options for alternatives:

          (1)  postgresql\:\:server.user=foo         # escape them
          
          (2)  postgresql..server.user=foo           # replace all : with .
          
          (3)  postgresql.user=foo                   # attributes are per-module only, not per-manifest
          
          (4)  puppet.mypostgres=postgresql::server  # symbolic role names
          (4)  mypostgres.user=foo
          
          (5)  postgresql::server.user=foo           # stick with this syntax, using a custom parser
          

          What's nicest?

          Adrian Cole added a comment -

          Rats.. thanks for the summary, Alex.

          I guess there's also this, right?
          (6) postgresql.server.user=foo # don't use double :: or .. syntax and normalize to dots

          I'm more in favor of this, if it can work or option 5
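If option (6) can work, the dotted key has to be mapped back to Puppet's :: class separator before generating the manifest. A hypothetical sketch of that normalization (names are made up, and it assumes the attribute name itself contains no dots, which is the main ambiguity of this option):

```java
// Hypothetical sketch of option (6): dotted property keys are normalized
// back to Puppet's "::" class separator. Everything before the final
// segment is the class; the final segment is the attribute. Assumes the
// attribute name contains no dots.
public class KeyNormalizer {
    /** "postgresql.server.user" -> { "postgresql::server", "user" } */
    public static String[] classAndAttribute(String key) {
        int lastDot = key.lastIndexOf('.');
        String klass = key.substring(0, lastDot).replace(".", "::");
        String attribute = key.substring(lastDot + 1);
        return new String[] { klass, attribute };
    }

    public static void main(String[] args) {
        String[] parts = classAndAttribute("postgresql.server.user");
        System.out.println(parts[0] + " / " + parts[1]); // postgresql::server / user
    }
}
```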

          Garrett Honeycutt added a comment -

          Option 6 looks good. Wanted to warn about option (3), as a module will often have multiple classes, such as postgresql, postgresql::server, postgresql::client, etc. You would want to keep the ability to have data scoped per class and not shared amongst them all.

          Chad Metcalf added a comment - - edited

          Another snag, though easier to fix: currently we don't support arrays for these. For example, using the ntp module...

          ntp.servers=10.0.0.1
          

          Works expanding to:

          class {'ntp':
            servers => '10.0.0.1',
          }
          

          But there is no way to give an array.

          ntp.servers=[10.0.0.1, 10.0.0.2]
          #or
          ntp.servers=10.0.0.1, 10.0.0.2
          

          Expands to:

          class {'ntp':
            servers => '[10.0.0.1, 10.0.0.2]',
          }
          
          #or
          
          class {'ntp':
            servers => '10.0.0.1, 10.0.0.2',
          }
          

          We probably should either:

          1. take csv and convert to ["foo", "bar", "baz"] though it makes single element arrays annoying
          2. do no quoting and the user specifies the exact string to be used

          I'm for 2 since it's easier, more general, and probably less error-prone.
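Option 2 amounts to pasting the configured value into the class declaration verbatim, leaving quoting entirely to the user. A sketch of that rendering (illustrative names, not the actual Whirr code):

```java
// Sketch of option 2 (no quoting): each configured value is emitted into
// the class declaration exactly as given, so users can supply quoted
// strings, arrays, booleans, or resource references themselves.
import java.util.LinkedHashMap;
import java.util.Map;

public class ClassRenderer {
    public static String render(String klass, Map<String, String> attrs) {
        StringBuilder sb = new StringBuilder("class {'" + klass + "':\n");
        for (Map.Entry<String, String> e : attrs.entrySet()) {
            // no quoting added: the value is whatever the user wrote
            sb.append("  ").append(e.getKey()).append(" => ").append(e.getValue()).append(",\n");
        }
        return sb.append("}\n").toString();
    }

    public static void main(String[] args) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("servers", "['10.0.0.1', '10.0.0.2']");
        attrs.put("autoupdate", "true");
        System.out.print(render("ntp", attrs));
    }
}
```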

          Alex Heneveld added a comment -

          Thanks guys. Going with (6) module.manifestpart.manifestpart.manifestfilename.attribute=value.

          Re values, +1 for (2) no quoting. This supports not only arrays but also booleans and tokens like running, present, and potentially resources like File['/etc/someconfig']. (Suspect people will forget to quote their strings though... any way to warn them short of the puppet stderr in the logs?)

          Chad Metcalf added a comment - - edited

          I don't think so without validation. Like you pointed out:

          foo => '10.0.0.1',
          bar => ['a', 'b', 'c'],
          ensure => present,
          require => File['/etc/motd'],
          

          All very different. Let's just put nice, scary, brightly colored, large-font warnings in the docs.

          Also, does Whirr support local VMs? If so, spinning up a VirtualBox test environment would be nice.

          Alex Heneveld added a comment -

          Thanks. Attached is a mostly-working rough cut. The patch supports the syntax discussed above: it installs puppet, grabs module code from git as per the config, and applies a call to the manifest class with the attributes set from config. It has an integration test for nginx showing how things work. However--

          • Suspect we do need module paths for this to be useful, or some magic dependency resolution. nginx depends on the stdlib module (which I cheated on and installed as part of install_puppet.sh, but that's not a scalable solution). Is this a priority?
          • nginx looks like it isn't happy on the AWS linux boxes. I hit the following error which I think isn't to do with this issue but might be to do with nginx. Or maybe nginx is just a bad candidate for AWS. The error is: err: /Stage[main]/Nginx::Service/Exec[rebuild-nginx-vhosts]: Failed to call refresh: /bin/cat /tmp/nginx.d/* > /etc/nginx/conf.d/vhost_autogen.conf returned 1 instead of one of [0] at /etc/puppet/modules/nginx/manifests/service.pp:21

          I'll maybe try ntp instead. Apart from the above, next steps are:

          • transferring local directories for the modulepath (in addition to git)
          • creating and running a single site.pp for all subroles (instead of a single apply per subrole; this requires detecting when we've done all the subroles, which is a bit fun..)

          Re vbox – don't think it's available quite yet, but possibly a BYON config is more efficient. I'm just using AWS, which is okay: not great, but a lot faster since switching to your install_puppet script with --no-rdoc --no-ri

          Chad Metcalf added a comment -

          So the reason I decoupled Module from Manifest was to allow multiple modules to be cloned to satisfy dependencies, which would allow:

          Module mod_stdlib = new Module("stdlib", "git@github.com/somewhere/puppet-stdlib.git");
          Module mod_nginx = new Module("nginx", "git@github.com/somewhere/puppet-nginx.git");
          Manifest nginx = new Manifest("nginx");
          

          My suggestion would be to run through the properties collecting a list of roles and a set of modules. Once you've parsed the properties file you've got the list of roles and the set of unique modules to checkout. So something like:

          puppet:ntp.module-dir=git@github.com/someone/puppet-ntp.git
          puppet:nginx.module-dir=git@github.com/someone/puppet-stdlib.git,git@github.com/someone/puppet-nginx.git
          puppet:foobar.module-dir=git@github.com/someone/puppet-stdlib.git,git@github.com/someone/puppet-foobar.git
          

          Will result in four modules getting cloned. Or better yet just add:

          puppet:ntp.module-dir=git@github.com/someone/puppet-ntp.git
          puppet:nginx.module-dir=git@github.com/someone/puppet-nginx.git
          puppet:stdlib.module-dir=git@github.com/someone/puppet-stdlib.git
          puppet:foobar.module-dir=git@github.com/someone/puppet-foobar.git
          

          module-dir maps to Module anyway. If Module expands to be able to deal with tarballs then you don't have to do anything special. Roles map to Manifest and module-dir map to Modules.

          RE: nginx you're probably right.
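The property scan suggested above could be sketched as follows (hypothetical names; the real patch may differ):

```java
// Sketch of collecting the unique set of modules to fetch, by scanning
// properties of the form puppet:<name>.module-dir, as described above.
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class ModuleScan {
    /** Returns module name -> source for every puppet:<name>.module-dir key. */
    public static Map<String, String> modulesFrom(Map<String, String> props) {
        Map<String, String> modules = new TreeMap<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            String key = e.getKey();
            if (key.startsWith("puppet:") && key.endsWith(".module-dir")) {
                String name = key.substring("puppet:".length(),
                                            key.length() - ".module-dir".length());
                modules.put(name, e.getValue());
            }
        }
        return modules;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("puppet:ntp.module-dir", "git@github.com/someone/puppet-ntp.git");
        props.put("puppet:stdlib.module-dir", "git@github.com/someone/puppet-stdlib.git");
        props.put("whirr.cluster-name", "puppettest");
        System.out.println(modulesFrom(props).keySet()); // [ntp, stdlib]
    }
}
```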

          Alex Heneveld added a comment -

          So you're suggesting installing all the modules for which properties are defined onto every machine, rather than trying to be cleverer about installing just the modules needed by the roles (manifests) on particular machines? That sounds good.

          It is easy to support comma-separated syntax – but I think there is a problem with it:

          puppet:nginx.module-dir=git@github.com/someone/puppet-stdlib.git,git@github.com/someone/puppet-nginx.git
          whirr.instance-templates=1 puppet:nginx
          

          Because we don't know which subdirectory of modules to put the code in (i.e. how to know that puppet-stdlib.git needs to go in a directory stdlib; maybe we could inspect the downloaded code to find that out, but it seems easier to disallow comma-separated paths). So I'll go for installing all the modules defined with puppet:*.module-dir. If that's too wasteful then let's attack the multi-entry path problem. IOW:

          puppet:nginx.module-dir=git@github.com/someone/puppet-nginx.git
          puppet:stdlib.module-dir=git@github.com/someone/puppet-stdlib.git
          whirr.instance-templates=1 puppet:nginx
          

          (Re nginx, the root cause was a missing puppet group. Adding useradd puppet to the puppet install script makes it all happy.)

          Alex Heneveld added a comment -

          Attaching latest patch – this fixes the remaining issues mentioned above. Specifically:

          • you can specify module-dir as a git URL, or as a tgz file or directory on the local file system (the latter two are shipped to destinations as blobs)
          • all modules given by puppet.*.module-dir properties are installed on every machine that has any puppet role, at bootstrap of the first such role
          • the manifests are collected into a single site.pp which is run as part of the configure phase of the last puppet role

          (The behaviour of the last two items can be reverted to previous description with runStrategy flags in the PuppetClusterActionHandler.)
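The single combined site.pp could be generated along these lines (a simplified sketch using include; parameterized classes would emit class { ... } declarations with their attributes instead):

```java
// Sketch of collapsing all puppet subroles on an instance into one
// generated site.pp applied once, as described above. Illustrative only.
import java.util.Arrays;
import java.util.List;

public class SitePp {
    public static String generate(List<String> classes) {
        StringBuilder sb = new StringBuilder("node default {\n");
        for (String klass : classes) {
            sb.append("  include ").append(klass).append("\n");
        }
        return sb.append("}\n").toString();
    }

    public static void main(String[] args) {
        // roles puppet:apache + puppet:ntp collapse into a single apply
        System.out.print(generate(Arrays.asList("apache", "ntp")));
    }
}
```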

          Alex Heneveld added a comment -

          Previous patch was malformed (missed out --noprefix); this one fixes it.

          Adrian Cole added a comment -

          new patch that employs firewall rules for nginx and tests that the web server came up

          Adrian Cole added a comment -

          OK. I added a new patch that tests to ensure that the nginx web server came up by specifying firewall rules.
          tested on aws-ec2 and now testing on cloudservers-us

          next steps:
          add a recipe for the cli to install nginx via puppet

          Adrian Cole added a comment -

          test results passing on aws-ec2 and cloudservers-us

          Adrian Cole added a comment -

          revised to include recipe

          Adrian Cole added a comment -

          I can get the patch to apply and start via the CLI. The socket is open, but curl seems to hang. Can someone who knows puppet try to figure out what might be wrong with the setup? Maybe we are missing attribs.

          1. build with patch
            patch -p0 --input /path/to/WHIRR-385.patch
            mvn clean install -Dmaven.javadoc.skip=true eclipse:eclipse
            mvn package assembly:assembly -Dmaven.javadoc.skip=true
          2. set up EC2 creds
            export AWS_ACCESS_KEY_ID=...
            export AWS_SECRET_ACCESS_KEY=...
          3. run cluster
            bin/whirr launch-cluster --config recipes/puppet-nginx-ec2.properties
          4. test cluster
            for I in `awk '{print $3}' ~/.whirr/nginx/instances`; do nc $I 80 && echo socket $I 80 ok || echo socket $I 80 timeout; curl --connect-timeout 1 --max-time 1 --silent http://$I >/dev/null 2>&1 && echo http://$I ok || echo http://$I timeout; done
          5. shut down cluster
            bin/whirr destroy-cluster --config recipes/puppet-nginx-ec2.properties

          Note that when the cluster is running, you can ssh into an instance as your default user to look around.

          Adrian Cole added a comment - - edited

          latest patch with recipe that includes firewall rules (which is the patch tested in the above comment)

          Alex Heneveld added a comment -

          Looks like the puppet nginx wants vhost info from somewhere. The following puppet recipe makes and serves a simple site:

          include nginx
          
          file { '/tmp/www': ensure => directory }
          file { '/tmp/www/mysite': ensure => directory }
          file {'/tmp/www/mysite/index.html':
            ensure => present,
            content => "<i>hello world</i>",
          }
          
          nginx::resource::vhost { 'mysite': www_root => '/tmp/www/mysite' }
          

          Deploying a puppet:nginx role isn't very useful, it seems. We want to write our own module, say mysite, with the above in mysite/manifests/init.pp (or get a more interesting site from somewhere...); then we can ask Whirr for a puppet:mysite role (remembering to set the module-dir for stdlib, nginx, and mysite).

          Is this a more realistic scenario?

          Adrian Cole added a comment -

          I agree we need a more realistic scenario. should we add something into the services/puppet tree that does the above, perhaps?

          Chad Metcalf added a comment -

          How about a brain dead httpd and ntp example?

          whirr.cluster-name=puppettest
          
          puppet.httpd.module-dir=git://github.com/metcalfc/puppet-httpd.git
          puppet.ntp.module-dir=git://github.com/puppetlabs/puppetlabs-ntp.git
          puppet.ntp.servers=[ '0.pool.ntp.org' ]
          puppet.ntp.autoupdate=true
          
          whirr.instance-templates=1 puppet:httpd+puppet:ntp
          whirr.firewall.rules=80
          
          whirr.provider=${sys:whirr.test.provider}
          whirr.identity=${sys:whirr.test.identity}
          whirr.credential=${sys:whirr.test.credential}
          whirr.hardware-min-ram=1024  
          

          curl localhost:80 should work, and testing ntp could be as easy as grepping /etc/ntp.conf for "pool.ntp.org".

          Adrian Cole added a comment -

          Thanks, Chad. This didn't work right off the bat; I found errors parsing the generated puppet site.pp

          ex.
          root@ip-10-100-203-137:/tmp/jclouds-script-1316279615688# cat stderr.log
          stdin: is not a tty
          Puppet::Parser::AST::Resource failed with error ArgumentError: Could not find declared class httpd at /etc/puppet/manifests/site.pp:3 on node ip-10-100-203-137.ec2.internal

          I think we need some more unit tests wrt generating these things. I'll clean up the patch and throw some down.

          Adrian Cole added a comment -

          I'm also going to clean up some things, as troubleshooting is currently too difficult:

          1. Module currently forks a process and has some rather interesting code in order to support recursive dir -> blobstore. I don't think we should make that a requirement of this issue, and if we did, I'd use a Java lib like ShrinkWrap. Supporting tgz + git is plenty, IMHO.

          2. puppet.X.module-dir isn't really a valid key, since it can be a git URL or a tgz. If anything, it is module-src, or better just module. Going to change to puppet.X.module unless someone feels strongly against.

          3. many of the classes that generate Statements don't have test cases, e.g. Module.

          4. PuppetClusterActionHandler has too many responsibilities and very little test coverage. It is hard to follow what is going on, and testing via Amazon is clunky.

          5. there are a bunch of references to sudo explicitly inside statements. Going to remove these, as the normal workflow ensures root is used.

          Give me about 2 hours and I should have the above corrected.
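For point 3, such tests could assert directly on the rendered statement text, with no cloud provider involved. A hypothetical shape (the gitInstallStatement helper is a toy stand-in, not the real Module API):

```java
// Hypothetical shape of a unit test for a statement-generating class like
// Module: build the statement and assert on its rendered script, without
// launching anything. Names are illustrative, not the actual Whirr API.
public class ModuleStatementTest {
    /** Toy stand-in for Module's git case: render the shell to install it. */
    static String gitInstallStatement(String name, String gitUrl) {
        return "git clone " + gitUrl + " /etc/puppet/modules/" + name;
    }

    public static void main(String[] args) {
        String s = gitInstallStatement("ntp",
                "git://github.com/puppetlabs/puppetlabs-ntp.git");
        // the rendered statement is deterministic, so we can assert on it
        if (!s.equals("git clone git://github.com/puppetlabs/puppetlabs-ntp.git /etc/puppet/modules/ntp")) {
            throw new AssertionError("unexpected statement: " + s);
        }
        System.out.println("ok");
    }
}
```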

          Chad Metcalf added a comment -

          Turns out half the problem is that I can't remember how to use the module I wrote... Here is a working properties file:

          # Change the cluster name here
          whirr.cluster-name=puppettest
          
          # Change the number of machines in the cluster here
          whirr.instance-templates=1 puppet:apache+puppet:ntp
          whirr.firewall-rules=80
          
          # For EC2 set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables.
          whirr.provider=aws-ec2
          whirr.identity=${env:AWS_ACCESS_KEY_ID}
          whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
          
          # with puppet, you need to specify where to pull the modules from
          puppet.apache.module-dir=git://github.com/metcalfc/puppet-apache.git
          puppet.ntp.module-dir=git://github.com/puppetlabs/puppetlabs-ntp.git
          
          # with puppet, you can set class parameters
          ntp.servers=[ '0.pool.ntp.org', 'clock.redhat.com' ]
          ntp.autoupdate=true
          
          # By default use the user system SSH keys. Override them here.
          whirr.private-key-file=${sys:user.home}/.ssh/id_whirr
          whirr.public-key-file=${whirr.private-key-file}.pub
          
          # Expert: specify alternate module sources instead of from git
          #puppet.nginx.module-dir=/tmp/git-puppetlabs-nginx.tgz
          #puppet.nginx.module-dir=/tmp/git-puppetlabs-nginx    
          

          When I wrote my module I used apache not httpd. I've renamed the github repo to reflect this and updated the above properties file.

          Also we found that on Ubuntu during the install you'll need to run apt-get update.

          Adrian Cole added a comment -

          refactored patch that includes fix from Chad

          Adrian Cole added a comment -

          Thanks for the working configuration, Chad! I updated the Puppet install to run apt-get update first.

          I refactored heavily so that the methods make as much sense as possible, with much less cruft. I also added test cases for nearly all of them.

          note: syntax for module location is puppet.X.module

          I tested that the following works now, out of the box (OOB)!

          • run cluster
            bin/whirr launch-cluster --config recipes/puppet-http-ec2.properties
          • test cluster
            for I in `awk '{print $3}' ~/.whirr/puppettest/instances`; do nc $I 80 && echo socket $I 80 ok || echo socket $I 80 timeout; curl --connect-timeout 1 --max-time 1 --silent http://$I >/dev/null 2>&1 && echo http://$I ok || echo http://$I timeout; done

          • shutdown cluster
            bin/whirr destroy-cluster --config recipes/puppet-http-ec2.properties
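          The test one-liner above is dense; here is a more readable equivalent as a sketch. It assumes (as the one-liner does) that the third whitespace-separated column of ~/.whirr/&lt;cluster&gt;/instances is each node's public address.

          ```shell
          # probe_cluster FILE: check port 80 on every node listed in a Whirr
          # instances file (column 3 assumed to be the public address).
          probe_cluster() {
            awk '{print $3}' "$1" | while read -r host; do
              if nc -z -w 1 "$host" 80 2>/dev/null; then
                echo "socket $host 80 ok"
              else
                echo "socket $host 80 timeout"
              fi
            done
          }

          # Example: probe_cluster ~/.whirr/puppettest/instances
          ```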

          NEXT STEPS:

          • go ahead and re-test, as all is working now.
          • one last hairy method should be cleaned up: CreateSitePpAndApplyRoles.getManifestForClusterSpecAndRole
          Chad Metcalf added a comment -

          We talked about the syntax for module branches being:

          puppet.ntp.module=git@github.com/puppetlabs/puppet-ntp.git
          puppet.ntp.module.branch=notmaster
          

          Did that make it in? I looked over the patch and didn't find it on a first glance.

          Adrian Cole added a comment -

          will finish the patch now. thx

          Adrian Cole added a comment -

          patch includes git branch support
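
          For reference, the resulting pair of keys would look like this (a sketch combining the git:// URL form used elsewhere in this issue with Chad's proposed branch key; the notmaster branch name is illustrative):

          ```
          puppet.ntp.module=git://github.com/puppetlabs/puppetlabs-ntp.git
          puppet.ntp.module.branch=notmaster
          ```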

          Adrian Cole added a comment -

          module.branch functionality is in. I'm out of time to refactor CreateSitePpAndApplyRoles.getManifestForClusterSpecAndRole.

          Let's go ahead and review for inclusion

          Andrei Savu added a comment -

          +1 by looking at the code (I haven't tried running the tests). I think it's safe to commit if the tests are passing both on cloudservers and ec2 and we can fix any remaining issues in follow-up JIRAs. I'm excited about this change!

          Alex Heneveld added a comment - - edited

          +1 code changes look good to me; nay, they look great. Much better test coverage too, and all passing for me (on EC2).

          Developers' note: checkstyle complains about missing newlines at the end of InstallAllModulesStatementFromProperties and KeyToModuleNameOrNull; and of course you need a recent checkout (one containing the WHIRR-371 firewall patch, or apply it yourself before applying this one).

          Users' note: with this patch, module sources must now come from git or local tgzs; local dirs are no longer supported. (It wasn't a particularly nice implementation, delegating to Runtime.exec to make the tar.gz; if this functionality is desired, please let us know.)
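          If the dropped local-directory support matters to you, a local module can be packed into a tgz by hand and referenced via the tgz path, as in the commented-out nginx example earlier. A minimal sketch (paths and module name are illustrative):

          ```shell
          # pack_module DIR OUT: archive a local Puppet module directory so Whirr
          # can fetch it as a tgz module source,
          # e.g. puppet.nginx.module=/tmp/git-puppetlabs-nginx.tgz
          pack_module() {
            tar -czf "$2" -C "$(dirname "$1")" "$(basename "$1")"
          }

          # Example: pack_module /path/to/puppet-nginx /tmp/git-puppetlabs-nginx.tgz
          ```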

          Adrian Cole added a comment -

          Committed after testing on EC2 and Rackspace.


            People

            • Assignee: Adrian Cole
            • Reporter: Alex Heneveld
            • Votes: 3
            • Watchers: 3

            Dates

            • Created:
            • Updated:
            • Resolved:

            Time Tracking

            • Original Estimate: 168h
            • Remaining Estimate: 168h
            • Time Spent: Not Specified

            Development