Whirr / WHIRR-225

Support locally-supplied scripts

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.4.0
    • Component/s: None
    • Labels:
      None

      Description

      Whirr currently uses runurl to download bootstrap and configuration scripts from a webserver. It should be possible to use scripts supplied from the launch machine (or even from another source).

      1. WHIRR-225.patch
        135 kB
        Tom White
      2. WHIRR-225.patch
        75 kB
        Tom White
      3. WHIRR-225.patch
        55 kB
        Tom White
      4. WHIRR-225.patch
        4 kB
        Tom White

        Activity

        Tom White added a comment -

        We can implement this using the jclouds scriptbuilder classes. For example, if we put each bash function in a file named after the function under src/main/resources/functions, we can call it with:

        addStatement(event, call("function-name"));
        

        It also makes it possible for users to override a function by editing the corresponding file in functions, e.g. functions/function-name.sh. This works because the CLI puts the top-level directory at the front of the classpath, so any functions defined there take precedence over the ones in the JAR.
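
        The classpath precedence described above can be illustrated with a minimal, self-contained sketch. It does not use Whirr or jclouds code: the class name FunctionOverrideSketch, its helper methods, and the function name configure_example are all hypothetical, and two temporary directories stand in for the JAR and for the top-level directory that the CLI puts first on the classpath.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; not part of Whirr or jclouds.
final class FunctionOverrideSketch {

    /** Reads functions/<name>.sh through the loader, or returns null if no such resource exists. */
    static String loadFunction(ClassLoader loader, String name) throws IOException {
        try (InputStream in = loader.getResourceAsStream("functions/" + name + ".sh")) {
            return in == null ? null : new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    /** Writes root/functions/<name>.sh with the given body. */
    static void writeFunction(Path root, String name, String body) throws IOException {
        Path dir = Files.createDirectories(root.resolve("functions"));
        Files.write(dir.resolve(name + ".sh"), body.getBytes(StandardCharsets.UTF_8));
    }

    /** Builds a loader that searches the given existing directories in order; the first match wins. */
    static URLClassLoader loaderFor(Path... roots) throws IOException {
        URL[] urls = new URL[roots.length];
        for (int i = 0; i < roots.length; i++) {
            urls[i] = roots[i].toUri().toURL();
        }
        return new URLClassLoader(urls, null);
    }

    public static void main(String[] args) throws IOException {
        // Stand-ins (assumptions): "bundled" plays the JAR, "local" plays the top-level directory.
        Path bundled = Files.createTempDirectory("bundled-jar");
        Path local = Files.createTempDirectory("top-level");
        writeFunction(bundled, "configure_example", "function configure_example() { echo bundled; }\n");
        writeFunction(local, "configure_example", "function configure_example() { echo local; }\n");

        // The local directory is listed first, so its copy of the function is the one returned.
        try (URLClassLoader loader = loaderFor(local, bundled)) {
            System.out.print(loadFunction(loader, "configure_example"));
        }
    }
}
```

        A function that exists only in the bundled location is still found, since the loader falls through to later classpath entries when an earlier one has no match.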

        If we refactor the existing scripts to be functions like this, then we will no longer have to serve them from a webserver. However, for larger clusters it may be preferable to download files from a webserver, so that the client doesn't have to send a potentially large script to each node in the cluster. Instead it could upload the script to a blobstore, then send a runurl command to each node in the cluster (see https://github.com/jclouds/jclouds/raw/master/demos/test/ComputeAndBlobStoreTogetherHappilyLiveTest.java for one way of doing this). We could do this enhancement as a separate JIRA.

        I've attached a patch which shows how this would look for an example script function. It runs a (possibly user-defined) script in the bootstrap phase of ZooKeeper.

        Thoughts?

        Adrian Cole added a comment -

        I like this approach. BlobStore functionality is an option we can use behind the scenes as necessary. In other words, I'd start by doing everything in Statements and then optimize with BlobStore where needed.

        Lars George added a comment -

        Would this also mean deprecating and eventually removing the runurl keys from each service? To be honest, I see no reason to keep them; if we did, we would need to split the code into two paths, one supporting runurl and the other the scriptbuilder API.

        Tom White added a comment -

        > Would this also mean deprecating and eventually removing the runurl keys from each service?

        Yes, the scripts would be re-written as functions and we would stop hosting them on S3. For larger clusters we would implement the blobstore optimization later.

        Tom White added a comment -

        First go at this. HBase and CDH don't work yet.

        Tom White added a comment -

        This version passes integration tests. It doesn't remove the scripts directory, but we should do this as a part of this issue. There are some Cassandra scripts in scripts/apache/cassandra (nodetool, start, stop, wipe-state) that are not used - do we need to keep them, or should we delete them until we have some way of running them on the cluster?

        Adrian Cole added a comment -

        +1

        tested on
        aws-s3 cassandra,zookeeper,hadoop,hbase
        cloudservers-us cassandra,zookeeper,hadoop

        note hbase config doesn't seem setup for cloudservers, conceding this hasn't anything to do with this issue

        Lars George added a comment -

        +1

        Thanks Tom for doing this.

        @Adrian:

        > note hbase config doesn't seem setup for cloudservers, conceding this hasn't anything to do with this issue

        What do you mean?

        Adrian Cole added a comment -

        I meant that the test properties file has parameters that are only relevant to EC2, e.g. locationid. This shouldn't be specified for Rackspace, and overriding it to nothing is a bit awkward. Thanks for opening WHIRR-233 on this. Again, this isn't anything new or specific to WHIRR-225.

        Andrei Savu added a comment -

        Looks great. I believe that in core/src/main/resources/functions/install_runurl.sh we should rename installRunUrl to install_runurl. Why are we keeping this function around?

        Tom White added a comment -

        > we should rename installRunUrl to install_runurl.

        The patch should do this (I'll make sure it happens at commit time).

        > Why are we keeping this function around?

        We should still allow scripts to use runurl if they wish, so I think we should keep it.

        I'll commit this later today. Thanks for the reviews everyone.

        Tom White added a comment -

        Minor update that removes the scripts (except for the extra Cassandra ones), and reinstates a test teardown I had removed during testing.

        Tom White added a comment -

        I've just committed this.


          People

          • Assignee:
            Tom White
            Reporter:
            Tom White
          • Votes:
            2
            Watchers:
            0
