Hadoop Common / HADOOP-4117

Improve configurability of Hadoop EC2 instances

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.19.0
    • Component/s: contrib/cloud
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Changed scripts to pass initialization script for EC2 instances at boot time (as EC2 user data) rather than embedding initialization information in the EC2 image. This change makes it easy to customize the hadoop-site.xml file for your cluster before launch, by editing the hadoop-ec2-init-remote.sh script, or by setting the environment variable USER_DATA_FILE in hadoop-ec2-env.sh to run a script of your choice.

      Description

      Currently hadoop-site.xml for EC2 instances is stored as a part of the image and only a few properties can be controlled from the user scripts (compression, number of map/reduce tasks). Furthermore, it is not possible to rsync the configuration around the EC2 cluster with the current image, so the only way to customize the hadoop-site.xml file is to rebuild the image, which is time-consuming.

      It would be much better to pass the initialization script for nodes at boot time, so that it is easy to edit the configuration before starting a cluster.
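      As a rough illustration of the idea, a boot-time initialization script could generate hadoop-site.xml itself instead of relying on a copy baked into the image. The sketch below is not the actual hadoop-ec2-init-remote.sh: the variable names MASTER_HOST and CONF_DIR, the host name, and the port numbers are assumptions made for this example, and the file is written to a temporary directory so the snippet can be run anywhere.

```shell
#!/bin/sh
# Hypothetical sketch of a boot-time init script (not the real hadoop-ec2-init-remote.sh).
# MASTER_HOST and CONF_DIR are illustrative names assumed for this example.
MASTER_HOST="${MASTER_HOST:-master.example.internal}"
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# Generate hadoop-site.xml at boot, so changing a property means editing
# this script rather than rebuilding the image. Ports are placeholders.
cat > "$CONF_DIR/hadoop-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://${MASTER_HOST}:50001</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>${MASTER_HOST}:50002</value>
  </property>
</configuration>
EOF

echo "wrote $CONF_DIR/hadoop-site.xml"
```

      Per the release note, such a script would reach the instances either by editing hadoop-ec2-init-remote.sh or by pointing USER_DATA_FILE in hadoop-ec2-env.sh at a script of your choice.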

      1. hadoop-4117-v2.patch
        14 kB
        Tom White
      2. hadoop-4117.patch
        15 kB
        Tom White

        Activity

        Tom White added a comment -

        This patch passes the boot script as user data to the EC2 instance on launch. This makes it easy to change the configuration by editing the boot script. You can change other configuration in the script too. For example, it would be the obvious place to add setup code for HBase (see HBASE-838). I've also changed the script to fix HADOOP-3783.

        Chris K Wensel added a comment -

        ec2-run-user-data is not made executable, and isn't getting run on startup.

        [root@ip-10-251-203-243 init.d]# ls -la ec2-run-user-data
        -rw-r--r-- 1 root root 1763 Sep 16 23:17 ec2-run-user-data

        Tom White added a comment -

        Chris,

        Thanks for pointing that out. I can set the execute bit when the scripts are committed. Apart from that, do the changes look OK?

        Chris K Wensel added a comment -

        Well, it might be safer to have the script chmod the file on the server when it is pushed up to init.d.

        I'll chmod the file locally, test the scripts again tonight, and see whether the mode carries over scp reliably.

        Tom White added a comment -

        I did a local chmod and it worked - I've used these scripts a few times. But here's a new patch where the file's mode is changed on the instance, which should be more reliable (although scp -p would be an alternative).
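        To make the approach discussed above concrete, here is a minimal sketch of setting the mode on the target after the copy and then checking it, rather than trusting the copy to carry the execute bit. It is not taken from the patch: the copy to the instance is simulated with a local temporary directory standing in for init.d, and the INIT_DIR and SCRIPT names are assumptions made for this example. In the real scripts the chmod would presumably run on the instance over ssh.

```shell
#!/bin/sh
# Stand-in for /etc/init.d on the instance (hypothetical; a temp dir keeps
# this sketch runnable without an EC2 host).
INIT_DIR="$(mktemp -d)"
SCRIPT="$INIT_DIR/ec2-run-user-data"

# Simulate a pushed file that arrived without the execute bit.
printf '#!/bin/sh\necho "running user data"\n' > "$SCRIPT"
chmod 644 "$SCRIPT"

# Set the mode explicitly on the target, independent of how the file was copied.
chmod 755 "$SCRIPT"

# Check the mode string, as in the ls output earlier in this thread.
case "$(ls -l "$SCRIPT")" in
  -rwx*) echo "ec2-run-user-data is executable" ;;
  *)     echo "ec2-run-user-data is NOT executable" >&2; exit 1 ;;
esac
```

        The explicit chmod makes the result independent of the copy tool's options and of the local file's mode at the time the scripts are run.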

        Chris K Wensel added a comment -

        +1. Everything booted up this time. Looks great.

        Tom White added a comment -

        I've just committed this.

        Hudson added a comment -

        Integrated in Hadoop-trunk #611 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/611/)

          People

          • Assignee: Tom White
          • Reporter: Tom White