Hadoop Common: HADOOP-2410

Make EC2 cluster nodes more independent of each other

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.16.1
    • Fix Version/s: 0.17.0
    • Component/s: contrib/cloud
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed
    • Release Note:
      The command "hadoop-ec2 run" has been replaced by "hadoop-ec2 launch-cluster <group> <number of instances>", and "hadoop-ec2 start-hadoop" has been removed since Hadoop is started on instance start up. See http://wiki.apache.org/hadoop/AmazonEC2 for details.

      Description

      The cluster start-up scripts currently wait for each node to start before appointing a master (to run the namenode and jobtracker), copying private keys to all the nodes, and writing the master's private IP address to the hadoop-site.xml file (which is then copied to the slaves via rsync). Only once all of this is done is Hadoop started on the cluster (from the master). This can fail if any node fails to come up, which can happen since EC2 doesn't guarantee you get a cluster of the size you ask for (I've seen this happen).

      The process would be more robust if each node was told the address of the master as user metadata and then started its own daemons. This is complicated by the fact that the public DNS alias of the master resolves to a public IP address so cannot be used by EC2 nodes (see http://docs.amazonwebservices.com/AWSEC2/2007-08-29/DeveloperGuide/instance-addressing.html). Instead we need to use a trick (http://developer.amazonwebservices.com/connect/message.jspa?messageID=71126#71126) to find the private IP, and what's more we need to attempt to resolve the private IP in a loop until it is available since the DNS will only be set up after the master has started.

      This change will also mean the private key doesn't need to be copied to each node, which can be slow and has dubious security. Configuration can be handled using the mechanism described in HADOOP-2409.
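      The per-node start-up flow proposed above (master address passed as user metadata, each node polling DNS until the master's record appears) could be sketched roughly as follows. This is an illustration, not the actual scripts: the helper name, retry counts, and key-less design are assumptions; the metadata URL is the standard EC2 instance-metadata endpoint.

```shell
# Sketch of per-node boot logic: wait for the master's DNS record,
# since DNS is only set up after the master instance has started.
resolve_with_retry() {
  host=$1
  tries=${2:-60}
  i=0
  while [ "$i" -lt "$tries" ]; do
    # getent consults the resolver; print the first address found.
    ip=$(getent hosts "$host" | awk '{ print $1; exit }')
    if [ -n "$ip" ]; then
      echo "$ip"
      return 0
    fi
    sleep 5
    i=$((i + 1))
  done
  return 1
}

# On boot, each node would fetch the master's hostname from its EC2
# user metadata and wait for it to resolve before starting daemons:
#   MASTER_HOST=$(curl -s http://169.254.169.254/latest/user-data)
#   MASTER_IP=$(resolve_with_retry "$MASTER_HOST") || exit 1
```

With this shape, no node needs the private key of any other, and a slave that comes up late (or is added later) simply joins once the master resolves.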

      1. concurrent-clusters.patch
        36 kB
        Chris K Wensel
      2. concurrent-clusters-2.patch
        36 kB
        Chris K Wensel
      3. concurrent-clusters-3.patch
        36 kB
        Chris K Wensel
      4. ec2.tgz
        7 kB
        Chris K Wensel

        Activity

        Tom White created issue -
        Tom White made changes -
        Field Original Value New Value
        Component/s contrib/ec2 [ 12311822 ]
        Tom White added a comment -

        Another motivation for this change is to make it more straightforward to add nodes on the fly to an existing cluster. The only change needed would be an extra parameter in the launch script to indicate whether to start a master node - if set to true instance 0 would start as the master (as it does at the moment), otherwise all the new instances would connect to an existing master.

        Tom White added a comment -

        Amazon has just released a new feature, EC2 Availability Zones (http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1347&categoryID=112), which give some control over instance placement. This would be useful for adding nodes on the fly to an existing cluster to ensure that nodes are in the same zone.

        Also, it is now possible to use the public IP address of EC2 nodes from within the EC2 cluster (contrary to the comment in the description above). However, this will incur data transfer costs, which can be avoided by using the private IP address. See http://docs.amazonwebservices.com/AWSEC2/2008-02-01/DeveloperGuide/instance-addressing.html.

        Chris K Wensel added a comment -

        Here is a first cut at supporting multiple concurrent clusters, the image instance sizes, zone availability, and ganglia.

        Chris K Wensel made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Affects Version/s 0.16.1 [ 12312927 ]
        Chris K Wensel added a comment -

        Patch

        Chris K Wensel made changes -
        Attachment concurrent-clusters.patch [ 12378761 ]
        Chris K Wensel added a comment -

        This patch represents a fair number of changes and will need accompanying documentation.

        The typical use case is this:

        > hadoop-ec2 launch-cluster my-group 5
        > hadoop-ec2 push my-group path/to/some.jar
        > hadoop-ec2 login my-group
        > hadoop-ec2 terminate-cluster my-group

        In another window (after launch-cluster), this is quite useful, and works well with FoxyProxy:
        > hadoop-ec2 proxy my-group

        There are still some rough edges I think.

        > hadoop-ec2
        Usage: hadoop-ec2 COMMAND
        where COMMAND is one of:
          list                                 list all running Hadoop EC2 clusters
          launch-cluster <group> <num slaves>  launch a cluster of Hadoop EC2 instances - launch-master then launch-slaves
          launch-master <group>                launch or find a cluster master
          launch-slaves <group> <num slaves>   launch the cluster slaves
          terminate-cluster                    terminate all Hadoop EC2 instances
          login <group|instance id>            login to the master node of the Hadoop EC2 cluster
          screen <group|instance id>           start or attach 'screen' on the master node of the Hadoop EC2 cluster
          proxy <group|instance id>            start a socks proxy on localhost:6666 (use w/foxyproxy)
          push <group> <file>                  scp a file to the master node of the Hadoop EC2 cluster
          <shell cmd> <group|instance id>      execute any command remotely on the master
          create-image                         create a Hadoop AMI

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12378761/concurrent-clusters.patch
        against trunk revision 619744.

        @author +1. The patch does not contain any @author tags.

        tests included -1. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        patch -1. The patch command could not apply the patch.

        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2084/console

        This message is automatically generated.

        Chris K Wensel added a comment -

        submitted with correct paths

        Chris K Wensel made changes -
        Attachment concurrent-clusters.patch [ 12378766 ]
        Chris K Wensel made changes -
        Attachment concurrent-clusters.patch [ 12378766 ]
        Chris K Wensel made changes -
        Attachment concurrent-clusters.patch [ 12378761 ]
        Chris K Wensel added a comment -

        correct path offset

        Chris K Wensel made changes -
        Attachment concurrent-clusters.patch [ 12378767 ]
        Chris K Wensel added a comment -

        removed DFS_WRITE_RETRIES as it's not really necessary with the new kernels.

        Chris K Wensel made changes -
        Attachment concurrent-clusters-2.patch [ 12378798 ]
        Chris K Wensel made changes -
        Attachment ec2.zip [ 12378799 ]
        Chris K Wensel made changes -
        Attachment ec2.zip [ 12378799 ]
        Chris K Wensel added a comment -

        a tar of all relevant files.

        Chris K Wensel made changes -
        Attachment ec2.tgz [ 12378800 ]
        Chris K Wensel made changes -
        Comment [ A zip archive of the scripts so people don't have to futz with patching trunk. ]
        Tom White added a comment -

        Chris,

        These changes look great. Thanks! I'll try them out next week.

        Chris K Wensel added a comment -

        this version checks both current releases and archives for the hadoop distro

        Chris K Wensel made changes -
        Attachment concurrent-clusters-3.patch [ 12379340 ]
        Tom White added a comment -

        I've just committed this. Thanks Chris!

        I tried out the new scripts and they worked fine. I changed the version of Hadoop in the env file to be 0.17.0 so that it picks up the new AMI when it is created (after 0.17.0 is released). I also changed the version of Java to 1.6.0_05.

        Chris, could you update the documentation on the wiki page with the changes please? It would be worth keeping the instructions for the older scripts around on the same page.

        Tom White made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Hadoop Flags [Incompatible change, Reviewed]
        Release Note The command "hadoop-ec2 run" has been replaced by "hadoop-ec2 launch-cluster <group> <number of instances>", and "hadoop-ec2 start-hadoop" has been removed since Hadoop is started on instance start up. See http://wiki.apache.org/hadoop/AmazonEC2 for details.
        Assignee Chris K Wensel [ cwensel ]
        Fix Version/s 0.17.0 [ 12312913 ]
        Hudson added a comment -

        Integrated in Hadoop-trunk #451 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/451/ )
        Nate Carlson added a comment -

        One change I've had to make is to add the memory option for the processes the cluster launches to the hadoop-site.xml that gets generated. This would probably be a good thing to make configurable for the end user.

        I also manually installed nph-proxy on the master node, with HTTP authentication, which makes it much easier to get around the slave nodes.

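        For illustration, the kind of setting Nate describes would look something like the following in the generated hadoop-site.xml. The property name and heap value here are assumptions of this sketch, not taken from the patch:

```xml
<!-- Hypothetical example: cap the heap of per-task child JVMs.
     The actual property the patch would expose may differ. -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx550m</value>
</property>
```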
        Chris K Wensel added a comment -

        Good idea re configurable memory for the tracker and datanode services, though I find the defaults fine (so far). I tend to pass my child VM options in per job, since they vary. Still, it's a good idea to provide the option.

        Note that:

        hadoop-ec2 proxy <cluster-name>

        starts a local SOCKS tunnel. Used with the FoxyProxy Firefox plugin, it lets you browse your cluster.

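        Conceptually, such a proxy command boils down to an SSH dynamic port forward. The sketch below builds the invocation; the key path, user, and default port are assumptions, not the script's actual values:

```shell
# Build (but don't run) the ssh command for a local SOCKS proxy into
# the cluster master; -D opens a dynamic forward, -N skips a shell.
socks_proxy_cmd() {
  master=$1
  port=${2:-6666}
  echo "ssh -i ~/.ssh/id_rsa-ec2-keypair -D $port -N root@$master"
}

# Example (prints the command; run its output to start the tunnel,
# then point FoxyProxy at localhost:6666):
socks_proxy_cmd ec2-67-202-xx-xx.compute-1.amazonaws.com
```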
        Nigel Daley made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

  People

  • Assignee:
    Chris K Wensel
  • Reporter:
    Tom White
  • Votes: 0
  • Watchers: 1