[HADOOP-4585] unused and misleading configuration in hadoop-init - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Minor
Resolution: Won't Fix
Affects Version/s: 0.18.1
Fix Version/s: None
Component/s: contrib/cloud
Labels: None

Description

src/contrib/ec2/bin/image/hadoop-init is appended to rc.local on all
EC2 cluster boxes. This shell script generates the hadoop-site.xml
configuration file. It starts by setting some defaults, which are
meant to populate the file. These defaults are then overwritten by the
user data (from hadoop-ec2-env.sh) passed to the EC2 instance by
launch-hadoop-master and launch-hadoop-slaves.

This isn't a bug; setting variables in hadoop-ec2-env.sh does the
right thing. However, the default assignments in hadoop-init are dead,
misleading code (well, they misled me), and working out which values
actually take effect means running a test Hadoop job, which takes a
little effort.

Suggested change to hadoop-init:

Remove these lines:

# set defaults
MAX_TASKS=3
[ "$INSTANCE_TYPE" == "m1.large" ] && MAX_TASKS=6
[ "$INSTANCE_TYPE" == "m1.xlarge" ] && MAX_TASKS=12

MAX_MAP_TASKS=$MAX_TASKS
MAX_REDUCE_TASKS=$MAX_TASKS

Add a comment before the lines which access the user data:

# get user data passed in by the ec2 instance launch
wget -q -O - http://169.254.169.254/latest/user-data | tr ',' '\n' > /tmp/user-data
source /tmp/user-data
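
The override described above can be reproduced without an EC2 instance. The following is a minimal sketch, not the actual hadoop-init code: it assumes the user data is a comma-separated list of NAME=value assignments (which is what the tr ',' '\n' step implies), and the USER_DATA string, its values, and the temporary file are hypothetical stand-ins for the response from the metadata URL.

```shell
#!/usr/bin/env bash
# Sketch: defaults assigned before `source` are overwritten by the sourced file.

# Defaults, as in the lines proposed for removal (INSTANCE_TYPE is unset here).
MAX_TASKS=3
MAX_MAP_TASKS=$MAX_TASKS
MAX_REDUCE_TASKS=$MAX_TASKS

# Hypothetical user data; in hadoop-init this comes from the metadata URL.
# Assumed format: NAME=value pairs separated by commas.
USER_DATA='MAX_MAP_TASKS=2,MAX_REDUCE_TASKS=1'

# Same transformation as hadoop-init: one assignment per line, then source it.
USER_DATA_FILE=$(mktemp)
printf '%s\n' "$USER_DATA" | tr ',' '\n' > "$USER_DATA_FILE"
source "$USER_DATA_FILE"
rm -f "$USER_DATA_FILE"

# The sourced assignments win; the defaults set above are no longer visible.
echo "MAX_MAP_TASKS=$MAX_MAP_TASKS MAX_REDUCE_TASKS=$MAX_REDUCE_TASKS"
# prints "MAX_MAP_TASKS=2 MAX_REDUCE_TASKS=1"
```

As long as the user data assigns both variables, the values computed from MAX_TASKS never reach hadoop-site.xml, which is why the defaults read as dead code.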

People

Assignee: Unassigned

Reporter: Karl Lehenbauer

Votes: 0

Watchers: 0

Dates

Created: 04/Nov/08 01:09

Updated: 23/Apr/09 19:25

Resolved: 20/Jan/09 16:28