Key: HADOOP-4382
Type: Test
Status: Open
Priority: Major
Assignee: Tom White
Reporter: Tom White
Votes: 1
Watchers: 8
Hadoop Common

Run Hadoop sort benchmark on Amazon EC2

Created: 09/Oct/08 08:10 AM   Updated: 01/Dec/08 05:12 PM
Component/s: contrib/ec2
Affects Version/s: None
Fix Version/s: None

Time Tracking:
Not Specified

File Attachments:
  hadoop-4382-v2.patch (text file, 4 kB, licensed for inclusion in ASF works), 2008-11-27 01:33 PM, Tom White
  hadoop-4382.patch (text file, 3 kB, licensed for inclusion in ASF works), 2008-11-26 05:39 PM, Tom White
Issue Links:
Reference: This issue relates to HADOOP-4745

Hadoop Flags: Reviewed


Description
By running a benchmark on EC2 we can see how well Hadoop performs, how to tune it, and how performance changes between releases.

Tom White added a comment - 26/Nov/08 05:39 PM
A script that:

1. Launches a cluster on EC2
2. Waits for the cluster and Hadoop daemons to start
3. Runs a small sort job to warm up the cluster
4. Runs a sort job and emits the job duration
5. Terminates the cluster
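Step 4 (run a job and emit its duration) can be sketched roughly as follows. This is not the code from the attached patch: `run_timed` is a hypothetical helper, and `sleep 2` stands in for the real sort job command, which is not shown in this issue. It assumes a `date` that supports `+%s` (GNU and BSD both do).

```shell
# Hypothetical helper: run a command and print its wall-clock duration
# in whole seconds. The command's own stdout is redirected to stderr so
# that only the duration is captured by $(...).
run_timed() {
  rt_start=$(date +%s)
  "$@" >&2 || return $?      # propagate the command's failure status
  rt_end=$(date +%s)
  echo $(( rt_end - rt_start ))
}

# Usage (the job command is a stand-in, not the real sort invocation):
# sort_duration=$(run_timed sleep 2)
# echo "Sort took ${sort_duration} seconds"
```

Returning the command's exit status means a failed job is not reported as a valid timing.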

On an 8-node cluster, sorting 32 GB of data took 2742 seconds (about 46 minutes) with the default hadoop-site.xml that the EC2 scripts use. This could be improved by using better settings.
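For a baseline to compare tuned settings against, the figures above can be turned into a rough throughput number. This is only arithmetic on the reported result, assuming 1 GB = 1024 MB and 8 worker nodes; the `throughput` helper is hypothetical.

```shell
# Figures reported in this comment; 1 GB is taken as 1024 MB (an assumption).
data_mb=$(( 32 * 1024 ))
duration_s=2742
nodes=8

# Hypothetical helper: MB per second, divided across n nodes, two decimals.
throughput() {
  awk -v mb="$1" -v s="$2" -v n="$3" 'BEGIN { printf "%.2f\n", mb / s / n }'
}

aggregate=$(throughput "$data_mb" "$duration_s" 1)
per_node=$(throughput "$data_mb" "$duration_s" "$nodes")
echo "aggregate: ${aggregate} MB/s, per node: ${per_node} MB/s"
```

That works out to roughly 11.95 MB/s across the cluster, or about 1.49 MB/s per node.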

The script could be improved in several ways, in particular in detecting when the cluster is ready to go: the current script waits until 90% of the nodes are up, then waits 1 minute for Hadoop to start (there are more ideas at http://www.nabble.com/Auto-shutdown-for-EC2-clusters-td20132561.html). It would also be good to do multiple runs, discard the first, and compute an average.
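Two of these ideas, the 90% readiness threshold and averaging runs after discarding the first, might look something like the sketch below. Both function names are hypothetical and neither is taken from the patch. How the number of running nodes is obtained is deliberately left out.

```shell
# Hypothetical check: succeed when at least a given percentage
# (default 90) of the expected nodes are up.
# Arguments: nodes_up nodes_total [percent]
enough_nodes_up() {
  [ $(( $1 * 100 )) -ge $(( $2 * ${3:-90} )) ]
}

# Hypothetical helper: print the integer mean of all run durations
# except the first, which is treated as a warm-up run.
average_after_warmup() {
  if [ "$#" -lt 2 ]; then
    echo "need at least two run durations" >&2
    return 1
  fi
  shift                       # discard the first (warm-up) run
  avg_sum=0
  for avg_d in "$@"; do
    avg_sum=$(( avg_sum + avg_d ))
  done
  echo $(( avg_sum / $# ))
}

# Usage with made-up durations in seconds:
# average_after_warmup 3000 2700 2800 2750
```

Integer arithmetic keeps this portable to plain `sh`. If fractional seconds matter, `awk` could do the division instead.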

This should be a good basis for running a regular EC2 benchmark from Hudson.

Comments welcome.


Tom White made changes - 26/Nov/08 05:39 PM
Field Original Value New Value
Attachment hadoop-4382.patch [ 12394766 ]
Tom White added a comment - 26/Nov/08 05:46 PM
I should say that the 8 node cluster used large EC2 instances (and the namenode/jobtracker is not included in the 8 nodes).

Nigel Daley added a comment - 26/Nov/08 10:37 PM
Looks good, Tom. A couple of comments:
  • should we also run sortvalidation to ensure the sort actually worked?
  • what bin dir are you putting the script in?
  • perhaps name the script sort-benchmark
  • add a line to echo the # minutes into a file as follows for Hudson plot:

    sort_minutes=`expr $ Unknown macro: {sort_duration} / 60`
    echo "YVALUE=$ Unknown macro: {sort_minutes} " > sort_minutes.properties


Nigel Daley added a comment - 26/Nov/08 10:41 PM
Argh, Jira wiki notation ate my code snippet.
sort_minutes=`expr ${sort_duration} / 60`
echo "YVALUE=${sort_minutes}" > sort_minutes.properties
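An equivalent form using POSIX arithmetic expansion instead of `expr` is sketched below. The `write_sort_minutes` wrapper and its optional output-path argument are hypothetical additions for illustration. The `YVALUE` key and the default file name come from the snippet above.

```shell
# Hypothetical wrapper around the two lines above.
# Arguments: duration_in_seconds [output_file]
write_sort_minutes() {
  # Integer division, same result as `expr $1 / 60`.
  wsm_minutes=$(( $1 / 60 ))
  echo "YVALUE=${wsm_minutes}" > "${2:-sort_minutes.properties}"
}

# Usage: write_sort_minutes "${sort_duration}"
```

Both forms truncate toward zero, so a 2742 second run is recorded as 45 minutes.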

Tom White added a comment - 27/Nov/08 01:33 PM
Thanks for the comments Nigel.

New patch incorporating the suggestions. (I've created the patch from the base of Hadoop this time, so the script goes in src/contrib/ec2/bin.)


Tom White made changes - 27/Nov/08 01:33 PM
Attachment hadoop-4382-v2.patch [ 12394849 ]
Tom White made changes - 01/Dec/08 02:57 PM
Link This issue relates to HADOOP-4745 [ HADOOP-4745 ]
Nigel Daley added a comment - 01/Dec/08 05:12 PM
+1

Nigel Daley made changes - 01/Dec/08 05:12 PM
Hadoop Flags [Reviewed]