Hadoop Common / HADOOP-2411

Add support for larger EC2 instance types

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.17.0
    • Fix Version/s: 0.19.0
    • Component/s: contrib/cloud
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Added support for c1.* instance types and associated kernels for EC2.

      Description

      Hadoop needs to be configured to exploit the resources available on larger instance types: 64-bit architecture, extra CPUs, and more memory. See http://docs.amazonwebservices.com/AWSEC2/2007-08-29/DeveloperGuide/instance-types.html

      1. hadoop-ec2.patch
        8 kB
        Chris K Wensel
      2. hadoop-ec2.v2.patch
        9 kB
        Chris K Wensel

          Activity

          Chris K Wensel added a comment -

          Reusing this issue.

          The current contrib/ec2 scripts already support larger instances; this patch adds c1.* support and uses the recommended kernels for those instances.

          Unfortunately the kernel selection is hard-coded. It would be nice to make it more dynamic (perhaps by fetching the current metadata from the Hadoop wiki?).

          Also, these files should be deleted:
          D src/contrib/ec2/bin/login-hadoop-cluster
          D src/contrib/ec2/bin/destroy-hadoop-cluster
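
          To illustrate the "more dynamic" idea raised above, here is a minimal sketch (not taken from the patch) that reads an instance-type table from text instead of hard-coding it. The line format, the function names, the ARCH and KERNEL_ID keys, and the kernel IDs are all hypothetical placeholders; only the 32-bit/64-bit split per instance type reflects the EC2 instance types of that era.

```python
# Hypothetical metadata format: "instance_type arch kernel_id", '#' starts a comment.
# The kernel IDs below are placeholders, NOT real EC2 kernel identifiers.
DEFAULT_METADATA = """
# type      arch     kernel (placeholder)
m1.small    i386     aki-EXAMPLE32
c1.medium   i386     aki-EXAMPLE32
m1.large    x86_64   aki-EXAMPLE64
m1.xlarge   x86_64   aki-EXAMPLE64
c1.xlarge   x86_64   aki-EXAMPLE64
"""


def parse_metadata(text):
    """Parse the metadata text into {instance_type: (arch, kernel_id)}."""
    table = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        instance_type, arch, kernel_id = line.split()
        table[instance_type] = (arch, kernel_id)
    return table


def settings_for(instance_type, table):
    """Return the launch settings for one instance type, or raise ValueError."""
    if instance_type not in table:
        raise ValueError("unsupported instance type: %s" % instance_type)
    arch, kernel_id = table[instance_type]
    return {
        "INSTANCE_TYPE": instance_type,
        "ARCH": arch,            # hypothetical variable name
        "KERNEL_ID": kernel_id,  # hypothetical variable name
    }


if __name__ == "__main__":
    table = parse_metadata(DEFAULT_METADATA)
    for key, value in sorted(settings_for("c1.medium", table).items()):
        print("%s=%s" % (key, value))
```

          With a table like this, the same text could in principle be fetched from a wiki page or a file and passed to parse_metadata, so supporting a new instance type would not require editing the scripts.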

          Chris K Wensel added a comment -

          Just noticed the default is c1.medium. I might change that back to m1.small before committing.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12385458/hadoop-ec2.patch
          against trunk revision 674910.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no tests are needed for this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2815/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2815/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2815/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2815/console

          This message is automatically generated.

          Owen O'Malley added a comment -

          Tom, can you review this patch, please?

          Was the Groovy addition required? Why?

          Chris K Wensel added a comment -

          Groovy is not required. Must have leaked in from my version of the scripts.

          Tom White added a comment -

          These are good changes (with the groovy stuff removed, and the INSTANCE_TYPE changed back to small), but the intent of this Jira was to configure Hadoop appropriately according to the instance type. Larger instances have more RAM and more CPUs so we could change properties like fs.inmemory.size.mb, io.sort.mb and mapred.tasktracker.(map|reduce).tasks.maximum according to the instance size. Unfortunately, the configuration story for Hadoop on EC2 needs some work (for example, you can't rsync changes around a cluster), and we should concentrate on fixing this. Another approach is HADOOP-2409.
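
          As a rough illustration of the per-instance tuning described above, the sketch below derives the named properties from an instance type and renders them as a Hadoop site configuration fragment. It is not part of the patch: the memory and core figures are approximate values for the 2008 instance types, and the sizing heuristics (one map slot per core, half as many reduce slots, buffer sizes clamped to a range) are assumptions made up for this example.

```python
from xml.sax.saxutils import escape

# Approximate (memory in MB, virtual cores) per instance type; illustrative only.
INSTANCE_SPECS = {
    "m1.small": (1700, 1),
    "c1.medium": (1700, 2),
    "m1.large": (7500, 2),
    "m1.xlarge": (15000, 4),
    "c1.xlarge": (7000, 8),
}


def clamp(value, low, high):
    """Restrict value to the inclusive range [low, high]."""
    return max(low, min(high, value))


def tuned_properties(instance_type):
    """Return a dict of Hadoop properties sized for the instance type.

    The heuristics here are hypothetical, not recommendations.
    """
    memory_mb, cores = INSTANCE_SPECS[instance_type]
    map_slots = cores
    reduce_slots = max(1, cores // 2)
    per_slot_mb = memory_mb // (map_slots + reduce_slots)
    return {
        "mapred.tasktracker.map.tasks.maximum": map_slots,
        "mapred.tasktracker.reduce.tasks.maximum": reduce_slots,
        # Never go below the stock defaults (100 MB and 75 MB).
        "io.sort.mb": clamp(per_slot_mb // 8, 100, 256),
        "fs.inmemory.size.mb": clamp(per_slot_mb // 10, 75, 200),
    }


def to_site_xml(props):
    """Render the properties as a <configuration> XML fragment."""
    lines = ["<configuration>"]
    for name, value in sorted(props.items()):
        lines.append("  <property>")
        lines.append("    <name>%s</name>" % escape(name))
        lines.append("    <value>%s</value>" % escape(str(value)))
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_site_xml(tuned_properties("c1.xlarge")))
```

          Any real tuning would also have to keep io.sort.mb below the task JVM heap size, which this sketch does not model.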

          Owen O'Malley added a comment -

          Tom, Could you review this one? Thanks!

          Tom White added a comment -

          Cancelling until the groovy changes are removed.

          Chris K Wensel added a comment -

          I'll submit a new patch after HADOOP-4117 is working for me.

          Chris K Wensel added a comment -

          • relies on HADOOP-4117
          • adds new instance types
          • adds support for new kernels
          • removes the Groovy install
          • removes unused files
          • fixes a bug with the hadoop-ec2 list command
          • updates the hadoop-ec2 comments

          Tom White added a comment -

          I've just committed this. Thanks Chris!

          I tested the patch with a medium CPU instance and it ran fine. I made a small change to list-hadoop-clusters so it works for group names with dashes in them.

          Hudson added a comment -

          Integrated in Hadoop-trunk #611 (see http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/611/).

            People

            • Assignee: Chris K Wensel
            • Reporter: Tom White
            • Votes: 0
            • Watchers: 0
