WHIRR-391: Write a YARN (Hadoop MR2) service for Whirr

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7.2
    • Component/s: new service, service/hadoop
    • Labels: None

    Attachments

    1. WHIRR-391.patch (152 kB, Tom White)
    2. WHIRR-391.patch (64 kB, Tom White)
    3. WHIRR-391.patch (38 kB, Tom White)
    4. WHIRR-391.patch (40 kB, Tom White)
    5. WHIRR-391.patch (41 kB, Tom White)
    6. WHIRR-391-0.7.1.patch (84 kB, Tom White)
    7. WHIRR-391-0.7.1.patch (47 kB, Tom White)
    8. WHIRR-391-0.7.patch (144 kB, Tom White)


        Activity

        Tom White added a comment -

        Here's a patch to add a YARN service. You can start it using

        bin/whirr launch-cluster --config recipes/hadoop-yarn-ec2.properties
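
        The recipe referenced above is a standard Whirr properties file. A minimal sketch of what it might contain (the values below are illustrative assumptions rather than the exact contents of the patch, with the role names following the instance templates quoted later in this issue):

        whirr.cluster-name=hadoop-yarn
        whirr.instance-templates=1 hadoop-namenode+yarn-resourcemanager+mapreduce-historyserver,1 hadoop-datanode+yarn-nodemanager
        whirr.provider=aws-ec2
        whirr.identity=${env:AWS_ACCESS_KEY_ID}
        whirr.credential=${env:AWS_SECRET_ACCESS_KEY}

        The launch command reads these properties to determine how many nodes to start and which roles to run on each of them.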
        

        Once it has started, log in to the head node and run an MR job:

        # Currently need to run as hadoop
        sudo su - hadoop
        
        hadoop fs -put $HADOOP_CONF_DIR input
        
        hadoop jar $HADOOP_HOME/hadoop-mapreduce-examples-0.24.0-SNAPSHOT.jar wordcount \
        -Dmapreduce.job.user.name=$USER \
        -Dmapreduce.clientfactory.class.name=org.apache.hadoop.mapred.YarnClientFactory \
        -libjars $YARN_HOME/modules/hadoop-mapreduce-client-jobclient-0.24.0-SNAPSHOT.jar \
        input output
        

        Current limitations:

        • It changes the Hadoop service so that the MR daemons are not started. This needs fixing before committing.
        • There is no proxy, so you have to log in to the head node to use the cluster.
        • It's still tricky to install Hadoop with YARN (see HADOOP-7642, http://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-mapreduce-project/INSTALL), so this limitation is OK for now.
        Tom White added a comment -

        Separating HDFS and MR daemons is covered by WHIRR-337.

        Tom White added a comment -

        Here's a slightly improved version which uses the unified tarball built by HADOOP-7642.

        Andrei Savu added a comment -

        Tom, is this something we should consider for inclusion now, or later once the other issues (e.g. HADOOP-7642) are committed?

        Tom White added a comment -

        Andrei, we should wait until WHIRR-342 is done before committing this. HADOOP-7642 is a dependency too, and should be committed soon.

        Tom White added a comment -

        Updated to reflect changes in WHIRR-325.

        Tom White added a comment -

        Here's an updated patch that basically works. There are some rough edges (the history server doesn't work, for example) and no integration test, but I successfully ran a MapReduce job using the provided recipe and the following commands (note the slightly different paths compared to Hadoop 0.20/1.x).

        export HADOOP_HOME=...
        export HADOOP_CONF_DIR=~/.whirr/hadoop-yarn/
        export PATH=$PATH:$HADOOP_HOME/bin
        
        hadoop fs -put $HADOOP_HOME/LICENSE.txt input
        hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-0.23.1.jar wordcount input output
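
        If the job succeeds, the word counts can be inspected directly from HDFS (a generic check, assuming the job wrote to output in the user's HDFS home directory, as above):

        hadoop fs -cat 'output/part-*' | head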
        
        Tom White added a comment -

        New patch which fixes the history server and adds an integration test. The test currently fails with a timeout although the MR job completes successfully on the cluster.

        Andrei Savu added a comment -

        Looks good to me. Are we expecting a full public release soon? (The patch still references hadoop-0.23.1-rc2.)

        Tom White added a comment -

        Thanks for taking a look. Yes, there should be a public release in the next few days, so I'll wait until that comes out before committing.

        Tom White added a comment -

        This new patch works with CDH4b1 as well as Apache Hadoop 0.23.1 (now released). I've had some trouble getting the integration tests to work, but I can successfully run jobs manually against MR2 started with the example recipes. I'll open another issue to get the integration tests working - I think this is useful enough as is to commit so others can start trying it out. I also need to produce a trunk version.

        Andrei Savu added a comment -

        Should we start to think about a 0.7.2 release that would include this?

        Tom White added a comment -

        Yes, I think that would be great.

        Andrei Savu added a comment -

        Adding to the roadmap for 0.8.0 & 0.7.2.

        Andrei Savu added a comment -

        +1 for 0.7.2, but we need to get the integration tests working as expected.

        Tom White added a comment -

        The integration test was failing due to HDFS-3068. I updated the code with the workaround described there and the test now passes. This patch also includes all the relevant changes for CDH4b1 (old CDH tests are now in a cdh-oldtests module), and all of those integration tests pass too. (I ran all tests several times over.)

        From my point of view I think this is ready to be committed.

        I'll create another patch for trunk.

        Tom White added a comment -

        Here's the equivalent patch for trunk. WHIRR-528 clashes slightly with this, but I changed the code to use yum rather than retry_yum, and it works fine. When WHIRR-528 is done, all usages can be updated.

        I ran all the new integration tests and they passed.
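
        A minimal sketch of the substitution in question (the package name below is a placeholder, not taken from the patch):

        # this patch calls yum directly in the install function:
        yum install -y $HADOOP_PACKAGE
        # once WHIRR-528 lands, calls like this can be switched to its retrying wrapper:
        # retry_yum install -y $HADOOP_PACKAGE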

        Tom White added a comment -

        Unless there are any objections, I'd like to commit this tomorrow.

        Andrei Savu added a comment -

        +1 for committing to 0.7 after changing INSTANCES.containsKey("config") to INSTANCES.containsKey(config) in HadoopServiceController. I will go through the patch one more time later today.

        For 0.8.0 I think we need a different, cleaner approach. I would love to be able to replace:

        # Change the number of machines in the cluster here
        whirr.instance-templates=1 hadoop-namenode+yarn-resourcemanager+mapreduce-historyserver,1 hadoop-datanode+yarn-nodemanager

        # We need to use modified scripts for the installation since it has changed
        # significantly since 0.20.x
        whirr.java.install-function=install_oracle_jdk6
        whirr.hadoop.install-function=install_cdh_hadoop
        whirr.hadoop.configure-function=configure_cdh_hadoop
        whirr.yarn.configure-function=configure_cdh_yarn
        whirr.yarn.start-function=start_cdh_yarn
        whirr.mr_jobhistory.start-function=start_cdh_mr_jobhistory
        whirr.env.repo=cdh4

        ... with ...

        whirr.instance-templates=1 cdh-yarn-namenode+cdh-yarn-resourcemanager+cdh-yarn-mapreduce-historyserver, 1 cdh-yarn-hadoop+cdh-yarn-nodemanager
        

        This is less error-prone and more consistent. I think we should view the ability to override install/configure functions as an advanced feature, not a common mechanism for installing different flavors.

        Tom White added a comment -

        Thanks for taking a look Andrei. I'll fix the containsKey error - well spotted!

        > I think we should view the ability to override install / configure functions as an advanced feature not a common mechanism for installing different flavors.

        I totally agree, and I opened WHIRR-559 for this. I think this can be done as a follow up issue though, since this one is big enough already. I'll work on it next.

        Andrei Savu added a comment -

        +1

        Tom White added a comment -

        I just committed this.


          People

          • Assignee: Tom White
          • Reporter: Tom White
          • Votes: 1
          • Watchers: 2

