Hadoop Common
HADOOP-1301

Resource management provisioning for Hadoop

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.16.0
    • Fix Version/s: 0.16.0
    • Component/s: None
    • Labels: None

      Description

      The Hadoop On Demand (HOD) project addresses the provisioning and managing of MapReduce instances on cluster resources. With HOD, the MapReduce user interacts with the cluster solely through a self-service interface and the JT, TT info ports. The user never needs to log into the cluster or even have an account on the cluster for that matter. HOD allocates nodes, provisions MapReduce (and optionally HDFS) on the cluster and when the user is done with MapReduce jobs, cleanly shuts down MapReduce and de-allocates the nodes (i.e., re-introducing them to the pool of available resources in the cluster).

      Using HOD, a cluster can be shared among different users in a fair and efficient manner. HOD is not a replacement or re-implementation of a traditional resource manager. HOD is implemented using the resource manager paradigm and at present is envisioned supporting Torque and Condor out of the box. It also supports "static" resources, i.e., a dedicated set of resources not using a resource manager.

      HOD is also self-provisioning and, thus, can be used on systems such as EC2 or a campus cluster not already running MapReduce software or a resource manager. Figure 1 depicts a cluster using HOD. As the figure shows, the user never logs into the cluster itself. The user's jobs run as the 'hod' user (a configurable unix id).

      The user interacts with MapReduce and the cluster using the hod shell, hodsh. Once in hodsh, the user can allocate and de-allocate nodes and automatically run the JT, TTs, NN and DNs on those nodes, without knowing which nodes are running which daemons and without logging into any of those boxes. HOD transparently masks failures by allocating nodes to replace failed ones. Once the user has allocated nodes, she can run /bin/MapReduce my1.jar and then /bin/MapReduce my2.jar ... from within the hod shell, which automatically generates the configuration file for the MapReduce script. When done, the user exits the shell.
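
      Such a session might look like the following sketch (the hodsh name and the /bin/MapReduce invocations come from the description above; the allocate/deallocate syntax shown here is purely illustrative, not the actual HOD interface):

      ```shell
      # Hypothetical hodsh session; exact command syntax is illustrative only.
      hodsh                      # enter the HOD shell (no login to cluster nodes needed)
      > allocate 10              # ask HOD for 10 nodes; JT/TTs (and NN/DNs) are provisioned
      > /bin/MapReduce my1.jar   # run a job; the MapReduce config is generated automatically
      > /bin/MapReduce my2.jar   # run another job on the same allocation
      > deallocate               # shut down MapReduce and return the nodes to the pool
      > exit                     # leave the shell (an idle timeout also reclaims resources)
      ```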

      The hod shell has an automatic timeout so that users cannot hog resources they aren't using. The timeout applies only when there is no MapReduce job running. In addition, hod also has the option of tracking and enforcing user/group resource limits.

      Optionally, HOD can run dedicated log and directory services in the cluster. The log service is a central repository for collecting and retrieving Hadoop logs for any given job. The directory service provides an easy way to inspect what is running in the cluster, and gives the end user an HTML interface for reaching their JT and TT info ports.

      1. hod.0.2.2.tar.gz
        64 kB
        Pete Wyckoff
      2. hod-hadoop.patch
        417 kB
        Hemanth Yamijala
      3. hod-hadoop.v2.patch
        417 kB
        Hemanth Yamijala
      4. hod-hadoop.v3.patch
        447 kB
        Hemanth Yamijala
      5. hod-hadoop.v4.patch
        448 kB
        Hemanth Yamijala
      6. hod-open-4.tar.gz
        84 kB
        Hemanth Yamijala


          Activity

          Pete Wyckoff added a comment -

          The python code that implements HOD.

          Note there is a .hodrc file in the conf directory that should be copied to your home directory.

          Jim Kellerman added a comment -

          I don't think that Hadoop On Demand (HOD) is related to HBase more than tangentially (see the HBase page on the Hadoop Wiki: http://wiki.apache.org/lucene-hadoop/Hbase), so I am moving it to the map-reduce component, which is where it appears to be more relevant.

          Michael Bieniosek added a comment -

          It would be nice to have some documentation about how to install this.

          Hemanth Yamijala added a comment -

          This is the latest work-in-progress version of HOD. This version provides out-of-the-box integration with the Torque resource manager. In the tarball, you can find the following documentation to help install, configure and run HOD:

          • README: Brief description of HOD.
          • getting_started.txt: Brief instructions to quickly get you started on using HOD. It covers installation, basic configuration and commands.
          • config.txt: More details on various important configuration options.

          We would appreciate any comments if you can try this out.

          Please note, though, that some significant changes (particularly to the user interface) are in the works and might obsolete this version.

          Hemanth Yamijala added a comment -

          The attached patch is the latest version of HOD, which we would like to submit under hadoop contrib. This version of HOD works with the Hadoop 0.16 trunk.

          The following files serve as the documentation for this patch:

          README: Gives an overview of HOD
          getting_started.txt: Gives instructions on how to try out HOD
          config.txt: A more detailed description of how to configure HOD.

          We request interested users to give it a try and provide us feedback.

          Hemanth Yamijala added a comment -

          Ran tests and findbugs, just to make sure they don't complain about HOD. They seem to be ignoring the new directory for the most part.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12371682/hod-hadoop.patch
          against trunk revision r604451.

          @author +1. The patch does not contain any @author tags.

          javadoc +1. The javadoc tool did not generate any warning messages.

          javac +1. The applied patch does not generate any new compiler warnings.

          findbugs -1. The patch appears to cause Findbugs to fail.

          core tests -1. The patch failed core unit tests.

          contrib tests +1. The patch passed contrib unit tests.

          Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1360/testReport/
          Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1360/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1360/console

          This message is automatically generated.

          Hemanth Yamijala added a comment -

          Found the reason why the core tests and findbugs are failing: the hod build file needs to be modified. Working on a new patch.

          Hemanth Yamijala added a comment -

          Fixed the ant build file by adding a dependency on the package target to the compile target.
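
          In an Ant build file, that kind of fix would look something like the fragment below (a sketch: the target names compile and package come from the comment above; everything else is assumed and not taken from the actual patch):

          ```xml
          <!-- Declaring that compile depends on package makes Ant run package first. -->
          <target name="package" description="assemble the hod contrib files">
            <!-- packaging steps would go here -->
          </target>
          <target name="compile" depends="package"/>
          ```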

          Arun C Murthy added a comment - edited

          The following files serve as the documentation for this patch:

          README: Gives an overview of HOD
          getting_started.txt: Gives instructions on how to try out HOD
          config.txt: A more detailed description of how to configure HOD.

          Hemanth, could you please convert them to Apache Forrest-based documentation? We could then put these up on the website, etc.
          Thanks!

          Hemanth Yamijala added a comment -

          Patch addressing Arun's comment. This one has the documentation in Forrest format. The original text files are also still there, so one can start reading the documentation without having to build the Forrest version (as no built documentation exists right now).

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12372008/hod-hadoop.v3.patch
          against trunk revision r605811.

          @author +1. The patch does not contain any @author tags.

          javadoc +1. The javadoc tool did not generate any warning messages.

          javac +1. The applied patch does not generate any new compiler warnings.

          findbugs +1. The patch does not introduce any new Findbugs warnings.

          core tests +1. The patch passed core unit tests.

          contrib tests +1. The patch passed contrib unit tests.

          Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1403/testReport/
          Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1403/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1403/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1403/console

          This message is automatically generated.

          Karam Singh added a comment -

          Ran hod as per the instructions provided in getting_started.txt

          1. Under section "3. Setting up HOD", it is written that:

          • HOD is available under the 'contrib' section of Hadoop under the root
            directory 'hod'.
          • Distribute the files under this directory to all the nodes in the
            cluster.

          It would be better if it also mentioned that hod should be installed at the same location on all nodes.

          2. For multi-valued, comma-separated parameters, e.g. "--ringmaster.work-dirs", it should be mentioned that there must be no spaces between the comma-separated values.

          3. config.txt should also mention the options described in getting_started.txt.
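
          The comma-separated point can be illustrated with a short Python sketch (assuming hod splits such values naively on commas, which is an assumption here; the directory paths are made up):

          ```python
          # A naive comma split, as a simple config parser might do, keeps any
          # space that follows a comma, producing a bogus path with a leading blank.
          value_with_space = "/tmp/hod1, /tmp/hod2"
          value_without = "/tmp/hod1,/tmp/hod2"

          print(value_with_space.split(","))  # ['/tmp/hod1', ' /tmp/hod2']  <- leading space
          print(value_without.split(","))     # ['/tmp/hod1', '/tmp/hod2']
          ```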

          Hemanth Yamijala added a comment -

          This patch addresses the points Karam has raised as part of his review.

          Karam Singh added a comment -

          +1 overall
          +1 Document getting_started.txt: it now mentions that the installation path for hod should be the same on all nodes
          +1 Document config.txt: mentions as a known issue that special characters such as space and comma are not handled by hod
          +1 Document config.txt: options mentioned in getting_started.txt are also present in config.txt
          +1 Default conf/hodrc: the space between comma-separated values for --ringmaster.work-dirs has been removed

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12372452/hod-hadoop.v4.patch
          against trunk revision .

          @author +1. The patch does not contain any @author tags.

          javadoc +1. The javadoc tool did not generate any warning messages.

          javac +1. The applied patch does not generate any new compiler warnings.

          findbugs +1. The patch does not introduce any new Findbugs warnings.

          core tests +1. The patch passed core unit tests.

          contrib tests -1. The patch failed contrib unit tests.

          Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1462/testReport/
          Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1462/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1462/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1462/console

          This message is automatically generated.

          Hemanth Yamijala added a comment -

          The JUnit test cases in HBase seem to have failed. As HOD is independent of HBase, it is unlikely to have caused the problem.

          Nigel Daley added a comment -

          I just committed this. Thanks Hemanth and Pete!


            People

            • Assignee: Hemanth Yamijala
            • Reporter: Pete Wyckoff
            • Votes: 0
            • Watchers: 6