Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.7.0
    • Fix Version/s: 0.7.0
    • Component/s: new service
    • Labels: None

      Description

      Here is an initial patch to support Mahout as a Whirr service.

      I created the role 'mahout-home' which can be used to install the binary Mahout distribution on a Hadoop namenode.
      By combining this role with configuration for a Hadoop cluster you can SSH into the namenode, su to root and start running Mahout jobs via the mahout script immediately.

      The 'mahout-home' role has two properties:

      • whirr.mahout.version: the Mahout version
      • whirr.mahout.tarball.url: the URL of the Mahout binary distribution tarball

      Note that I tested with a snapshot version of Mahout, revision 1169784, because the mahout script in 0.5 had some problems that have since been fixed on trunk (see MAHOUT-680). To test, you can set the tarball property to this link: http://dl.dropbox.com/u/13436484/mahout-distribution-0.6-SNAPSHOT.tar.gz
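
As a rough illustration of how the two properties might sit in a recipe, here is a minimal sketch that parses a Whirr-style properties text with the JDK's java.util.Properties. The cluster name, the instance templates line, and the version value "0.6-SNAPSHOT" are assumptions made for the example; only the two whirr.mahout.* keys, the 'mahout-home' role name, and the tarball URL come from the description above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Sketch only: this parses a recipe, it does not launch a cluster.
class MahoutRecipeSketch {

    // Assumed recipe text. Cluster name, templates and version are illustrative.
    static final String RECIPE = String.join("\n",
        "whirr.cluster-name=mahout-demo",
        "whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker+mahout-home,"
            + "2 hadoop-datanode+hadoop-tasktracker",
        "whirr.mahout.version=0.6-SNAPSHOT",
        "whirr.mahout.tarball.url="
            + "http://dl.dropbox.com/u/13436484/mahout-distribution-0.6-SNAPSHOT.tar.gz");

    // Parses properties-file syntax from a string.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = load(RECIPE);
        System.out.println(props.getProperty("whirr.mahout.version"));
        System.out.println(props.getProperty("whirr.mahout.tarball.url"));
    }
}
```

In a real recipe these lines would live in a properties file next to the Hadoop settings; the string constant is used here only to keep the sketch self-contained.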

      I used configure actions and the onBeforeConfigure(). If there is a better way to express this with the Whirr API let me know.

      Currently I am investigating a 'mahout-jar' role, which installs the Mahout examples job jar under $HADOOP_HOME/lib on a tasktracker node. I already have some code for putting the jar in place, but when running a job from my local machine I still get ClassNotFoundExceptions. I believe this is because Hadoop has already started before the jar is put in the lib dir, so the jar won't be picked up, but I have to investigate some more. From WHIRR-221 I understood that there is no support (yet?) for ordering of services, but if you have an idea on how to fix this let me know.

      Comments and suggestions welcome!

      1. WHIRR-384.patch
        27 kB
        Andrei Savu
      2. WHIRR-384-mahout-client.patch
        12 kB
        Frank Scholten
      3. WHIRR-384-mahout-home.patch
        13 kB
        Frank Scholten

        Activity

        Tom White added a comment -

        This looks great! Thanks for submitting it Frank.

        • How does Mahout find the Hadoop cluster? Is there some extra configuration step needed? (Especially if you install on a node where Hadoop isn't installed.)
        • How about calling the role "mahout-client"? I used a similar term over at https://github.com/tomwhite/whirr-scm for the client installation.
        • Why do you need to install the Mahout examples JAR on the cluster at all? I would think you can submit it using "hadoop jar". Either way, this could be a follow on issue. We'd probably have to add something to Hadoop to allow extra JARs to be installed.
        • Are all the dependencies needed? E.g. I can't see where jsch is used. mvn dependency:analyze should help here.
        Frank Scholten added a comment -

        Added new patch with 'mahout-client' role and without the unneeded dependencies.

        At the moment the 'mahout-client' role is oblivious to Hadoop. It unpacks the tarball and adds the mahout script to the path. The mahout script does have some checks (it looks for configuration in $HADOOP_HOME/conf), but you still need to set up a Hadoop cluster.

        Before this patch I would point HADOOP_CONF_DIR to the Hadoop configuration generated by Whirr on my local machine and run jobs from there. I guess if Whirr could generate this config on another node under $HADOOP_HOME/conf, and you give this node the 'mahout-client' role, you can submit Mahout jobs from that node in the same way. The role does not have to be added to a namenode; the node just needs Hadoop configuration.

        About the 'mahout-jar' role, my idea was to create a cluster with the Mahout jar on tasktracker nodes so you could run a Mahout job from a Java process that has compile dependencies on Mahout without having to build a job jar that contains Mahout and its dependencies. I would like to be able to set up a Java project with dependencies on Whirr, Mahout and Hadoop and launch jobs from Java without building a job jar. However, if this is problematic or not a good idea, let me know.

        Andrei Savu added a comment - - edited

        I took a quick look and overall it looks good. Before committing we should also add an integration test that starts a Hadoop cluster and submits a Mahout job. Do you have some free cycles? I can help as needed.

        Frank Scholten added a comment -

        OK. No, I don't have free cycles.

        Andrei Savu added a comment -

        Thanks Frank for the work you've done so far. This is a great start!

        Andrei Savu added a comment -

        This one still needs integration tests - moving to 0.8.0.

        Frank Scholten added a comment -

        Andrei: I created a MahoutServiceTest, but when I run it I get an 'Action handler not found' error. Any ideas?

        See https://github.com/frankscholten/whirr/tree/WHIRR-384

        Andrei Savu added a comment -

        Frank, thanks for doing more work on this. I will get back with some feedback in 1-2 hours tonight.

        Andrei Savu added a comment -

        Here are some changes I've made to a local branch to get things to build and run:
        https://github.com/andreisavu/whirr/commit/7c105f02eff8aa7b98648b3d5ab6512161bff2e5

        Adding a file with logging settings is one of the remaining things to do.

        Andrei Savu added a comment -

        The integration test is failing for me on aws-ec2 with the following exception:

        Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 575.431 sec <<< FAILURE!
        testBuildReuters(org.apache.whirr.service.mahout.integration.MahoutServiceTest)  Time elapsed: 26.333 sec  <<< ERROR!
        java.util.NoSuchElementException: no nodes matched filter: And(runningInGroup(mahout-itest-aws-ec2-andreisavu),And(ALWAYS_TRUE,withIds([mahout-client])))
            at org.jclouds.compute.internal.BaseComputeService.nodesMatchingFilterAndNotTerminatedExceptionIfNotFound(BaseComputeService.java:328)
            at org.jclouds.compute.internal.BaseComputeService.runScriptOnNodesMatching(BaseComputeService.java:555)
            at org.apache.whirr.ClusterController.runScriptOnNodesMatching(ClusterController.java:194)
            at org.apache.whirr.ClusterController.runScriptOnNodesMatching(ClusterController.java:175)
            at org.apache.whirr.service.mahout.integration.MahoutServiceTest.testBuildReuters(MahoutServiceTest.java:77)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
            ...
        

        I think this is happening because the following line:

            Predicate<NodeMetadata> mahoutClientRole = Predicates.and(alwaysTrue(), withIds("mahout-client"));
        

        should be something like this:

            Cluster.Instance mahoutInstance = getOnlyElement(filter(controller.getInstances(clusterSpec), role("mahout-client")));
            Predicate<NodeMetadata> mahoutClientRole = and(alwaysTrue(), withIds(mahoutInstance.getId()));
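
The difference between the two versions is that withIds() matches provider-assigned node IDs, while "mahout-client" is a role name, so the original filter matches nothing. A minimal stand-alone sketch of that distinction follows. The Instance class, the helper methods, and the node IDs are stand-ins invented for illustration and are not the Whirr or jclouds types.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class RoleToIdSketch {

    // Stand-in for a cluster instance: a provider-assigned ID plus its roles.
    static final class Instance {
        final String id;
        final Set<String> roles;

        Instance(String id, String... roles) {
            this.id = id;
            this.roles = new HashSet<>(Arrays.asList(roles));
        }
    }

    // Resolves a role name to the IDs of the instances that carry it.
    static Set<String> idsWithRole(List<Instance> instances, String role) {
        return instances.stream()
            .filter(i -> i.roles.contains(role))
            .map(i -> i.id)
            .collect(Collectors.toSet());
    }

    // Mimics ID-based matching: keeps instances whose ID is in the given set.
    static List<Instance> withIds(List<Instance> instances, Set<String> ids) {
        return instances.stream()
            .filter(i -> ids.contains(i.id))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Instance> cluster = Arrays.asList(
            new Instance("us-east-1/i-0001", "hadoop-namenode", "mahout-client"),
            new Instance("us-east-1/i-0002", "hadoop-datanode", "hadoop-tasktracker"));

        // Passing the role name where an ID is expected matches no node.
        System.out.println(withIds(cluster, Collections.singleton("mahout-client")).size()); // prints 0

        // Resolving the role to IDs first matches the intended node.
        Set<String> ids = idsWithRole(cluster, "mahout-client");
        System.out.println(withIds(cluster, ids).size()); // prints 1
    }
}
```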
        
        Alex Heneveld added a comment -

        Do we also want mahout in cli/pom.xml?

        Frank, nice work.

        Andrei, one other thought, or rather response to a potential smell – the role-filter expression. Is it time for a convenience class?

        ClusterScriptRunner.onCluster(cluster).onNodesWithRole("mahout-client").statement(...).withOptions(...).run()
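
To make the shape of such a convenience class concrete, here is a small stand-alone sketch of a fluent runner. It is purely hypothetical: the class name is borrowed from the proposal above, the cluster is modelled as a plain map from node ID to roles, withOptions(...) is left out, and run() only reports which nodes would receive the statement instead of executing anything.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical fluent runner modelled on the proposal; not an existing Whirr class.
class ClusterScriptRunnerSketch {
    private final Map<String, Set<String>> rolesByNodeId; // node ID -> roles
    private String role;
    private String statement;

    private ClusterScriptRunnerSketch(Map<String, Set<String>> rolesByNodeId) {
        this.rolesByNodeId = rolesByNodeId;
    }

    static ClusterScriptRunnerSketch onCluster(Map<String, Set<String>> rolesByNodeId) {
        return new ClusterScriptRunnerSketch(rolesByNodeId);
    }

    ClusterScriptRunnerSketch onNodesWithRole(String role) {
        this.role = role;
        return this;
    }

    ClusterScriptRunnerSketch statement(String statement) {
        this.statement = statement;
        return this;
    }

    // Returns one "<nodeId>: <statement>" line per node that has the role.
    List<String> run() {
        if (role == null || statement == null) {
            throw new IllegalStateException("role and statement are required");
        }
        List<String> planned = new ArrayList<>();
        for (Map.Entry<String, Set<String>> node : rolesByNodeId.entrySet()) {
            if (node.getValue().contains(role)) {
                planned.add(node.getKey() + ": " + statement);
            }
        }
        return planned;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> cluster = new LinkedHashMap<>();
        cluster.put("node-1", new HashSet<>(Arrays.asList("hadoop-namenode", "mahout-client")));
        cluster.put("node-2", new HashSet<>(Arrays.asList("hadoop-datanode")));

        System.out.println(
            onCluster(cluster).onNodesWithRole("mahout-client").statement("mahout --help").run());
    }
}
```

The point of the design is that the role-to-node lookup happens once inside the runner, so tests no longer need to assemble the predicate by hand.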
        
        Frank Scholten added a comment -

        Ok, I'll have another look.

        Frank Scholten added a comment -

        I now have a working integration test: https://github.com/frankscholten/whirr/commits/WHIRR-384

        Andrei Savu added a comment - - edited

        Thanks Frank! Here is a slightly updated patch (improved test logging, added as dep to CLI) extracted from your branch.

        +1 from me. Works like a charm both on aws-ec2 & cloudservers-uk.

        Andrei Savu added a comment -

        Let's ship this in 0.7.0.

        David Alves added a comment -

        Awesome work guys.
        +1 applies and compiles cleanly

        PS: Not really relevant for this release but we should move assertResponsesContain() to some common location, I'm seeing it copied multiple times.

        Frank Scholten added a comment -

        I tried using

        Cluster.Instance mahoutInstance = getOnlyElement(filter(controller.getInstances(clusterSpec), role("mahout-client")));
        

        but it returned null. Is this a bug or did I use the API incorrectly? I changed it to the code below to make it work.

        Cluster cluster = new ClusterStateStoreFactory().create(clusterSpec).load();
        return cluster.getInstanceMatching(anyRoleIn(newHashSet(MAHOUT_CLIENT_ROLE)));
        
        Frank Scholten added a comment -

        If I recall correctly, the filter method returned null and the getOnlyElement() method threw a NoSuchElementException.
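
For context, Guava's Iterables.getOnlyElement() throws NoSuchElementException when the iterable is empty and IllegalArgumentException when it holds more than one element, so an empty filter result would be enough to produce the exception described here. The sketch below reproduces that contract with only the JDK so it can be run on its own. It is an illustration, not Guava's implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.NoSuchElementException;

class OnlyElementSketch {

    // Returns the single element, or throws if there are zero or several.
    static <T> T getOnlyElement(Iterable<T> items) {
        Iterator<T> it = items.iterator();
        if (!it.hasNext()) {
            throw new NoSuchElementException("no element found");
        }
        T first = it.next();
        if (it.hasNext()) {
            throw new IllegalArgumentException("expected one element but found more");
        }
        return first;
    }

    public static void main(String[] args) {
        System.out.println(getOnlyElement(Arrays.asList("mahout-client-node")));
        try {
            // An empty result, e.g. when no instance matched the role filter.
            getOnlyElement(Collections.<String>emptyList());
        } catch (NoSuchElementException e) {
            System.out.println("empty input: " + e.getMessage());
        }
    }
}
```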

        Andrei Savu added a comment -

        Frank, this is not a bug. I think you have to use ClusterController.getInstances(ClusterSpec spec, ClusterStateStore stateStore) in filter(). The second approach is fine for now.

        Andrei Savu added a comment -

        Committed. Thanks Frank for making this happen. Thanks David for reviewing.


          People

          • Assignee:
            Frank Scholten
            Reporter:
            Frank Scholten
          • Votes:
            0
            Watchers:
            1
