Mahout / MAHOUT-94

Make the Taste Demo more automated.

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      It would be really cool if the Taste Demo (http://lucene.apache.org/mahout/taste.html#demo) were easier to get up and running. For instance, we could have an Ant task that automatically gets the data and puts it into the work directory, just as we do for Reuters, Wikipedia and Twenty News. Then we could also ship Jetty with the examples, so that one just needs to run

      java -jar start.jar

      to fire up the WAR and have it running (Solr does this).
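
The "get the data and put it into the work directory" step could be sketched roughly as follows. This is only an illustration under stated assumptions, not the Ant task from the attached patch: the class name `DemoDataFetcher`, the method `fetchIfMissing`, and the fixed file name `demo-data.bin` are hypothetical, and the source location is deliberately left to the caller rather than hard-coded.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: not part of Mahout or of the MAHOUT-94 patch.
class DemoDataFetcher {

    /**
     * Copies the resource at source into workDir under fileName.
     * If the target file already exists, nothing is fetched.
     */
    static Path fetchIfMissing(URI source, Path workDir, String fileName) throws IOException {
        Files.createDirectories(workDir);
        Path target = workDir.resolve(fileName);
        if (Files.exists(target)) {
            return target; // already fetched on an earlier run
        }
        // Write to a temporary file first so that an interrupted transfer
        // does not leave a truncated file under the final name.
        Path partial = Files.createTempFile(workDir, fileName, ".part");
        try (InputStream in = source.toURL().openStream()) {
            Files.copy(in, partial, StandardCopyOption.REPLACE_EXISTING);
        }
        Files.move(partial, target);
        return target;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: DemoDataFetcher <source-uri> <work-dir>");
            return;
        }
        // "demo-data.bin" is an assumed placeholder name.
        Path data = fetchIfMissing(URI.create(args[0]), Path.of(args[1]), "demo-data.bin");
        System.out.println("Data available at " + data);
    }
}
```

Skipping the fetch when the file is already present keeps repeated runs cheap, which is the same behaviour one would expect from a build target that prepares a work directory.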

      1. axis.tar
        1.83 MB
        Grant Ingersoll
      2. jetty.tar
        6.29 MB
        Grant Ingersoll
      3. MAHOUT-94.patch
        46 kB
        Grant Ingersoll
      4. MAHOUT-94.patch
        45 kB
        Grant Ingersoll

          Activity

          Sean Owen added a comment -

          One issue with getting the data automatically is that GroupLens has asked to not include their data with the distro, since they need those that use it to acknowledge their license terms. Automatically grabbing it also seems to go around that.

          The Jetty part is possible, sure. There's yet another little catch that may make it hard to make this plug-and-play: the demo needs a large heap, which is not available by default, so command-line options have to be set. That could be taken care of in an Ant target, I guess.

          Unless I remember incorrectly this will also entail sticking Axis in the distro since the web app that gets generated exposes a web service via a .jws file.
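
The heap concern above can be illustrated with a small launcher that adds the standard JVM `-Xmx` option before `-jar start.jar`. This is a sketch under assumptions: the class name `DemoLauncher` is hypothetical, and the default of `1024m` is a placeholder, since the thread does not say how much heap the demo actually needs.

```java
import java.util.List;

// Hypothetical launcher: not part of Mahout or of the MAHOUT-94 patch.
class DemoLauncher {

    /** Builds the command line for starting a jar with an explicit maximum heap size. */
    static List<String> buildCommand(String maxHeap, String jarPath) {
        return List.of("java", "-Xmx" + maxHeap, "-jar", jarPath);
    }

    public static void main(String[] args) throws Exception {
        // Assumed default; pass a different value as the first argument.
        String maxHeap = args.length > 0 ? args[0] : "1024m";
        List<String> command = buildCommand(maxHeap, "start.jar");
        System.out.println("Launching: " + String.join(" ", command));

        // Run the child JVM with this process's stdin/stdout/stderr.
        Process process = new ProcessBuilder(command).inheritIO().start();
        System.exit(process.waitFor());
    }
}
```

The same effect could be had from a build target that passes the heap option to a forked JVM; the point is only that the option has to be supplied somewhere other than by the user typing it.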

          Grant Ingersoll added a comment -

          > One issue with getting the data automatically is that GroupLens has asked to not include their data with the distro, since they need those that use it to acknowledge their license terms. Automatically grabbing it also seems to go around that.

          Hmm, not sure here. We could, in the demo instructions, tell them to read the README first. Also, I don't think we are violating any of their terms. We aren't redistributing and we aren't using it commercially. The only part we haven't done is let them know of our usage.

          > The jetty part is possible, sure. There's yet another little catch that may make it hard to make this plug-and-play and that is that the demo needs a large heap, which is not available by default, which means setting command line options. That could be taken care of in an ant target I guess.

          What I have so far is just automating the setup and packaging, but I don't actually start Jetty.

          I've excluded the Axis stuff for now, but that will need to be worked in. I'll post a patch shortly.

          Grant Ingersoll added a comment -

          Here are the Jetty libs.
          Still need to update NOTICES and LICENSE.

          Grant Ingersoll added a comment -

          Start of a patch

          Sean Owen added a comment -

          OK with the patch so far, though it removes the web services stuff and that should be undone.

          You mention this issue is about merging the other build file into the main one. Fine by me; I had kept it separate to isolate my component-specific stuff, but it could easily be merged into one with a copy-and-paste operation. I had also held back because the main build file is organized a bit differently than I would have done or would prefer (some of it is personal taste), so putting them together means either having two different styles of build script in the same file or changing one to look like the other. I didn't really want to alter how I set up the targets, but hadn't yet felt the mandate to try changing the main build file.

          Might as well dig in and start by just putting them together, sure.

          Grant Ingersoll added a comment -

          > OK with the patch so far, though it removes the web services stuff and that should be undone.

          Totally agree. I was on the plane when I did the patch and couldn't download the jars but wanted to get things working.

          As for merging the builds together, I think it is a good thing to do. The simpler the better for people; right now it still feels too separate. I can see the other algorithms being part of a WAR, possibly, if people just run them on a single machine, or else we'll have admin stuff like Hadoop.

          Grant Ingersoll added a comment -

          Here are the Axis libs. Untar in core/lib (it will create an axis directory).

          Grant Ingersoll added a comment -

          Here are some updates.

          Grant Ingersoll added a comment -

          I have asked the GroupLens people for permission to automatically download the content.

          Grant Ingersoll added a comment (edited) -

          This is mostly automated now with the Maven automation and we have decided not to automate the GroupLens download.


            People

            • Assignee: Grant Ingersoll
            • Reporter: Grant Ingersoll
            • Votes: 0
            • Watchers: 0
