Mahout / MAHOUT-457

ItemSimilarityJob and RecommenderJob don't work on Amazon ElasticMapReduce

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.4
    • Component/s: None
    • Labels: None

    Description

      I'm currently evaluating ItemSimilarityJob and RecommenderJob on ElasticMapReduce. We seem to have some small problems with S3, mostly because the code calls FileSystem.get(conf) where it needs to call FileSystem.get(path.toUri(), conf). I will create a patch for that in the next few days.
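To make the difference concrete: FileSystem.get(conf) returns whatever filesystem the configuration names as the default, whereas FileSystem.get(path.toUri(), conf) chooses the filesystem from the path's own scheme. The sketch below models only that selection rule with java.net.URI. It does not call Hadoop and simplifies the real lookup; `FsSelectionSketch`, `schemeFromDefault`, `schemeFromPath`, the `hdfs://namenode:9000` default and the `/tmp/out` path are all hypothetical names chosen for illustration.

```java
import java.net.URI;

// Standalone model of the selection rule; this is not Hadoop code.
class FsSelectionSketch {

    // Models FileSystem.get(conf): the path plays no role,
    // only the configured default filesystem is consulted.
    static String schemeFromDefault(URI defaultFs) {
        return defaultFs.getScheme();
    }

    // Models FileSystem.get(path.toUri(), conf): the path's scheme wins,
    // and the default is used only when the path carries no scheme.
    static String schemeFromPath(URI path, URI defaultFs) {
        return path.getScheme() != null ? path.getScheme() : defaultFs.getScheme();
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("hdfs://namenode:9000");           // assumed cluster default
        URI s3Path = URI.create("s3://testingbucket-12345/tmp/out");  // hypothetical S3 path

        // With the default-only lookup, an S3 path would be handed to the default filesystem.
        System.out.println(schemeFromDefault(defaultFs));
        // With the path-aware lookup, the S3 path selects the S3 filesystem.
        System.out.println(schemeFromPath(s3Path, defaultFs));
        // A scheme-less path still falls back to the default.
        System.out.println(schemeFromPath(URI.create("/tmp/out"), defaultFs));
    }
}
```

On a cluster where the default filesystem is HDFS and the job data lives in S3, the first lookup is the one that goes wrong, which matches the problems described above.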

      I'm writing this mail because I've run into another problem that I currently can't solve. RecommenderJob emulates MultipleInputs (which, AFAIK, is currently missing in Hadoop 0.20) by reading data from a combined path that is built like this:

      new Path(prePartialMultiplyPath1 + "," + prePartialMultiplyPath2)

      My job always fails with this exception:

      java.lang.IllegalArgumentException: Invalid hostname in URI s3:/testingbucket-12345/tmp/prePartialMultiply2
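One plausible reading of this exception, offered as an assumption rather than a confirmed diagnosis: Hadoop's Path collapses "//" into "/" in the path component, so the second URI embedded in the combined string loses its authority. When the string is later split on the comma, the second piece has no host. The standalone sketch below mimics that with java.net.URI and plain string operations. It does not use Hadoop; `CombinedPathSketch` and `collapseSlashes` are hypothetical stand-ins for the normalization, and the prePartialMultiply1 location is assumed by analogy with the one in the error message.

```java
import java.net.URI;

// Standalone model of how a comma-joined pair of S3 URIs can lose a host.
class CombinedPathSketch {

    // Hypothetical stand-in for the normalization: keep "scheme://authority"
    // untouched and collapse "//" to "/" in everything after it.
    static String collapseSlashes(String combined) {
        int schemeEnd = combined.indexOf("://");
        if (schemeEnd < 0) {
            return combined.replace("//", "/");
        }
        int pathStart = combined.indexOf('/', schemeEnd + 3);
        if (pathStart < 0) {
            return combined; // no path component, nothing to collapse
        }
        String head = combined.substring(0, pathStart);
        String tail = combined.substring(pathStart).replace("//", "/");
        return head + tail;
    }

    public static void main(String[] args) {
        String first = "s3://testingbucket-12345/tmp/prePartialMultiply1";  // assumed location
        String second = "s3://testingbucket-12345/tmp/prePartialMultiply2";

        // Treat the comma-joined string as a single path, then split it again.
        String[] parts = collapseSlashes(first + "," + second).split(",");

        for (String part : parts) {
            // The second part comes back as "s3:/..." and has no authority.
            System.out.println(part + " -> authority: " + URI.create(part).getAuthority());
        }
    }
}
```

If this reading is right, passing the two locations to the job as separate input paths, rather than as one Path built from a comma-joined string, would avoid the mangled URI.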

      Attachments

        1. MAHOUT-457.patch
          11 kB
          Sebastian Schelter
        2. MAHOUT-457-2.patch
          12 kB
          Sebastian Schelter
        3. MAHOUT-457-3.patch
          15 kB
          Sebastian Schelter

          People

            Assignee: Unassigned
            Reporter: Sebastian Schelter (ssc)