# Tutorial for playing with Mahout's Spark shell

## Details

• Type: Improvement
• Status: Closed
• Priority: Major
• Resolution: Fixed
• Affects Version/s: None
• Fix Version/s:
• Component/s:
• Labels: None

## Description

I have created a tutorial for setting up the Spark shell and implementing a simple linear regression algorithm. I'd love to make this part of the website; could someone give it a review?

https://github.com/sscdotopen/krams/blob/master/linear-regression-cereals.md

PS: If you want to try out the code, you have to apply the patch from MAHOUT-1532 to your sources.

## Activity

Sebastian Schelter added a comment -

Updated tutorial to also mention caching.

Andrew Palumbo added a comment -

Sebastian Schelter, I followed the tutorial step by step and everything worked without any issues. I found it very easy to follow. The cut-and-paste OLS example worked for me easily. Very nice!

Dmitriy Lyubimov added a comment -

This is super cool.

One note I would add is that X is tall and skinny (so that X'X fits in memory but X does not). Otherwise it looks as if everything happens in-core, which is not really the case.
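To illustrate the tall-and-skinny point, here is a minimal sketch in plain Scala. It deliberately uses no Mahout or Spark APIs, and all names in it (`TallSkinnyOls`, `normalEquations`, `solve`, `fit`) are hypothetical, chosen only for this example. It assumes X has n rows and k columns with n much larger than k: the rows are streamed once, and only the k x k matrix X'X and the k-vector X'y are held in memory before the small system is solved.

```scala
// Illustrative only: plain Scala, not the tutorial's code and not Mahout's API.
object TallSkinnyOls {
  type Row = Array[Double]

  /** Streams over (x, y) pairs once, accumulating X'X (k x k) and X'y (k). */
  def normalEquations(rows: Iterator[(Row, Double)], k: Int): (Array[Array[Double]], Array[Double]) = {
    val xtx = Array.fill(k, k)(0.0)
    val xty = Array.fill(k)(0.0)
    for ((x, y) <- rows) {
      var i = 0
      while (i < k) {
        xty(i) += x(i) * y
        var j = 0
        while (j < k) { xtx(i)(j) += x(i) * x(j); j += 1 }
        i += 1
      }
    }
    (xtx, xty)
  }

  /** Solves a * beta = b by Gaussian elimination with partial pivoting. */
  def solve(a: Array[Array[Double]], b: Array[Double]): Array[Double] = {
    val n = b.length
    val m = Array.tabulate(n)(i => a(i) :+ b(i)) // augmented matrix [a | b]
    for (col <- 0 until n) {
      val pivot = (col until n).maxBy(r => math.abs(m(r)(col)))
      val tmp = m(col); m(col) = m(pivot); m(pivot) = tmp
      for (r <- col + 1 until n) {
        val f = m(r)(col) / m(col)(col)
        for (c <- col to n) m(r)(c) -= f * m(col)(c)
      }
    }
    // Back substitution on the upper-triangular system.
    val beta = new Array[Double](n)
    for (i <- n - 1 to 0 by -1) {
      val s = (i + 1 until n).map(j => m(i)(j) * beta(j)).sum
      beta(i) = (m(i)(n) - s) / m(i)(i)
    }
    beta
  }

  /** Ordinary least squares via the normal equations (X'X) beta = X'y. */
  def fit(rows: Iterator[(Row, Double)], k: Int): Array[Double] = {
    val (xtx, xty) = normalEquations(rows, k)
    solve(xtx, xty)
  }
}
```

For example, `TallSkinnyOls.fit(Iterator((Array(1.0, 0.0), 1.0), (Array(1.0, 1.0), 3.0), (Array(1.0, 2.0), 5.0), (Array(1.0, 3.0), 7.0)), 2)` recovers an intercept close to 1 and a slope close to 2. In a distributed setting the accumulation step is the part that runs over the partitions of X, while the final solve happens on the small in-memory result.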

Dmitriy Lyubimov added a comment -

We should probably also modify the shell so that this import is not needed:

import org.apache.mahout.math.Vector

Sebastian Schelter added a comment -

Added to the website. I also added a new top-level navigation item called "Spark". Shout if you don't like that naming.

Dmitriy Lyubimov added a comment -

Sebastian Schelter, do you mind if I rewrite the math symbols in LaTeX/MathJax?

Sebastian Schelter added a comment -

No, go ahead, that's a great idea.

Dmitriy Lyubimov added a comment -

Done in staging, but for some reason the site doesn't publish for me. CMS infra problems again, perhaps. Staging looks fine.

Hudson added a comment -

FAILURE: Integrated in Mahout-Quality #2610 (See https://builds.apache.org/job/Mahout-Quality/2610/)
MAHOUT-1542 Tutorial for playing with Mahout's Spark shell (ssc: rev 1595595)

• /mahout/trunk/CHANGELOG
• /mahout/trunk/spark-shell/src/main/scala/org/apache/mahout/sparkbindings/shell/MahoutSparkILoop.scala

## People

• Assignee: Sebastian Schelter
• Reporter: Sebastian Schelter