Hi,

my name is Christoph Nagel. I'm a student at the Technical University of Berlin and am taking the course given by Isabel Drost and Sebastian Schelter.

My task is to implement the PageRank algorithm for the case where the PageRank vector fits in memory.

For the computation I used the naive algorithm described in the book 'Mining of Massive Datasets' by Rajaraman & Ullman (http://www-scf.usc.edu/~csci572/2012Spring/UllmanMiningMassiveDataSets.pdf).

Matrix and vector multiplication are done with Mahout methods.
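As a rough illustration of the iteration, here is a minimal sketch in plain Python rather than Mahout. It assumes a column-stochastic transition matrix M and the usual update v' = beta * M * v + (1 - beta) / n; the function name, the default damping factor of 0.85, the stopping tolerance, and the four-node graph are all assumptions made for the example and are not taken from the patch.

```python
def pagerank(matrix, beta=0.85, tol=1e-10, max_iter=100):
    """Power iteration v' = beta * M v + (1 - beta) / n.

    `matrix` is a dense, column-stochastic list of rows (hypothetical
    representation; the actual code uses Mahout matrices and vectors).
    """
    n = len(matrix)
    v = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        nxt = [
            beta * sum(m_ij * v_j for m_ij, v_j in zip(row, v)) + (1.0 - beta) / n
            for row in matrix
        ]
        # stop once the L1 change between iterations is negligible
        if sum(abs(a - b) for a, b in zip(nxt, v)) < tol:
            return nxt
        v = nxt
    return v


# Illustrative graph: A->B,C,D  B->A,D  C->A  D->B,C (column j holds node j's out-links)
M = [
    [0.0, 0.5, 1.0, 0.0],
    [1 / 3, 0.0, 0.0, 0.5],
    [1 / 3, 0.0, 0.0, 0.5],
    [1 / 3, 0.5, 0.0, 0.0],
]

ranks = pagerank(M, beta=1.0)  # beta=1.0 disables taxation; tends to (3/9, 2/9, 2/9, 2/9)
```

With beta below 1 the result still sums to 1, but the ranks shift slightly towards the uniform distribution.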

Most of the work is transforming the input graph, which has to consist of a nodes file and an edges file.

Format of nodes file: <node>\n

Format of edges file: <startNode>\t<endNode>\n

Therefore I created the following classes:

- LineIndexer: assigns each line an index
- EdgesToIndex: indexes the nodes of the edges
- EdgesIndexToTransitionMatrix: creates the transition matrix
- Pagerank: computes PR from transition matrix
- JoinNodesWithPagerank: creates the joined output
- PagerankExampleJob: does the complete job
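The preprocessing steps in the list above could look roughly like the following in-memory sketch. It is plain Python, not the MapReduce code from the patch; the function names only mirror the class names for readability, and the sample nodes and edges are made up for illustration. It assumes one node name per line in the nodes file and tab-separated pairs in the edges file, as described above.

```python
def index_lines(lines):
    """Assign each non-empty line (a node name) a zero-based index."""
    names = [line.strip() for line in lines if line.strip()]
    return {name: i for i, name in enumerate(names)}


def edges_to_index(edge_lines, index):
    """Replace the node names in '<startNode>\\t<endNode>' lines by their indices."""
    edges = []
    for line in edge_lines:
        if not line.strip():
            continue
        start, end = line.rstrip("\n").split("\t")
        edges.append((index[start], index[end]))
    return edges


def transition_matrix(edges, n):
    """Build a dense matrix where entry [end][start] is 1 / out-degree(start)."""
    out_degree = [0] * n
    for start, _ in edges:
        out_degree[start] += 1
    matrix = [[0.0] * n for _ in range(n)]
    for start, end in edges:
        matrix[end][start] += 1.0 / out_degree[start]
    return matrix


def join_nodes_with_ranks(index, ranks):
    """Map each node name back to its rank."""
    return {node: ranks[i] for node, i in index.items()}


# Made-up sample input in the two file formats
node_lines = ["A\n", "B\n", "C\n", "D\n"]
edge_lines = ["A\tB\n", "A\tC\n", "A\tD\n", "B\tA\n", "B\tD\n", "C\tA\n", "D\tB\n", "D\tC\n"]

index = index_lines(node_lines)
edges = edges_to_index(edge_lines, index)
matrix = transition_matrix(edges, len(index))
```

Note that a node without outgoing edges produces an all-zero column in this sketch, so the matrix is only column-stochastic when every node has at least one out-link.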

Each class except PagerankExampleJob has a test, and I used the example from the book for evaluation.

My bad, I didn't know that Mahout's org.apache.mahout.math.Matrix and its friends were so full-featured. Thanks. Then it shouldn't be a problem.

Actually I had come across MAHOUT-879 (Remove all graph algorithms with the exception of PageRank) and was just checking with you whether large-scale sparse matrix-vector multiplication and PageRank implementations in MapReduce are welcome.