Index: src/site/site.xml
===================================================================
--- src/site/site.xml (revision 1301308)
+++ src/site/site.xml (working copy)
@@ -68,6 +68,7 @@
+
Index: src/site/xdoc/index.xml
===================================================================
--- src/site/xdoc/index.xml (revision 1301308)
+++ src/site/xdoc/index.xml (working copy)
@@ -66,6 +66,7 @@
  • Getting Started with Hama.
  • Launch a Hama cluster on Clouds.
  • Hama BSP Tutorial.
  • +
  • Hama Graph Tutorial.
  • Learn about Hama and BSP by reading the documentation.
Index: src/site/xdoc/hama_graph_tutorial.xml
===================================================================
--- src/site/xdoc/hama_graph_tutorial.xml (revision 0)
+++ src/site/xdoc/hama_graph_tutorial.xml (revision 0)
@@ -0,0 +1,109 @@
+ + + + + Graph Tutorial + +
    +

    This document describes the Graph computing framework and serves as a tutorial.

    + +

    Hama includes the Graph package for vertex-centric graph computations. + Hama's Graph package lets you write Google Pregel-style applications through a simple programming interface.

    + + + +

    Writing a Hama graph application involves subclassing the predefined Vertex class. Its type parameter defines the type of both the vertex value and the messages.

    +
    +  public abstract class Vertex<M extends Writable> implements VertexInterface<M> {
    +
    +    public abstract void compute(Iterator<M> messages) throws IOException;
    +    ..
    +
    +  }
    + +

    The user overrides the compute() method, which is executed at each active vertex in every superstep. Predefined Vertex methods allow compute() to query information about the current vertex and its edges, and to send messages to other vertices. compute() can inspect the value associated with its vertex via getValue().
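To make the superstep model concrete, here is a minimal plain-Java sketch of how messages sent in one superstep are delivered in the next. It does not use Hama at all: `ToyVertex`, `MaxVertex`, and `SuperstepSketch` are hypothetical names invented for this illustration, and their methods only mirror the names in `VertexInterface`. In this toy example each vertex adopts the largest value it has received so far.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a vertex. This is NOT Hama's Vertex class.
abstract class ToyVertex {
  int id;
  int[] outEdges;              // ids of the vertices this vertex points to
  double value;
  long superstep;
  List<List<Double>> outbox;   // inboxes for the NEXT superstep, indexed by vertex id

  abstract void compute(Iterator<Double> messages);

  double getValue() { return value; }
  void setValue(double v) { value = v; }
  long getSuperstepCount() { return superstep; }

  // Queue msg for every out-neighbor; it becomes visible one superstep later.
  void sendMessageToNeighbors(double msg) {
    for (int target : outEdges) {
      outbox.get(target).add(msg);
    }
  }
}

// Toy algorithm: keep the largest value seen and forward it when it changes.
class MaxVertex extends ToyVertex {
  @Override
  void compute(Iterator<Double> messages) {
    boolean changed = getSuperstepCount() == 0; // always announce the initial value
    while (messages.hasNext()) {
      double m = messages.next();
      if (m > getValue()) {
        setValue(m);
        changed = true;
      }
    }
    if (changed) {
      sendMessageToNeighbors(getValue());
    }
  }
}

final class SuperstepSketch {
  static List<List<Double>> emptyBoxes(int n) {
    List<List<Double>> boxes = new ArrayList<>();
    for (int i = 0; i < n; i++) {
      boxes.add(new ArrayList<>());
    }
    return boxes;
  }

  // Runs a fixed number of supersteps and returns the final vertex values.
  static double[] run(double[] initial, int[][] outEdges, int supersteps) {
    int n = initial.length;
    ToyVertex[] vertices = new ToyVertex[n];
    for (int i = 0; i < n; i++) {
      vertices[i] = new MaxVertex();
      vertices[i].id = i;
      vertices[i].outEdges = outEdges[i];
      vertices[i].value = initial[i];
    }
    List<List<Double>> inbox = emptyBoxes(n);
    for (long s = 0; s < supersteps; s++) {
      List<List<Double>> next = emptyBoxes(n);
      for (ToyVertex v : vertices) {
        v.superstep = s;
        v.outbox = next;
        v.compute(inbox.get(v.id).iterator());
      }
      inbox = next; // barrier: messages become readable only now
    }
    double[] result = new double[n];
    for (int i = 0; i < n; i++) {
      result[i] = vertices[i].getValue();
    }
    return result;
  }
}
```

On a chain 0 -> 1 -> 2, a value needs one superstep per hop to travel, which is why the last vertex only sees the maximum after the third superstep.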

    + + +

    To solve the PageRank problem using Hama Graph, you can extend the Vertex class to create a PageRankVertex class. +In this example, the algorithm described in Google's Pregel paper is used. The value of a vertex represents the tentative page rank of the vertex. The graph is initialized with each vertex value equal to 1/numOfVertices. In each of the first 30 supersteps, each vertex sends its tentative page rank, divided by its number of outgoing edges, along each of those edges. +

    +From superstep 1 to superstep 30, each vertex sums the values of its incoming messages and sets its tentative page rank to (1 - 0.85) / numOfVertices + (0.85 * sum). +
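The update rule can be checked by hand without a cluster. The following is a minimal plain-Java sketch, assuming a graph given as an adjacency array in which every vertex has at least one outgoing edge. `PageRankSketch` and `run` are hypothetical names for this illustration and are not part of Hama. The loop bounds mirror the description above: values are initialized in superstep 0, updated in supersteps 1 through `maxIteration`, and messages are sent only while the superstep count is below `maxIteration`.

```java
// Hypothetical, Hama-free simulation of the PageRank update described above.
final class PageRankSketch {
  static final double DAMPING = 0.85;

  // outEdges[v] lists the targets of vertex v's outgoing edges.
  static double[] run(int[][] outEdges, int maxIteration) {
    int n = outEdges.length;
    double[] rank = new double[n];
    double[] inbox = new double[n]; // per-vertex sum of messages for this superstep

    for (int superstep = 0; superstep <= maxIteration; superstep++) {
      double[] nextInbox = new double[n];
      for (int v = 0; v < n; v++) {
        if (superstep == 0) {
          rank[v] = 1.0 / n;
        } else {
          rank[v] = (1 - DAMPING) / n + DAMPING * inbox[v];
        }
        if (superstep < maxIteration) {
          // Split the tentative rank evenly over the outgoing edges.
          for (int target : outEdges[v]) {
            nextInbox[target] += rank[v] / outEdges[v].length;
          }
        }
      }
      inbox = nextInbox; // messages are delivered in the next superstep
    }
    return rank;
  }

  public static void main(String[] args) {
    // Small example graph: 0 -> {1, 2}, 1 -> {2}, 2 -> {0}
    double[] rank = run(new int[][] { {1, 2}, {2}, {0} }, 30);
    for (int v = 0; v < rank.length; v++) {
      System.out.println("vertex " + v + ": " + rank[v]);
    }
  }
}
```

For the three-vertex graph in `main`, a single iteration already shows the arithmetic: vertex 2 receives 1/6 from vertex 0 and 1/3 from vertex 1, so its rank becomes 0.15 / 3 + 0.85 * 0.5 = 0.475.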

    + +
    +  public static class PageRankVertex extends Vertex<DoubleWritable> {
    +
    +    @Override
    +    public void compute(Iterator<DoubleWritable> messages) throws IOException {
    +      // Superstep 0: initialize every vertex to 1 / numOfVertices.
    +      if (this.getSuperstepCount() == 0) {
    +        this.setValue(new DoubleWritable(1.0 / (double) this.getNumVertices()));
    +      }
    +
    +      // Supersteps >= 1: sum the incoming messages and apply the damping factor 0.85.
    +      if (this.getSuperstepCount() >= 1) {
    +        double sum = 0;
    +        while (messages.hasNext()) {
    +          DoubleWritable msg = messages.next();
    +          sum += msg.get();
    +        }
    +
    +        double alpha = (1 - 0.85) / (double) this.getNumVertices();
    +        this.setValue(new DoubleWritable(alpha + (0.85 * sum)));
    +      }
    +
    +      // Until the iteration limit is reached, split the rank evenly over the outgoing edges.
    +      if (this.getSuperstepCount() < this.getMaxIteration()) {
    +        int numEdges = this.getOutEdges().size();
    +        sendMessageToNeighbors(new DoubleWritable(this.getValue().get()
    +            / numEdges));
    +      }
    +    }
    +
    +    public static void main(String[] args) throws IOException,
    +        InterruptedException, ClassNotFoundException {
    +      HamaConfiguration conf = new HamaConfiguration(new Configuration());
    +      GraphJob pageJob = new GraphJob(conf, PageRank.class);
    +      pageJob.setJobName("Pagerank");
    +
    +      pageJob.setVertexClass(PageRankVertex.class);
    +      pageJob.setMaxIteration(30);
    +
    +      pageJob.setInputPath(new Path(args[0]));
    +      pageJob.setOutputPath(new Path(args[1]));
    +
    +      if (args.length == 3) {
    +        pageJob.setNumBspTask(Integer.parseInt(args[2]));
    +      }
    +
    +      pageJob.setInputFormat(SequenceFileInputFormat.class);
    +      pageJob.setPartitioner(HashPartitioner.class);
    +      pageJob.setOutputFormat(SequenceFileOutputFormat.class);
    +      pageJob.setOutputKeyClass(Text.class);
    +      pageJob.setOutputValueClass(DoubleWritable.class);
    +
    +      long startTime = System.currentTimeMillis();
    +      if (pageJob.waitForCompletion(true)) {
    +        System.out.println("Job Finished in "
    +            + (double) (System.currentTimeMillis() - startTime) / 1000.0
    +            + " seconds");
    +      }
    +    }
    +  }
    + + +
Index: graph/src/main/java/org/apache/hama/graph/VertexInterface.java
===================================================================
--- graph/src/main/java/org/apache/hama/graph/VertexInterface.java (revision 1301308)
+++ graph/src/main/java/org/apache/hama/graph/VertexInterface.java (working copy)
@@ -25,23 +25,31 @@
 public interface VertexInterface {
-  /**
-   * @return the vertex ID.
-   */
+  /** @return the unique identification for the vertex. */
   public String getVertexID();
-
+  /** @return the number of vertices in the input graph. */
+  public long getNumVertices();
+  /** The user-defined function */
   public void compute(Iterator messages) throws IOException;
-
+  /** @return a list of outgoing edges of this vertex in the input graph. */
   public List getOutEdges();
-
+  /** Sends a message to another vertex. */
   public void sendMessage(Edge e, MSGTYPE msg) throws IOException;
-
+  /** Sends a message to neighbors */
   public void sendMessageToNeighbors(MSGTYPE msg) throws IOException;
-
+  /** @return the superstep number of the current superstep (starting from 0). */
   public long getSuperstepCount();
-
+  /**
+   * Sets the vertex value
+   *
+   * @param value
+   */
   public void setValue(MSGTYPE value);
-
+  /**
+   * Gets the vertex value
+   *
+   * @return value
+   */
   public MSGTYPE getValue();
 }