Index: src/site/site.xml
===================================================================
--- src/site/site.xml	(revision 1359451)
+++ src/site/site.xml	(working copy)
@@ -27,16 +27,16 @@
 org.apache.maven.skins
- maven-default-skin
- 1.0
+ maven-fluido-skin
+ 1.2.3-SNAPSHOT
-
+
-
-
+
+
@@ -62,8 +62,8 @@
-
-
+
+
@@ -72,8 +72,8 @@
-
-
+
+
Index: src/site/resources/css/site.css
===================================================================
--- src/site/resources/css/site.css	(revision 1359451)
+++ src/site/resources/css/site.css	(working copy)
@@ -36,7 +36,7 @@
 h5 {
   font-size: 15px;
-  margin-top: -0.8em;
+  margin-top: -0.1em;
 }
 
 table {
@@ -56,6 +56,12 @@
   background: #edf7f4;
   background-color: #edf7f4;
 }
+
+pre {
+  width: 95%;
+  margin: 20px 20px 20px 20px;
+}
+
 .green a:link, .green a:active, .green a:visited { color: #0a4d39; }
 .green a:hover { color: #888800; }
 .green h3 {
@@ -90,3 +96,15 @@
   padding-left: 1em;
   list-style-position: inside;
 }
+
+#leftColumn li.none{
+  text-indent:-1em;
+  margin-left:-1em;
+}
+
+#leftColumn h3 {
+  font-size: 16px;
+  margin-left:-0.7em;
+}
+
+body.topBarDisabled{padding-top:0px;}
\ No newline at end of file
Index: src/site/xdoc/downloads.xml
===================================================================
--- src/site/xdoc/downloads.xml	(revision 1359451)
+++ src/site/xdoc/downloads.xml	(working copy)
@@ -42,10 +42,10 @@
 which is located within each download directory.

Always use the signature files to verify the authenticity of the distribution, e.g., -
  % pgpk -a KEYS
+    
  % pgpk -a KEYS
   % pgpv hama-x.x.x.tar.gz.asc
or, -
  % gpg --import KEYS
+    
  % gpg --import KEYS
   % gpg --verify hama-x.x.x.tar.gz.asc
We offer MD5 hashes as an alternative to validate the integrity of the downloaded files. A unix program called md5 or md5sum is included in many unix distributions.
Index: src/site/xdoc/getting_started_with_hama.xml
===================================================================
--- src/site/xdoc/getting_started_with_hama.xml	(revision 1359451)
+++ src/site/xdoc/getting_started_with_hama.xml	(working copy)
@@ -62,7 +62,7 @@
  • BSPMaster and Zookeeper settings - Figure out where to run your HDFS namenode and BSPMaster. Set the variable bsp.master.address to the BSPMaster's intended host:port. Set the variable fs.default.name to the HDFS Namenode's intended host:port.

Here's an example of a hama-site.xml file:

-
+
   <?xml version="1.0"?>
   <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
   <configuration>
@@ -99,7 +99,7 @@
 
 

If you are managing your own ZooKeeper, you have to specify the port number as below:

-
+
   <property>
     <name>hama.zookeeper.property.clientPort</name>
     <value>2181</value>
@@ -111,16 +111,16 @@
     

NOTE: Skip this step if you're in Local Mode.
Run the command:

-
+  
   % $HAMA_HOME/bin/start-bspd.sh

This will start up a BSPMaster, GroomServers, and ZooKeeper on your machine.

Run the command:

-
+  
   % $HAMA_HOME/bin/stop-bspd.sh

to stop all the daemons running on your cluster.

-
+  
   % $HAMA_HOME/bin/hama jar hama-examples-x.x.x.jar [args]
Index: src/site/xdoc/hama_bsp_tutorial.xml
===================================================================
--- src/site/xdoc/hama_bsp_tutorial.xml	(revision 1359451)
+++ src/site/xdoc/hama_bsp_tutorial.xml	(working copy)
@@ -40,7 +40,7 @@
The extending class must override the bsp() method, which is declared like this:

-
+    
   public abstract void bsp(BSPPeer<K1, V1, K2, V2, M extends Writable> peer) throws IOException, 
     SyncException, InterruptedException;
@@ -56,7 +56,7 @@

After your own BSP is created, you need to configure a BSPJob and submit it to a Hama cluster for execution. The BSP job configuration and submission interfaces are almost the same as those for MapReduce jobs:

-
+    
   HamaConfiguration conf = new HamaConfiguration();
   BSPJob job = new BSPJob(conf, MyBSP.class);
   job.setJobName("My BSP program");
@@ -68,9 +68,9 @@
     

See the section below for a more detailed description of the BSP user interfaces.

-
Inputs and Outputs
+

Inputs and Outputs

When setting up a BSPJob, you can provide an Input/OutputFormat and Paths like this:

-
+    
  job.setInputPath(new Path("/tmp/sequence.dat"));
   job.setInputFormat(org.apache.hama.bsp.SequenceFileInputFormat.class);
   or,
@@ -87,7 +87,7 @@
     

Then you can read the input and write the output from the methods of your BSP class, which receive a BSPPeer as a parameter. The BSPPeer provides the communication, counter, and I/O interfaces. In this case we read a normal text file:

-
+    
  @Override
   public final void bsp(
       BSPPeer<LongWritable, Text, Text, LongWritable, Text> peer)
@@ -108,7 +108,7 @@
     There is also a function which allows you to re-read the input from the beginning.
     This snippet reads the input five times:
     

-
+    
   for(int i = 0; i < 5; i++){
     LongWritable key = new LongWritable();
     Text value = new Text();
@@ -119,7 +119,7 @@
     peer.reopenInput();
   }
-
Communication
+

Communication

Hama BSP provides simple but powerful communication APIs for many purposes. We tried to follow the standard library of the BSP world as closely as possible. The following table describes all the methods you can use:

@@ -137,7 +137,7 @@

The send() and all the other functions are very flexible. Here is an example that sends a message to all peers:

-
+    
   @Override
   public void bsp(
       BSPPeer<NullWritable, NullWritable, Text, DoubleWritable, Text> peer)
@@ -150,7 +150,7 @@
     peer.sync();
   }
-
Synchronization
+

Synchronization

When all the processes have entered the barrier via the sync() method, Hama proceeds to the next superstep.
@@ -162,7 +162,7 @@
For example, the sync() method can also be called in a for loop, so that you can program iterative methods sequentially:

-
+    
   @Override
   public void bsp(
       BSPPeer<NullWritable, NullWritable, Text, DoubleWritable, Text> peer)
@@ -191,7 +191,7 @@
 
     
     

Here is a BSP-based Pi calculation example, together with the code that submits it to a Hama cluster:

-
+    
   private static Path TMP_OUTPUT = new Path("/tmp/pi-" + System.currentTimeMillis());
 
   public static class MyEstimator extends
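The barrier behaviour described for sync() in this file can be illustrated outside Hama with the JDK's CyclicBarrier. The following is only an analogy, a minimal sketch under stated assumptions: the class name SuperstepDemo and the method run are hypothetical, threads stand in for BSP peers, and barrier.await() plays the role that peer.sync() plays in the tutorial. No Hama API is used.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical analogy for BSP supersteps; uses only the JDK, not Hama. */
final class SuperstepDemo {

    /** Runs `peers` threads through `supersteps` barrier rounds and returns the rounds completed. */
    static int run(int peers, int supersteps) {
        AtomicInteger completedSupersteps = new AtomicInteger();
        // The barrier action runs once per round, after every peer has arrived.
        CyclicBarrier barrier = new CyclicBarrier(peers, completedSupersteps::incrementAndGet);

        Thread[] threads = new Thread[peers];
        for (int p = 0; p < peers; p++) {
            threads[p] = new Thread(() -> {
                try {
                    for (int step = 0; step < supersteps; step++) {
                        // Local computation and message sends would happen here.
                        barrier.await(); // stands in for peer.sync()
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[p].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return completedSupersteps.get();
    }

    public static void main(String[] args) {
        System.out.println("supersteps completed: " + run(3, 4));
    }
}
```

No thread starts round k+1 until all threads have finished round k, which is the same guarantee the tutorial relies on when it calls sync() inside a for loop.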
Index: src/site/xdoc/hama_graph_tutorial.xml
===================================================================
--- src/site/xdoc/hama_graph_tutorial.xml	(revision 1359451)
+++ src/site/xdoc/hama_graph_tutorial.xml	(working copy)
@@ -30,7 +30,7 @@
         
 
     

Writing a Hama graph application involves subclassing the predefined Vertex class. Its template arguments define three value types, associated with vertices, edges, and messages.

-
+    
   public abstract class Vertex<V extends Writable, E extends Writable, M extends Writable>
       implements VertexInterface<V, E, M> {
 
@@ -48,7 +48,7 @@
 From Superstep 1 to 30, each vertex sums up the values arriving on all its messages and sets its tentative page rank to (1 - 0.85) / numOfVertices + (0.85 * sum).
    

-
+    
   public static class PageRankVertex extends
       Vertex<Text, NullWritable, DoubleWritable> {
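The rank update quoted in this hunk, (1 - 0.85) / numOfVertices + (0.85 * sum), can be checked in isolation. The following is a minimal plain-Java sketch and not Hama's PageRankVertex: the class name PageRankStep and the method tentativeRank are hypothetical, and the incoming message values are passed as a plain array instead of Hama messages. The damping factor 0.85 is the one given in the tutorial text.

```java
/** Hypothetical helper illustrating the tutorial's rank update; not part of Hama. */
final class PageRankStep {

    // Damping factor quoted in the tutorial text.
    static final double DAMPING = 0.85;

    /** Sums the incoming message values and applies (1 - d) / n + d * sum. */
    static double tentativeRank(double[] incoming, long numOfVertices) {
        double sum = 0.0;
        for (double value : incoming) {
            sum += value;
        }
        return (1.0 - DAMPING) / numOfVertices + DAMPING * sum;
    }

    public static void main(String[] args) {
        // Four vertices; this vertex receives two messages of 0.125 each.
        double rank = tentativeRank(new double[] {0.125, 0.125}, 4);
        System.out.println("tentative rank: " + rank);
    }
}
```

With four vertices and an incoming sum of 0.25, the two terms are 0.15 / 4 and 0.85 * 0.25, so the tentative rank comes out at approximately 0.25.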
 
Index: src/site/xdoc/hama_on_clouds.xml
===================================================================
--- src/site/xdoc/hama_on_clouds.xml	(revision 1359451)
+++ src/site/xdoc/hama_on_clouds.xml	(working copy)
@@ -27,7 +27,7 @@
     
     
     

The following commands install Whirr and start a 5 node Hama cluster on Amazon EC2 in 5 minutes or less. -

+    
   % curl -O http://www.apache.org/dist/whirr/whirr-0.x.0/whirr-0.x.0.tar.gz
   % tar zxf whirr-0.x.0.tar.gz; cd whirr-0.x.0
 
@@ -38,7 +38,7 @@
   % bin/whirr launch-cluster --config recipes/hama-ec2.properties --private-key-file ~/.ssh/id_rsa_whirr

-
+    
   % cd /usr/local/hama-0.x.0
   % bin/hama jar hama-examples-x.x.x.jar [args]
Index: src/site/xdoc/index.xml
===================================================================
--- src/site/xdoc/index.xml	(revision 1359451)
+++ src/site/xdoc/index.xml	(working copy)
@@ -26,8 +26,8 @@

Apache Hama is a pure BSP (Bulk Synchronous Parallel) computing framework on top of HDFS (Hadoop Distributed File System) for massive scientific computations such as matrix, graph and network algorithms.

- -
+
+

Recent News

  • June 31, 2012: release 0.5.0 available [downloads]
  • @@ -37,7 +37,7 @@
  • June 2, 2011: release 0.2.0 available
  • Apr 30, 2010: Introduced in the BSP Worldwide
-
+

Today, many practical data processing applications require a more flexible programming abstraction model that can run on highly scalable, massive data systems (e.g., HDFS, HBase, etc.). A message-passing paradigm beyond the MapReduce framework would increase flexibility in communication. The Bulk Synchronous Parallel (BSP) model fills the bill. Some of its significant advantages over MapReduce and MPI are: