Description
I think we should add a second type of aggregator that acts on BSP peers instead of vertices. This would make it easier for graph framework programmers and users to add features.
The interface I have in mind:
// This will run on slave peers
// Get the value of the peer we want to monitor and act on it
public Writable getPeerValue(GraphJobRunner<V, E, M> graphJobRunner);
// This will run on master aggregator
// Compose values
public void aggregate(Writable v);
// Get the composed value
public Writable getValue();
// This will run on slave peers
// Set the composed value to the peer
public void setPeer(GraphJobRunner<V, E, M> graphJobRunner, Writable v);
// Reset the instance so it can be reused instead of recreated
public void reset();
Examples:
1) The graph framework "stop" condition can act as a peer aggregator
a) check on every peer whether any vertices changed (an int variable)
b) send all the integers to the master aggregator, which decides whether we need to stop
c) slaves get a message telling the peer to stop or continue
2) Counting all vertices in every superstep (when the set of vertices changes dynamically) can act as a peer aggregator
a) get the number of vertices on every peer
b) sum all the vertex counts on the master aggregator
c) slaves update their values
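A minimal sketch of how example 2 could map onto the proposed interface. To keep it self-contained, this uses assumptions that are not part of the proposal: `Peer` is a hypothetical stand-in for `GraphJobRunner<V, E, M>`, `Long` stands in for `Writable`, and `runRound` simulates the slave/master exchange in a single process with no real messaging or sync.

```java
import java.util.ArrayList;
import java.util.List;

public class PeerAggregatorSketch {

    /** Hypothetical stand-in for GraphJobRunner: the local state of one BSP peer. */
    static final class Peer {
        long localVertexCount;   // vertices currently held by this peer
        long globalVertexCount;  // last aggregated total pushed back to this peer

        Peer(long localVertexCount) {
            this.localVertexCount = localVertexCount;
        }
    }

    /** The proposed interface, with Long and Peer replacing Writable and GraphJobRunner. */
    interface PeerAggregator {
        Long getPeerValue(Peer peer);     // runs on slave peers
        void aggregate(Long v);           // runs on the master aggregator
        Long getValue();                  // composed value
        void setPeer(Peer peer, Long v);  // runs on slave peers
        void reset();                     // reuse the instance
    }

    /** Example 2: sums the vertex counts of all peers. */
    static final class VertexCountAggregator implements PeerAggregator {
        private long sum;

        public Long getPeerValue(Peer peer) { return peer.localVertexCount; }
        public void aggregate(Long v) { sum += v; }
        public Long getValue() { return sum; }
        public void setPeer(Peer peer, Long v) { peer.globalVertexCount = v; }
        public void reset() { sum = 0; }
    }

    /** Simulates one aggregation round in-process and returns the composed value. */
    static long runRound(PeerAggregator aggregator, List<Peer> peers) {
        aggregator.reset();
        // (a) every peer reports its value, (b) the master composes the values
        for (Peer peer : peers) {
            aggregator.aggregate(aggregator.getPeerValue(peer));
        }
        long total = aggregator.getValue();
        // (c) every peer receives the composed value
        for (Peer peer : peers) {
            aggregator.setPeer(peer, total);
        }
        return total;
    }

    public static void main(String[] args) {
        List<Peer> peers = new ArrayList<>();
        peers.add(new Peer(10));
        peers.add(new Peer(20));
        peers.add(new Peer(30));
        long total = runRound(new VertexCountAggregator(), peers);
        System.out.println("total vertices = " + total);
    }
}
```

The "stop" condition from example 1 would follow the same shape: `getPeerValue` would return the local change count, and `setPeer` would tell the peer to stop when the composed value is zero.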
For aggregators to work properly, we need a sync (superstep) before every graph iteration. However, we could also introduce a "lazy" mode in which the user code runs at the same time as the aggregation (the "stop" condition already works this way).
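One possible reading of the "lazy" mode, sketched under assumptions: the peer values travel with the sync that already ends each superstep, so the composed value only becomes visible at the start of the next superstep and no extra sync is needed. The class name, the `run` method, and the `int[][]` input (change counts per superstep, per peer) are all hypothetical and only simulate the timing in one process.

```java
public class LazyStopSketch {

    /**
     * Runs supersteps until the aggregate carried over from the previous
     * superstep reports zero changes. Returns the number of supersteps executed.
     *
     * changesPerSuperstep[s][p] is the change count that peer p would report
     * in superstep s (hypothetical input used only for this simulation).
     */
    static int run(int[][] changesPerSuperstep) {
        int superstep = 0;
        long pendingTotal = -1; // aggregate delivered by the last sync; -1 means none yet

        while (superstep < changesPerSuperstep.length) {
            // Start of superstep: apply the aggregate from the previous sync.
            if (pendingTotal == 0) {
                break; // no peer changed anything last superstep, so stop
            }

            // User code runs; each peer reports its local change count.
            long total = 0;
            for (int changes : changesPerSuperstep[superstep]) {
                total += changes;
            }

            // The regular end-of-superstep sync delivers the total.
            pendingTotal = total;
            superstep++;
        }
        return superstep;
    }

    public static void main(String[] args) {
        int[][] changes = { {3, 2}, {1, 0}, {0, 0}, {5, 5} };
        System.out.println("supersteps executed = " + run(changes));
    }
}
```

The trade-off in this reading is that the decision lags by one superstep: the superstep with zero changes still runs to completion before the peers learn they can stop.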