Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Version: 3.1.0-incubating
Description
Currently, when you do:
graph.compute().program(PageRankVertexProgram).submit()
We are pulling the entire graph into the OLAP engine. We should allow the user to limit the amount of data pulled via a "vertex query"-type filter. For instance, we could support the following two new methods on GraphComputer:
graph.compute().program(PageRankVertexProgram).vertices(hasLabel('person')).edges(out, hasLabel('knows','friend').has('weight',gt(0.8))).submit()
The two methods would be defined as:
public interface GraphComputer {
  ...
  GraphComputer vertices(final Traversal<Vertex,Vertex> vertexFilter);
  GraphComputer edges(final Direction direction, final Traversal<Edge,Edge> edgeFilter);
}
If the user does NOT provide a vertices() (or edges()) call, then the Traversal is assumed to be IdentityTraversal. Finally, in terms of execution order, vertices() is applied first, and if it yields "false" for a vertex, edges() is not called for that vertex. Otherwise, edges() is applied to all of the vertex's incoming and outgoing edges.
I don't really like Direction there, and perhaps it's just:
GraphComputer edges(final Traversal<Vertex,Edge> edgeFilter)
And then all edges that pass through are added to the OLAP vertex. You don't want both directions? Then it's outE('knows','friend').has('weight',gt(0.8)).
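
To make the proposed semantics concrete, here is a minimal, self-contained sketch of the execution order described above, using the Direction-less edges() form. It does not use TinkerPop: Vertex, Edge, load(), ALL_VERTICES, ALL_EDGES, and strongOutKnowsOrFriend() are hypothetical stand-ins invented for this illustration, with a Predicate playing the role of the vertex Traversal and a Function from a vertex to its retained edges playing the role of the Traversal<Vertex,Edge>.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical stand-ins for this sketch only; these are not TinkerPop classes.
final class GraphFilterSketch {

    static final class Edge {
        final String label;
        final boolean outgoing; // true for an out-edge of the owning vertex
        final double weight;

        Edge(String label, boolean outgoing, double weight) {
            this.label = label;
            this.outgoing = outgoing;
            this.weight = weight;
        }
    }

    static final class Vertex {
        final String label;
        final List<Edge> edges; // both incoming and outgoing edges

        Vertex(String label, List<Edge> edges) {
            this.label = label;
            this.edges = edges;
        }
    }

    // Identity defaults, standing in for the case where no filter is supplied.
    static final Predicate<Vertex> ALL_VERTICES = v -> true;
    static final Function<Vertex, List<Edge>> ALL_EDGES = v -> v.edges;

    // Applies the vertex filter first; the edge filter runs only for accepted vertices.
    static List<Vertex> load(List<Vertex> graph,
                             Predicate<Vertex> vertexFilter,
                             Function<Vertex, List<Edge>> edgeFilter) {
        List<Vertex> loaded = new ArrayList<>();
        for (Vertex v : graph) {
            if (!vertexFilter.test(v)) {
                continue; // vertex rejected, so its edges are never examined
            }
            loaded.add(new Vertex(v.label, edgeFilter.apply(v)));
        }
        return loaded;
    }

    // Rough analogue of outE('knows','friend').has('weight',gt(0.8)).
    static List<Edge> strongOutKnowsOrFriend(Vertex v) {
        return v.edges.stream()
                .filter(e -> e.outgoing)
                .filter(e -> e.label.equals("knows") || e.label.equals("friend"))
                .filter(e -> e.weight > 0.8)
                .collect(Collectors.toList());
    }

    // Tiny example graph: one person with four edges, one software vertex with one edge.
    static List<Vertex> sampleGraph() {
        Vertex person = new Vertex("person", Arrays.asList(
                new Edge("knows", true, 0.9),
                new Edge("knows", true, 0.5),
                new Edge("knows", false, 0.95),
                new Edge("created", true, 1.0)));
        Vertex software = new Vertex("software", Arrays.asList(
                new Edge("created", false, 1.0)));
        return Arrays.asList(person, software);
    }

    public static void main(String[] args) {
        List<Vertex> loaded = load(sampleGraph(),
                v -> v.label.equals("person"),
                GraphFilterSketch::strongOutKnowsOrFriend);
        for (Vertex v : loaded) {
            System.out.println(v.label + " with " + v.edges.size() + " edge(s)");
        }
    }
}
```

With the sample graph, only the person vertex is loaded, and it keeps the single outgoing 'knows' edge whose weight exceeds 0.8. Passing ALL_VERTICES and ALL_EDGES instead mirrors the identity default and loads everything.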