Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Implemented
Version: 6.1
Description
This ticket is to implement a distributed shortest path graph traversal as a Streaming Expression.
Expression syntax:
shortestPath(collection, from="john@company.com", to="jane@company.com", edge="from=to", threads="6", partitionSize="300", fq="limiting query", maxDepth="4")
The expression above performs a breadth-first search to find the shortest paths in an unweighted, directed graph. The search starts from the node john@company.com and searches for the node jane@company.com, traversing the edges by iteratively joining the from and to columns. Each level in the traversal is implemented as a parallel, partitioned nested loop join across the entire collection. The threads parameter controls the number of threads performing the join at each level. The partitionSize parameter controls the number of nodes in each join partition. The maxDepth parameter controls the number of levels to traverse. The fq parameter is a limiting query applied to each level in the traversal.
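To make the traversal concrete, here is a minimal in-memory sketch in Python of a level-by-level breadth-first search with partitioned, threaded joins. It is not the Solr implementation. The names shortest_paths, join_partition, expand and edge_filter are hypothetical, the edge list stands in for the collection, edge_filter stands in for fq, and max_depth is assumed to mean the maximum number of edges in a returned path.

```python
from concurrent.futures import ThreadPoolExecutor


def shortest_paths(edges, start, target, threads=6, partition_size=300,
                   max_depth=4, edge_filter=None):
    """Return every shortest path from start to target as a list of nodes.

    edges is a list of (from_node, to_node) tuples (hypothetical stand-in
    for the documents in a collection). Returns [] if target is not
    reachable within max_depth levels.
    """
    if start == target:
        return [[start]]

    def join_partition(partition):
        # Join one partition of frontier nodes against all edges:
        # keep edges whose "from" side is in the partition and that
        # pass the limiting filter (stand-in for fq).
        wanted = set(partition)
        return [(src, dst) for src, dst in edges
                if src in wanted
                and (edge_filter is None or edge_filter((src, dst)))]

    parents = {start: set()}  # node -> parents lying on a shortest path
    frontier = [start]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        for _ in range(max_depth):
            # Split the current level into partitions of at most
            # partition_size nodes and join them in parallel.
            partitions = [frontier[i:i + partition_size]
                          for i in range(0, len(frontier), partition_size)]
            next_level = {}
            for matches in pool.map(join_partition, partitions):
                for src, dst in matches:
                    if dst not in parents:  # skip nodes seen on earlier levels
                        next_level.setdefault(dst, set()).add(src)
            if not next_level:
                break  # nothing new reachable
            parents.update(next_level)
            if target in next_level:
                return expand(parents, start, target)
            frontier = sorted(next_level)
    return []


def expand(parents, start, node):
    """Rebuild all paths from start to node by walking parent links backwards."""
    if node == start:
        return [[start]]
    return [path + [node]
            for parent in sorted(parents[node])
            for path in expand(parents, start, parent)]
```

Calling shortest_paths(edges, "john@company.com", "jane@company.com") on a list of (from, to) pairs returns all paths of minimal length. Because the search stops at the first level that reaches the target, longer alternative paths are never returned.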
Future implementations can add more capabilities such as weighted traversals.