[TINKERPOP-1131] TraversalVertexProgram traverser management is inefficient memory-wise. - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 3.1.1-incubating
Fix Version/s: 3.2.0-incubating, 3.1.2-incubating
Component/s: process
Labels:
None

Description

The traversers arriving at a vertex in a given iteration are held in a TraverserSet (the message set). We iterate that set and attach each traverser to its respective local object (e.g. vertex, edge, property, etc.). This creates a toProcess TraverserSet. At this point, we have 2 sets of the same size! We NEVER clear the message set, and we then process the toProcess traversers to create an aliveTraversers set. Now, 3 sets! If an outE() yields millions of edges, you have 3 sets with millions of entries each (nasty!). We then set toProcess to aliveTraversers and keep doing this until the set is completely empty. (The sets empty out when a traverser needs to go to another vertex to keep processing, i.e. a message pass.)

So, to preserve memory, we need to "drain" the TraverserSets. That is, iterate and remove() each traverser as it is consumed, so that we don't create set clones, blow the heap, and cause (e.g.) SparkGraphComputer to spill memory to disk.
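To illustrate the difference between copying and draining, here is a minimal sketch using plain `java.util` collections. It is not the TinkerPop implementation: the class `DrainSketch`, the methods `copyAll` and `drain`, and the use of `String` elements in place of traversers are all hypothetical stand-ins chosen for the example.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Function;

// Hypothetical illustration only; names and types are assumptions, not TinkerPop API.
final class DrainSketch {

    // Copying style: the source stays fully populated while the target fills up,
    // so both sets hold every element at the same time.
    static <A, B> Set<B> copyAll(Set<A> source, Function<A, B> step) {
        Set<B> target = new LinkedHashSet<>();
        for (A element : source) {
            target.add(step.apply(element));
        }
        return target;
    }

    // Draining style: each element is removed from the source before its result
    // is added to the target, so the combined element count never exceeds the
    // original count.
    static <A, B> Set<B> drain(Set<A> source, Function<A, B> step) {
        Set<B> target = new LinkedHashSet<>();
        Iterator<A> iterator = source.iterator();
        while (iterator.hasNext()) {
            A element = iterator.next();
            iterator.remove(); // release the source's reference right away
            target.add(step.apply(element));
        }
        return target;
    }

    public static void main(String[] args) {
        Set<String> incoming = new LinkedHashSet<>(Arrays.asList("t1", "t2", "t3"));
        Set<String> toProcess = drain(incoming, t -> t + "@attached");
        System.out.println(incoming.size() + " left, " + toProcess.size() + " moved");
    }
}
```

With `copyAll`, the message set and the toProcess set would both be fully populated at once. With `drain`, the source is empty by the time the target is complete, which is the behaviour the description asks for.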

People

Assignee: Marko A. Rodriguez

Reporter: Marko A. Rodriguez

Votes: 0

Watchers: 2

Dates

Created: 05/Feb/16 18:45

Updated: 09/Feb/16 17:16

Resolved: 09/Feb/16 17:16