TinkerPop / TINKERPOP-1131

TraversalVertexProgram traverser management is inefficient memory-wise.

Details

    Description

      The traversers arriving at a vertex in a given iteration are held in a TraverserSet (the message set). We iterate that set and attach each traverser to its local object (e.g. vertex, edge, property), which creates a toProcess TraverserSet. At this point we have 2 sets of the same size! We NEVER clear the message set. We then process the toProcess traversers to create an aliveTraversers set. Now, 3 sets! If you have millions of edges on an outE(), you are holding three sets with millions of entries each (nasty!). We then set toProcess to aliveTraversers and keep doing this until the set is completely empty (it empties when a traverser needs to go to another vertex to keep processing, i.e. a message pass).

      So, to preserve memory we need to "drain" the TraverserSets. That is, iterate and remove() as we go, so that we don't create set clones, blow the heap, and cause (e.g.) SparkGraphComputer to spill to disk.
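      A minimal sketch of the difference between copying and draining, assuming plain java.util sets as stand-ins for TraverserSet. The class DrainSketch and the methods copyAll and drain are hypothetical names used only for illustration; this is not the TraversalVertexProgram code.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Function;

// Hypothetical illustration only: plain java.util sets stand in for TraverserSet.
public class DrainSketch {

    // Copying style: the source stays fully populated while the target fills up,
    // so both sets are at full size when the loop ends.
    public static <A, B> Set<B> copyAll(Set<A> source, Function<A, B> step) {
        Set<B> target = new LinkedHashSet<>();
        for (A item : source) {
            target.add(step.apply(item));
        }
        return target;
    }

    // Draining style: each element is removed from the source before it is
    // handed to the target, so the two sets never both hold the full contents.
    public static <A, B> Set<B> drain(Set<A> source, Function<A, B> step) {
        Set<B> target = new LinkedHashSet<>();
        Iterator<A> it = source.iterator();
        while (it.hasNext()) {
            A item = it.next();
            it.remove(); // drop the source's reference right away
            target.add(step.apply(item));
        }
        return target;
    }

    public static void main(String[] args) {
        Set<Integer> incoming = new LinkedHashSet<>();
        for (int i = 0; i < 1000; i++) {
            incoming.add(i);
        }
        // "Attach" is simulated by mapping each integer to a string.
        Set<String> toProcess = drain(incoming, i -> "t" + i);
        System.out.println(incoming.size() + " " + toProcess.size()); // prints "0 1000"
    }
}
```

      With drain, the combined number of live entries across both sets never exceeds the original size, whereas copyAll ends with twice that. Note that a java.util hash set does not shrink its backing table on remove(); the saving comes from the removed elements themselves becoming unreachable.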

          People

            okram Marko A. Rodriguez
