Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Fix Version/s: 3.1.0-incubating
Labels: None
Description
This works, but it's crazy to do for large data over non-random-access sources.
// g is a SparkGraphComputer traversal
gremlin> g.V().out().out()
==>v[3]
==>v[5]
gremlin>
Why is this crazy? Because for each vertex there is a graph.vertices(id) lookup which, for HadoopGraph, is a linear scan of the input format. That is nuts for massive graphs.
gremlin> g.V().out().out().toList().get(0).getClass()
==>class org.apache.tinkerpop.gremlin.hadoop.structure.HadoopVertex
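To make the cost concrete, here is a rough sketch (the attach helper is hypothetical, not TinkerPop's actual attachment code) of what each returned element effectively triggers:
import org.apache.tinkerpop.gremlin.structure.Graph
import org.apache.tinkerpop.gremlin.structure.Vertex

// Illustrative only: "attaching" a detached OLAP result means asking the graph
// for the element again by id.
Vertex attach(Graph graph, Object detachedId) {
    // HadoopGraph has no index to lean on, so this call streams the whole
    // input format to find one vertex -- a full linear pass per result.
    return graph.vertices(detachedId).next()
}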
In our docs, we should state that HadoopGraph should be used to generate reductions, not swathes of vertices. Or, if you need a vertex, don't get the vertex, get ONLY its ID.
gremlin> g.V().out().out().id()
==>3
==>5
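The reduction side of that advice might look like the following; the count of 2 simply follows from the two vertices shown above, and only the summary value comes back from the cluster:
gremlin> g.V().out().out().count()
==>2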
Finally, note that TraversalVertexProgram has a configuration option, gremlin.traversalVertexProgram.attachElements, that we never exposed to the user but should.
As we have it now, attachElements is always TRUE.
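Were it exposed, one would presumably flip it like any other gremlin.* key in the HadoopGraph properties file (hypothetical usage; the point of this issue is that there is currently no supported way to set it):
# hypothetical: keep OLAP results detached instead of re-attaching each one
gremlin.traversalVertexProgram.attachElements=false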