Description
Currently a graph is created only by adding vertices. The typical way is to read input text files line by line, with each line describing a vertex (its value, its edges, etc.). The current API allows a vertex to be created only if all the information for that vertex is available in a single line.
However, it's common to have graphs described in the form of edges. The edges of a single vertex might be spread across multiple lines of an input file, or even across multiple workers. The current API doesn't allow this: in the input superstep, a vertex must be created by a single worker.
Instead, it should be possible for multiple workers to mutate the graph during the input superstep.
This has the following implications:
1) Instead of just instantiating a vertex, a vertex reader should be able to issue vertex addition and edge addition requests.
2) Multiple workers might try to create the same vertex. Any conflicts should be handled with a VertexResolver, so the resolver has to be instantiated before load time.
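The two implications above can be illustrated with a minimal, self-contained sketch. It is not Giraph code: `InputSuperstepSketch`, `Mutation`, `Vertex`, `Resolver`, and the "first value wins" policy are all hypothetical names and choices made for illustration. The sketch assumes that readers emit add-vertex and add-edge requests, that requests are grouped by vertex id, and that a resolver created before loading merges the requests for each id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch, not the Giraph API: readers emit mutation requests and
// a resolver merges the requests that target the same vertex id.
public class InputSuperstepSketch {

    /** A request emitted by a reader: add a vertex, or add one out-edge. */
    public static final class Mutation {
        public final String vertexId;   // vertex the request applies to
        public final Double value;      // non-null only for vertex additions
        public final String edgeTarget; // non-null only for edge additions

        private Mutation(String vertexId, Double value, String edgeTarget) {
            this.vertexId = vertexId;
            this.value = value;
            this.edgeTarget = edgeTarget;
        }

        public static Mutation addVertex(String id, double value) {
            return new Mutation(id, value, null);
        }

        public static Mutation addEdge(String source, String target) {
            return new Mutation(source, null, target);
        }
    }

    /** The merged result for one vertex id. */
    public static final class Vertex {
        public final String id;
        public Double value; // stays null if no worker sent a vertex addition
        public final Set<String> edges = new TreeSet<>();

        Vertex(String id) {
            this.id = id;
        }
    }

    /** Hypothetical conflict handler, instantiated before any input is read. */
    public interface Resolver {
        Vertex resolve(String vertexId, List<Mutation> requests);
    }

    /** Assumed policy: keep the first value seen, union all requested edges. */
    public static final Resolver FIRST_VALUE_WINS = (vertexId, requests) -> {
        Vertex v = new Vertex(vertexId);
        for (Mutation m : requests) {
            if (m.edgeTarget != null) {
                v.edges.add(m.edgeTarget);
            } else if (v.value == null) {
                v.value = m.value;
            }
        }
        return v;
    };

    /** Groups the requests of all workers by vertex id, then resolves each group. */
    public static Map<String, Vertex> load(List<List<Mutation>> perWorker, Resolver resolver) {
        Map<String, List<Mutation>> byVertex = new TreeMap<>();
        for (List<Mutation> worker : perWorker) { // list order decides which request is "first"
            for (Mutation m : worker) {
                byVertex.computeIfAbsent(m.vertexId, k -> new ArrayList<>()).add(m);
            }
        }
        Map<String, Vertex> graph = new TreeMap<>();
        for (Map.Entry<String, List<Mutation>> e : byVertex.entrySet()) {
            graph.put(e.getKey(), resolver.resolve(e.getKey(), e.getValue()));
        }
        return graph;
    }

    /** Two workers both contribute to vertex "a"; only edge requests mention "b" as a source. */
    public static Map<String, Vertex> demo() {
        List<Mutation> workerA = Arrays.asList(
                Mutation.addVertex("a", 1.0), Mutation.addEdge("a", "b"));
        List<Mutation> workerB = Arrays.asList(
                Mutation.addVertex("a", 2.0), Mutation.addEdge("a", "c"), Mutation.addEdge("b", "a"));
        return load(Arrays.asList(workerA, workerB), FIRST_VALUE_WINS);
    }

    public static void main(String[] args) {
        for (Vertex v : demo().values()) {
            System.out.println(v.id + " value=" + v.value + " edges=" + v.edges);
        }
    }
}
```

In this sketch an edge request creates its source vertex if no worker added it explicitly, but does not create the target vertex. Whether missing endpoints should be created, and which value wins on a conflict, are exactly the decisions a resolver would have to encode.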
Attachments
Issue Links
- contains
  - GIRAPH-393 Number of input split threads should always be >= 1 (Resolved)
- is related to
  - GIRAPH-170 Workflow for loading RDF graph data into Giraph (Resolved)
  - GIRAPH-141 multigraph support in giraph (Resolved)