Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- 0.1.0
- None
- None
Description
Currently the vertex initialize() method is passed the complete adjacency list as a HashMap. All current concrete implementations of Vertex iterate over that adjacency list and build new data structures within the Vertex instance to hold and manipulate it. This will cease to be feasible once the adjacency list becomes very large.
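To make the concern concrete, here is a minimal sketch of the eager-copy pattern described above. The class `EagerCopyVertex`, its fields, and the `Map<Long, Float>` edge type are hypothetical and only illustrate the shape of the problem; they are not the actual Vertex implementations.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical vertex, used only to illustrate the eager-copy pattern.
public class EagerCopyVertex {
    private long vertexId;
    // Private structures rebuilt from the map handed to initialize().
    private final List<Long> targetIds = new ArrayList<>();
    private final List<Float> edgeValues = new ArrayList<>();

    // Receives the complete adjacency list at once (target vertex ID -> edge value).
    public void initialize(long vertexId, Map<Long, Float> edges) {
        this.vertexId = vertexId;
        // Every edge is visited and copied, so the whole list must fit in memory,
        // once in the caller's map and once in this vertex.
        for (Map.Entry<Long, Float> e : edges.entrySet()) {
            targetIds.add(e.getKey());
            edgeValues.add(e.getValue());
        }
    }

    public long getVertexId() {
        return vertexId;
    }

    public int getNumOutEdges() {
        return targetIds.size();
    }

    public static void main(String[] args) {
        Map<Long, Float> edges = new HashMap<>();
        edges.put(2L, 0.5f);
        edges.put(3L, 1.5f);

        EagerCopyVertex v = new EagerCopyVertex();
        v.initialize(1L, edges);
        System.out.println(v.getNumOutEdges());
    }
}
```

While the caller still holds the map, memory use is roughly twice the size of the adjacency list, which is what becomes a problem for very high-degree vertices.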
I propose storing the adjacency list and all vertex information (and possibly incoming messages?) in a distributed data store such as HBase. The adjacency list can then be lazily loaded via HBase scans. I was thinking of an HBase schema where the row ID is a concatenation of VertexID+OutboundVertexId, with a single column containing the edge.
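As a rough illustration of the row-key idea, the sketch below simulates the proposed table with an in-memory sorted map rather than the HBase client, so it stays self-contained. Everything in it is an assumption for illustration: the class `AdjacencyRowKeys`, the fixed-width zero-padded key encoding, and the string edge value. The padding is my addition so that a range scan for vertex 1 does not also pick up vertex 12; in HBase the same effect would come from a scan bounded by a start row and a stop row.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in for the proposed table; rows are kept sorted by key, as in HBase.
public class AdjacencyRowKeys {
    private static final int WIDTH = 19; // digits in Long.MAX_VALUE

    // Row key -> single "edge" column value.
    private final NavigableMap<String, String> table = new TreeMap<>();

    // Assumption: IDs are non-negative, so zero-padding makes string order match numeric order.
    public static String rowKey(long vertexId, long outboundVertexId) {
        return String.format("%019d%019d", vertexId, outboundVertexId);
    }

    public void putEdge(long vertexId, long outboundVertexId, String edgeValue) {
        table.put(rowKey(vertexId, outboundVertexId), edgeValue);
    }

    // Range scan over [vertexId|0, vertexId+1|0). The returned iterator walks a view
    // of the sorted map, so edges are read one at a time instead of being copied up front.
    // Assumption: vertexId is smaller than Long.MAX_VALUE.
    public Iterator<Map.Entry<String, String>> scanEdges(long vertexId) {
        String start = rowKey(vertexId, 0L);
        String stop = rowKey(vertexId + 1, 0L);
        return table.subMap(start, true, stop, false).entrySet().iterator();
    }

    // Recovers the outbound vertex ID from the second half of the row key.
    public static long outboundIdOf(String rowKey) {
        return Long.parseLong(rowKey.substring(WIDTH));
    }

    public static void main(String[] args) {
        AdjacencyRowKeys store = new AdjacencyRowKeys();
        store.putEdge(1L, 2L, "edge-1-2");
        store.putEdge(1L, 3L, "edge-1-3");
        store.putEdge(12L, 1L, "edge-12-1");

        Iterator<Map.Entry<String, String>> it = store.scanEdges(1L);
        while (it.hasNext()) {
            Map.Entry<String, String> row = it.next();
            System.out.println(outboundIdOf(row.getKey()) + " " + row.getValue());
        }
    }
}
```

With this layout, all edges of one vertex sit in a contiguous key range, so a vertex only needs to hold its scan position and not the full list.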
Issue Links
- is related to: GIRAPH-153 HBase/Accumulo Input and Output formats (Resolved)
- relates to: GIRAPH-94 Loading vertex ranges from HBase (Resolved)