Details
- Type: Improvement
- Status: Done
- Priority: Major
- Resolution: Done
Description
Currently, Storage has many components to implement, and most of them are specific to the HBase storage backend (e.g., WriteWriteConflictResolver).
Even though it is possible to add a new storage backend using the current abstraction, Storage, doing so requires a lot of work that is not common to all storage backends.
I am suggesting the following simple interfaces, which cover only what is common to any storage backend being integrated.
trait Fetcher {
  def init(config: Config)(implicit ec: ExecutionContext): Future[Fetcher]
  def fetches(queryRequests: Seq[QueryRequest], prevStepEdges: Map[VertexId, Seq[EdgeWithScore]])(implicit ec: ExecutionContext): Future[Seq[StepResult]]
  def close(): Unit
}

trait Mutator {
  def mutateVertex(zkQuorum: String, vertex: S2VertexLike, withWait: Boolean)(implicit ec: ExecutionContext): Future[MutateResponse]
  def mutateStrongEdges(zkQuorum: String, edges: Seq[S2EdgeLike], withWait: Boolean)(implicit ec: ExecutionContext): Future[Seq[Boolean]]
  def mutateWeakEdges(zkQuorum: String, edges: Seq[S2EdgeLike], withWait: Boolean)(implicit ec: ExecutionContext): Future[Seq[(Int, Boolean)]]
  def incrementCounts(zkQuorum: String, edges: Seq[S2EdgeLike], withWait: Boolean)(implicit ec: ExecutionContext): Future[Seq[MutateResponse]]
  def updateDegree(zkQuorum: String, edge: S2EdgeLike, degreeVal: Long = 0)(implicit ec: ExecutionContext): Future[MutateResponse]
  def deleteAllFetchedEdgesAsyncOld(stepInnerResult: StepResult, requestTs: Long, retryNum: Int)(implicit ec: ExecutionContext): Future[Boolean]
}
By abstracting queries and mutations behind the interfaces above, it becomes possible to implement a JDBCFetcher and a JDBCMutator that read and write vertices and edges in any JDBC-enabled storage, only by implementing these interfaces.
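To make the idea concrete, here is a minimal sketch of a backend that plugs in through a Fetcher-style trait. It is not the proposed API itself: the trait is reduced (no `init`, no `prevStepEdges`), the case classes are simplified stand-ins for the S2Graph types, and `InMemoryFetcher` is a hypothetical backend used only for illustration. A JDBC-backed version would presumably replace the map lookup with a query.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Simplified stand-ins for the S2Graph types named in the proposal
// (assumptions for this sketch, not the real classes).
final case class QueryRequest(vertexId: String, label: String)
final case class EdgeWithScore(from: String, to: String, label: String, score: Double)
final case class StepResult(edges: Seq[EdgeWithScore])

// Reduced version of the proposed Fetcher: init and prevStepEdges are omitted.
trait Fetcher {
  def fetches(queryRequests: Seq[QueryRequest])(implicit ec: ExecutionContext): Future[Seq[StepResult]]
  def close(): Unit
}

// Hypothetical backend that serves edges from an in-memory map
// keyed by (source vertex id, label name).
class InMemoryFetcher(store: Map[(String, String), Seq[EdgeWithScore]]) extends Fetcher {
  override def fetches(queryRequests: Seq[QueryRequest])(implicit ec: ExecutionContext): Future[Seq[StepResult]] =
    Future {
      // One StepResult per request, in request order; unknown keys yield no edges.
      queryRequests.map { req =>
        StepResult(store.getOrElse((req.vertexId, req.label), Seq.empty))
      }
    }

  // Nothing to release for an in-memory map.
  override def close(): Unit = ()
}

object InMemoryFetcherExample {
  def run(): Seq[StepResult] = {
    import ExecutionContext.Implicits.global
    val fetcher = new InMemoryFetcher(Map(
      ("alice", "Friends") -> Seq(EdgeWithScore("alice", "bob", "Friends", 1.0))
    ))
    val requests = Seq(QueryRequest("alice", "Friends"), QueryRequest("carol", "Friends"))
    try Await.result(fetcher.fetches(requests), 5.seconds)
    finally fetcher.close()
  }
}
```

The caller only sees `Fetcher`, so swapping the in-memory map for another store should not change the calling code.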
One thing to discuss is how we are going to maintain the information about which ServiceColumn/Label uses which storage implementation.
The naive solution would be to store the storage backend configuration in the ServiceColumn/Label's options field, which accepts JSON, and have the S2Graph instance maintain the mapping from each ServiceColumn/Label to its storage implementation.
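The mapping described above could look roughly like the sketch below. Everything here is an assumption made for illustration: `StorageBackend`, `LabelMeta`, `BackendRegistry`, and the `"storage"` option key are invented names, and the options are taken as an already-parsed flat map rather than raw JSON.

```scala
import scala.collection.concurrent.TrieMap

// Hypothetical marker for a storage backend; stands in for a Fetcher/Mutator pair.
trait StorageBackend { def name: String }
final case class NamedBackend(name: String) extends StorageBackend

// Assumed shape of label metadata: options already parsed from JSON into a flat map.
// The "storage" key is invented for this sketch.
final case class LabelMeta(labelName: String, options: Map[String, String])

// Keeps the label-to-backend mapping that the S2Graph instance would maintain.
class BackendRegistry(
    fallback: StorageBackend,
    factories: Map[String, Map[String, String] => StorageBackend]) {

  // Backends are built once per label name and then reused.
  private val cache = TrieMap.empty[String, StorageBackend]

  def backendFor(label: LabelMeta): StorageBackend =
    cache.getOrElseUpdate(
      label.labelName,
      // A missing or unrecognised "storage" option falls back to the default backend.
      label.options.get("storage")
        .flatMap(factories.get)
        .map(factory => factory(label.options))
        .getOrElse(fallback)
    )
}

object BackendRegistryExample {
  def run(): Seq[String] = {
    val registry = new BackendRegistry(
      fallback = NamedBackend("hbase"),
      factories = Map("jdbc" -> ((_: Map[String, String]) => NamedBackend("jdbc")))
    )
    Seq(
      LabelMeta("User", Map("storage" -> "jdbc")),
      LabelMeta("Friends", Map.empty)
    ).map(label => registry.backendFor(label).name)
  }
}
```

Falling back to a default keeps existing labels working without any options set, which is one possible answer to the open question rather than a settled design.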
I think the above abstraction makes it possible to use a different implementation for each ServiceColumn/Label and, more importantly, lets users provide their own implementation.
For example, it becomes possible to store `User` vertices in PostgreSQL and `Friends` edges in HBase. Also, users who do not want to use S2Graph for vertices do not need to store vertices at all; by implementing the `Fetcher` interface they can still traverse vertices as if they were stored in S2Graph.
I came up with this suggestion while working on S2GRAPH-206, since model serving requires a different `Fetcher` implementation per model.