  1. S2Graph
  2. S2GRAPH-213

Abstract Query/Mutation from Storage.

Details

    • Type: Improvement
    • Status: Done
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: s2core
    • Labels: None

    Description

      Currently, Storage has many components to implement, and most of them are specific to the HBase storage backend (e.g., WriteWriteConflictResolver).

      Even though it is possible to add a new storage backend using the current abstraction, Storage, too much of the required work is not common to all storage backends.

      I suggest the following simple interfaces, which are common to any storage backend to be integrated.

      trait Fetcher {
      
        def init(config: Config)(implicit ec: ExecutionContext): Future[Fetcher] 
        
        def fetches(queryRequests: Seq[QueryRequest],
                    prevStepEdges: Map[VertexId, Seq[EdgeWithScore]])(implicit ec: ExecutionContext): Future[Seq[StepResult]]
      
        def close(): Unit
      }
      
      trait Mutator {
        def mutateVertex(zkQuorum: String,
                         vertex: S2VertexLike,
                         withWait: Boolean)(implicit ec: ExecutionContext): Future[MutateResponse]

        def mutateStrongEdges(zkQuorum: String,
                              edges: Seq[S2EdgeLike],
                              withWait: Boolean)(implicit ec: ExecutionContext): Future[Seq[Boolean]]

        def mutateWeakEdges(zkQuorum: String,
                            edges: Seq[S2EdgeLike],
                            withWait: Boolean)(implicit ec: ExecutionContext): Future[Seq[(Int, Boolean)]]

        def incrementCounts(zkQuorum: String,
                            edges: Seq[S2EdgeLike],
                            withWait: Boolean)(implicit ec: ExecutionContext): Future[Seq[MutateResponse]]

        def updateDegree(zkQuorum: String,
                         edge: S2EdgeLike,
                         degreeVal: Long = 0)(implicit ec: ExecutionContext): Future[MutateResponse]

        def deleteAllFetchedEdgesAsyncOld(stepInnerResult: StepResult,
                                          requestTs: Long,
                                          retryNum: Int)(implicit ec: ExecutionContext): Future[Boolean]
      }
      

      By abstracting query/mutation as in the interfaces above, it becomes possible to implement a JDBCFetcher and a JDBCMutator that read/write vertices and edges in any JDBC-enabled storage by implementing only these interfaces.
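
As a rough illustration of how small a backend could become under this proposal, the sketch below implements a fetcher over an in-memory collection. To stay self-contained it uses simplified stand-ins (`SimpleQueryRequest`, `SimpleEdge`, `SimpleStepResult`, `SimpleFetcher`) instead of the real s2core types, and it omits `init` and `prevStepEdges`; all of these names are hypothetical and not part of S2Graph.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical stand-ins for the s2core types; not the real S2Graph classes.
final case class SimpleQueryRequest(srcVertexId: String, label: String)
final case class SimpleEdge(src: String, label: String, tgt: String, score: Double)
final case class SimpleStepResult(edges: Seq[SimpleEdge])

// Simplified version of the proposed Fetcher (no Config, no prevStepEdges).
trait SimpleFetcher {
  def fetches(queryRequests: Seq[SimpleQueryRequest])
             (implicit ec: ExecutionContext): Future[Seq[SimpleStepResult]]
  def close(): Unit
}

// A backend that serves edges from an in-memory collection.
// A JDBC-backed variant would issue a query per request instead of a map lookup.
final class InMemoryFetcher(edges: Seq[SimpleEdge]) extends SimpleFetcher {
  // Index edges by (source vertex, label) once at construction time.
  private val index: Map[(String, String), Seq[SimpleEdge]] =
    edges.groupBy(e => (e.src, e.label))

  override def fetches(queryRequests: Seq[SimpleQueryRequest])
                      (implicit ec: ExecutionContext): Future[Seq[SimpleStepResult]] =
    Future {
      // One result per request, in request order; unknown keys yield no edges.
      queryRequests.map { q =>
        SimpleStepResult(index.getOrElse((q.srcVertexId, q.label), Seq.empty))
      }
    }

  override def close(): Unit = () // nothing to release for an in-memory store
}

object InMemoryFetcherExample {
  def run(): Seq[SimpleStepResult] = {
    import ExecutionContext.Implicits.global
    val fetcher = new InMemoryFetcher(Seq(
      SimpleEdge("alice", "Friends", "bob", 1.0),
      SimpleEdge("alice", "Friends", "carol", 0.5),
      SimpleEdge("bob", "Friends", "alice", 1.0)
    ))
    try {
      Await.result(
        fetcher.fetches(Seq(
          SimpleQueryRequest("alice", "Friends"),
          SimpleQueryRequest("dave", "Friends")
        )),
        5.seconds
      )
    } finally fetcher.close()
  }
}
```

Calling `InMemoryFetcherExample.run()` returns one result per request: the two `Friends` edges for `alice`, then an empty result for the unknown vertex `dave`.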

      One thing to discuss is how we are going to maintain the information about which ServiceColumn/Label uses which storage implementation.

      The naive solution would be to store the storage backend configuration in the ServiceColumn/Label's options field, which accepts JSON, and have the S2Graph instance maintain the mapping from each ServiceColumn/Label to its storage implementation.

      I think the above abstraction makes it possible to use a different implementation per ServiceColumn/Label and, more importantly, lets users provide their own implementation.

      For example, it would be possible to store `User` vertices in PostgreSQL and `Friends` edges in HBase. Also, users who do not want to use S2Graph for vertices do not need to store vertices at all; by implementing the `Fetcher` interface they can still traverse vertices as if they were stored in S2Graph.
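
The naive per-label mapping, applied to the `User`/`Friends` example above, might look roughly like the following sketch. It assumes the `options` JSON has already been parsed into a flat `Map[String, String]` and that a hypothetical `storage.type` key names the backend. `Backend`, `StorageRegistry`, the option keys, and the connection strings are illustrative only and not part of S2Graph.

```scala
// Hypothetical marker for a storage backend; stands in for a Fetcher/Mutator pair.
trait Backend { def name: String }
final case class HBaseBackend(zkQuorum: String) extends Backend { val name = "hbase" }
final case class JdbcBackend(url: String) extends Backend { val name = "jdbc" }

// Maps each ServiceColumn/Label name to the backend that serves it.
// Not thread-safe; a real registry would need synchronization.
final class StorageRegistry(default: Backend) {
  // Factories keyed by the assumed "storage.type" option value.
  private val factories: Map[String, Map[String, String] => Backend] = Map(
    "hbase" -> (opts => HBaseBackend(opts.getOrElse("zkQuorum", "localhost"))),
    "jdbc"  -> (opts => JdbcBackend(opts.getOrElse("url", "")))
  )

  private var byLabel: Map[String, Backend] = Map.empty

  // Register a label from its already-parsed options.
  // Labels without a known "storage.type" keep using the default backend.
  def register(label: String, options: Map[String, String]): Unit = {
    val backend = options.get("storage.type").flatMap(factories.get).map(f => f(options))
    backend.foreach(b => byLabel = byLabel.updated(label, b))
  }

  def backendFor(label: String): Backend = byLabel.getOrElse(label, default)
}

object StorageRegistryExample {
  def run(): Seq[String] = {
    val registry = new StorageRegistry(default = HBaseBackend("localhost"))
    registry.register("User",
      Map("storage.type" -> "jdbc", "url" -> "jdbc:postgresql://localhost/s2graph"))
    registry.register("Friends",
      Map("storage.type" -> "hbase", "zkQuorum" -> "zk1:2181"))
    // "Unknown" was never registered, so it falls back to the default backend.
    Seq("User", "Friends", "Unknown").map(l => registry.backendFor(l).name)
  }
}
```

Falling back to a default backend keeps existing labels working unchanged, so only labels that opt in through their options would be routed elsewhere.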

      I came up with this suggestion while working on S2GRAPH-206, since model serving requires a different `Fetcher` implementation per model.

            People

              Assignee: steamshon Do Yung Yoon
              Reporter: steamshon Do Yung Yoon
              Votes: 0
              Watchers: 2

                Time Tracking

                  Estimated: 336h
                  Remaining: 336h
                  Logged: Not Specified