Currently, AsynchbaseStorage, which implements the Storage trait on top of HBase, applies a squash optimization to snapshot edges in mutateEdges.
For example, if the requests on the same snapshotEdge are (insert, delete, insert, delete, insert) in timestamp order, then we only need to apply the last insert.
ex) Assume there are 5 requests on the edge from “shon” to “dun” with label “friend”: insert(t0), delete(t1), insert(t2), delete(t3), insert(t4).
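The example above can be sketched in a few lines of Scala. This is a simplified model, not the actual AsynchbaseStorage code: `EdgeRequest`, `Op`, and `squash` are hypothetical names, and it assumes that the request with the greatest timestamp fully overrides the earlier ones on the same snapshot edge.

```scala
object SquashExample {
  // Hypothetical operation type; the real request model may carry more information.
  sealed trait Op
  case object Insert extends Op
  case object Delete extends Op

  // Hypothetical request: one mutation on the snapshot edge (src, tgt, label) at time ts.
  final case class EdgeRequest(src: String, tgt: String, label: String, op: Op, ts: Long)

  // Simplified squash: among requests on the same snapshot edge,
  // only the one with the greatest timestamp needs to be applied.
  def squash(requests: Seq[EdgeRequest]): Option[EdgeRequest] =
    if (requests.isEmpty) None else Some(requests.maxBy(_.ts))

  // The five requests from the example, with t0..t4 modeled as 0..4.
  val requests: Seq[EdgeRequest] = Seq(
    EdgeRequest("shon", "dun", "friend", Insert, 0L),
    EdgeRequest("shon", "dun", "friend", Delete, 1L),
    EdgeRequest("shon", "dun", "friend", Insert, 2L),
    EdgeRequest("shon", "dun", "friend", Delete, 3L),
    EdgeRequest("shon", "dun", "friend", Insert, 4L)
  )
}
```

Calling `SquashExample.squash(SquashExample.requests)` leaves only insert(t4), so one mutation is applied instead of five.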
Without squashing requests on the same snapshot edge in memory, the following 5 actions need to be done for every request.
1. fetch the snapshot edge.
2. lock this snapshot edge.
3. build new update, delete, insert, and degree mutations on IndexEdges.
4. fire the above update/delete/insert/degree mutations into HBase.
5. release the lock on this snapshotEdge.
With squashing, we can do steps 1, 2, and 5 (fetch, lock, release lock) only once and squash the mutations built from the multiple requests into a single set.
The logic above is needed to keep multiple indexEdges consistent: its purpose is to let only one thread at a time mutate a given snapshot edge (note the lock). Since we acquire a lock on this edge anyway, it is much more efficient to squash multiple requests on the same snapshotEdge and avoid repeating the heavy operations: fetch, lock, and release lock.
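To make the saving concrete, here is a minimal sketch that counts the heavy operations with and without squashing. It is only an in-memory stand-in: `FakeStorage`, `SnapshotKey`, `Request`, `mutateOneByOne`, and `mutateSquashed` are hypothetical names, no HBase call is made, and building the IndexEdge mutations is reduced to recording which request was applied.

```scala
object GroupedMutateExample {
  // Hypothetical key identifying one snapshot edge.
  final case class SnapshotKey(src: String, tgt: String, label: String)
  final case class Request(key: SnapshotKey, op: String, ts: Long)

  // In-memory stand-in that only counts the heavy per-edge operations.
  final class FakeStorage {
    var fetches = 0
    var locks = 0
    var releases = 0
    var applied = Vector.empty[Request]

    // Fetch and lock the snapshot edge, run the mutation, then release the lock.
    private def withLock(key: SnapshotKey)(body: => Unit): Unit = {
      fetches += 1
      locks += 1
      try body
      finally releases += 1
    }

    // Without squashing: every request pays for fetch, lock, and release.
    def mutateOneByOne(requests: Seq[Request]): Unit =
      requests.sortBy(_.ts).foreach(r => withLock(r.key) { applied :+= r })

    // With squashing: group by snapshot edge, then fetch, lock, and release
    // once per edge and apply only the latest request of each group.
    def mutateSquashed(requests: Seq[Request]): Unit =
      requests.groupBy(_.key).foreach { case (key, group) =>
        withLock(key) { applied :+= group.maxBy(_.ts) }
      }
  }
}
```

For the five requests on the “shon” to “dun” edge, `mutateOneByOne` performs five fetches, five locks, and five releases, while `mutateSquashed` performs one of each and applies only insert(t4).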