In this case, whenever a new LedgerStorage implementation comes in, it has to re-define the checkpointing algorithm again. IMHO, instead of this, can we think of an approach that decouples the checkpointing algorithm from the interleaved storage? The Bookie can own the checkpointing logic and control it. With this approach the Bookie will have more control over checkpointing irrespective of the plugged-in ledger storage. How does that sound? Sijie Guo, are you thinking along similar lines?
If we want the LedgerStorage to control when checkpointing should occur, then the LedgerStorage has to run the checkpoint itself. Otherwise you have coupled the LedgerStorage to the Bookie.SyncThread. There's no problem with breaking the sync thread out into a separate class, which multiple LedgerStorage implementations can use, but it should be owned by the LedgerStorage.
1) you moved LogMark to ledger storage, which makes the behavior of the journal constructor "Journal(conf, logmark)" unclear,
This should be better. The journal should just be constructed with Journal(conf). LastSyncedLogMark should only come into play for Journal#replay(JournalScanner) which should become Journal#replay(LogMark from, JournalScanner).
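To make the proposed shape concrete, here is a minimal in-memory sketch, not the actual BookKeeper code. The names SketchLogMark, SketchJournalScanner and SketchJournal are hypothetical stand-ins, the configuration argument is elided, and I'm assuming the mark points at the first entry not yet covered by a checkpoint.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a log mark: a position in the journal.
final class SketchLogMark implements Comparable<SketchLogMark> {
    final long logFileId;
    final long logFileOffset;

    SketchLogMark(long logFileId, long logFileOffset) {
        this.logFileId = logFileId;
        this.logFileOffset = logFileOffset;
    }

    @Override
    public int compareTo(SketchLogMark other) {
        int byFile = Long.compare(logFileId, other.logFileId);
        return byFile != 0 ? byFile : Long.compare(logFileOffset, other.logFileOffset);
    }
}

// Hypothetical scanner callback, invoked once per replayed entry.
interface SketchJournalScanner {
    void process(SketchLogMark position, String entry);
}

// Hypothetical journal: constructed without any log mark.
final class SketchJournal {
    private final List<SketchLogMark> positions = new ArrayList<>();
    private final List<String> entries = new ArrayList<>();

    // In the proposal this would be Journal(conf); the conf is elided in this sketch.
    SketchJournal() {}

    void append(long logFileId, long logFileOffset, String entry) {
        positions.add(new SketchLogMark(logFileId, logFileOffset));
        entries.add(entry);
    }

    // The mark only matters here: replay every entry at or after `from`.
    int replay(SketchLogMark from, SketchJournalScanner scanner) {
        int replayed = 0;
        for (int i = 0; i < entries.size(); i++) {
            if (positions.get(i).compareTo(from) >= 0) {
                scanner.process(positions.get(i), entries.get(i));
                replayed++;
            }
        }
        return replayed;
    }
}
```

The caller (in this proposal, whoever holds the last synced mark) passes it in at replay time, so the journal itself never needs to know where the mark is stored.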
sync thread (checkpointing) logic should be maintained by Bookie itself
I strongly disagree with this because...
as the sync(checkpointing) logic is part of bookie not ledger storage
...all the logic to do the checkpoint is in the LedgerStorage. The decision to make the checkpoint is taken from within the ledger storage. So this is false. The logic is part of ledger storage.
it should be common across different ledger storage implementations.
It can be broken out into a different class which can be shared by different implementations. It should be owned by the ledger storage though.
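A rough sketch of what I mean by "shared but owned", under simplifying assumptions. SharedCheckpointerSketch and InterleavedStorageSketch are hypothetical names, the entry-count threshold is just an illustrative trigger, and everything runs on the caller's thread rather than a real sync thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reusable helper: knows how to run a checkpoint, not when.
final class SharedCheckpointerSketch {
    private final Runnable flushAction;
    private int checkpointsRun = 0;

    SharedCheckpointerSketch(Runnable flushAction) {
        this.flushAction = flushAction;
    }

    void checkpoint() {
        flushAction.run();
        checkpointsRun++;
    }

    int checkpointsRun() {
        return checkpointsRun;
    }
}

// Hypothetical ledger storage: it owns the checkpointer and decides when to trigger it.
final class InterleavedStorageSketch {
    private final List<String> unflushed = new ArrayList<>();
    private final List<String> flushed = new ArrayList<>();
    private final int flushThreshold;
    private final SharedCheckpointerSketch checkpointer =
            new SharedCheckpointerSketch(this::flush);

    InterleavedStorageSketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    void addEntry(String entry) {
        unflushed.add(entry);
        // The decision to checkpoint is taken inside the storage (assumed trigger: entry count).
        if (unflushed.size() >= flushThreshold) {
            checkpointer.checkpoint();
        }
    }

    private void flush() {
        flushed.addAll(unflushed);
        unflushed.clear();
    }

    int flushedCount() {
        return flushed.size();
    }

    int checkpointsRun() {
        return checkpointer.checkpointsRun();
    }
}
```

Another LedgerStorage implementation could reuse SharedCheckpointerSketch with its own flush action and its own trigger, and nothing in it refers to the Bookie.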
1) Making LogMark a part of the journal would make Journal clearer on the replay behaviour.
The log mark is dependent on the ledger storage and only means anything in the context of the ledger storage. It should only be stored when a checkpoint has occurred. This means that the ledger storage is what decides which log mark to store. If the journal is storing the mark, the ledger storage is triggering behaviour on the journal. Again, this is another piece that could be broken out into a separate class to be used by multiple ledger storage implementations, but it should remain owned by the ledger storage.
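As an illustration of that ordering only: a minimal sketch in which the storage captures the mark, flushes, and only then persists the mark through a small helper it owns. MarkStoreSketch and CheckpointingStorageSketch are hypothetical names, and persistence is simulated with in-memory fields rather than a file on disk.

```java
// Hypothetical helper that persists the last checkpointed mark; owned by the storage.
final class MarkStoreSketch {
    private long logFileId = 0;
    private long logFileOffset = 0;

    void persist(long logFileId, long logFileOffset) {
        this.logFileId = logFileId;
        this.logFileOffset = logFileOffset;
    }

    long logFileId() {
        return logFileId;
    }

    long logFileOffset() {
        return logFileOffset;
    }
}

// Hypothetical ledger storage that decides which mark to store, and when.
final class CheckpointingStorageSketch {
    private final MarkStoreSketch markStore = new MarkStoreSketch();
    private long currentLogFileId = 0;
    private long currentLogFileOffset = 0;
    private int unflushedEntries = 0;

    // Records the journal position reached after this entry; nothing is persisted yet.
    void addEntry(long journalLogFileId, long journalOffsetAfterEntry) {
        currentLogFileId = journalLogFileId;
        currentLogFileOffset = journalOffsetAfterEntry;
        unflushedEntries++;
    }

    void checkpoint() {
        // Capture the mark before flushing, so it never covers unflushed entries.
        long markId = currentLogFileId;
        long markOffset = currentLogFileOffset;
        flush();
        // Store the mark only once the flush has completed.
        markStore.persist(markId, markOffset);
    }

    private void flush() {
        unflushedEntries = 0; // stand-in for writing entries and indexes to disk
    }

    int unflushedEntries() {
        return unflushedEntries;
    }

    MarkStoreSketch markStore() {
        return markStore;
    }
}
```

The journal is never asked to store anything here; it would only be handed the persisted mark when a replay is needed.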
To reiterate, these changes need to be made so that the ledger storage can be benchmarked in a way in which it behaves the same as it does when running under a bookie.