Description
Currently we double-write any DB-backed stores into a Snapshot struct when creating a Snapshot. This inflates the size of the Snapshot, which is already a problem for large production clusters (see AURORA-74).
Example for LockStore from https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java:
new SnapshotField() {
  // It's important for locks to be replayed first, since there are relations that expect
  // references to be valid on insertion.
  @Override
  public void saveToSnapshot(MutableStoreProvider store, Snapshot snapshot) {
    snapshot.setLocks(ILock.toBuildersSet(store.getLockStore().fetchLocks()));
  }

  @Override
  public void restoreFromSnapshot(MutableStoreProvider store, Snapshot snapshot) {
    if (hasDbSnapshot(snapshot)) {
      LOG.info("Deferring lock restore to dbsnapshot");
      return;
    }

    store.getLockStore().deleteLocks();

    if (snapshot.isSetLocks()) {
      for (Lock lock : snapshot.getLocks()) {
        store.getLockStore().saveLock(ILock.build(lock));
      }
    }
  }
},
The saveToSnapshot call here is entirely redundant, since the whole H2 database is already dumped into the dbScript field.
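A minimal sketch of one way to drop the redundant write, assuming a hypothetical useDbSnapshot flag (not in the current code): when the entire database is being written to dbScript, the field-level save becomes a no-op, mirroring the hasDbSnapshot check that restore already performs.

@Override
public void saveToSnapshot(MutableStoreProvider store, Snapshot snapshot) {
  // Hypothetical flag: when the entire H2 database is dumped into dbScript,
  // skip the redundant per-store copy of the lock data.
  if (useDbSnapshot) {
    return;
  }
  snapshot.setLocks(ILock.toBuildersSet(store.getLockStore().fetchLocks()));
}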
Note: one major side effect is that anyone reading these snapshots and using the data outside of Java would lose the ability to process it without applying the DB script.
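To illustrate the note above, a rough sketch of what an external consumer sees when inspecting a deserialized Snapshot; the generated Thrift class org.apache.aurora.gen.storage.Snapshot and the standard Thrift accessors for the dbScript field are assumed here.

import org.apache.aurora.gen.storage.Snapshot;

// Sketch of a non-scheduler consumer inspecting a deserialized Snapshot.
final class SnapshotInspector {
  static void describe(Snapshot snapshot) {
    if (snapshot.isSetLocks()) {
      // Structured Thrift data: usable from any language with Thrift bindings.
      System.out.println("locks: " + snapshot.getLocks().size());
    }
    if (snapshot.isSetDbScript()) {
      // Raw H2 SQL dump: only usable by replaying it against an H2 database,
      // which effectively ties consumers to the scheduler's Java/H2 stack.
      System.out.println("snapshot carries a dbScript dump");
    }
  }
}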
Issue Links
- relates to AURORA-1870: Add finer grained timings to the Snapshot process (Resolved)