ZK guarantees that txns are flushed to disk in order, and we already batch the flushes to improve disk IO efficiency and throughput. However, ACKs are still sent back one by one, which is inefficient. Instead, we can send the leader a single ACK for the last flushed txn in each batch.
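A minimal sketch of the learner-side idea, assuming a hypothetical `BatchAckSender` class (not an existing ZooKeeper class) and a caller-supplied `sendAck` callback that stands in for the real network send. The actual disk flush is omitted and only marked by a comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongConsumer;

/** Hypothetical helper: buffers txns and ACKs only the last one per flush. */
final class BatchAckSender {
    private final List<Long> unflushed = new ArrayList<>();
    private final LongConsumer sendAck; // assumed stand-in for sending an ACK to the leader

    BatchAckSender(LongConsumer sendAck) {
        this.sendAck = sendAck;
    }

    /** Queues a txn that has been appended to the log but not yet flushed. */
    void append(long zxid) {
        unflushed.add(zxid);
    }

    /** Flushes the batch and sends one ACK for the highest flushed zxid. */
    void flush() {
        if (unflushed.isEmpty()) {
            return;
        }
        // A real implementation would flush the txn log to disk here.
        long lastFlushed = unflushed.get(unflushed.size() - 1);
        unflushed.clear();
        sendAck.accept(lastFlushed); // one ACK covers the whole batch
    }
}
```

With txns 1, 2 and 3 appended before a flush, this sends a single ACK for 3 instead of three separate ACKs.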
On the leader, receiving the ACK for txn N implies, by the flush-ordering guarantee, that all txns before N have been flushed to disk as well, so they are all ACKed. The leader can then maintain a (SID -> last ACKed ZXID) map to calculate the latest COMMIT ZXID and send that to all learners.
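One way the leader-side calculation could look, as a sketch only. It assumes a fixed voter set and a simple majority quorum, which ignores hierarchical quorums and dynamic reconfig. `BatchAckTracker` and `onAck` are hypothetical names, not existing ZooKeeper APIs. The COMMIT ZXID is taken as the highest ZXID that a majority of voters have ACKed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: tracks the last ACKed zxid per server id (SID). */
final class BatchAckTracker {
    private final Set<Long> voters;
    private final Map<Long, Long> lastAcked = new HashMap<>();
    private long lastCommitted = 0L;

    BatchAckTracker(Set<Long> voters) {
        this.voters = new HashSet<>(voters);
    }

    /**
     * Records a batched ACK from a voter.
     * Returns the new COMMIT zxid, or -1 if the commit point did not advance.
     */
    synchronized long onAck(long sid, long zxid) {
        if (!voters.contains(sid)) {
            return -1L; // assumption: only voters count toward the quorum
        }
        // Keep the highest zxid seen, so a stale ACK cannot move a voter backwards.
        lastAcked.merge(sid, zxid, Math::max);

        // Voters that have not ACKed anything yet count as zxid 0.
        List<Long> acked = new ArrayList<>();
        for (long voter : voters) {
            acked.add(lastAcked.getOrDefault(voter, 0L));
        }
        acked.sort(Collections.reverseOrder());

        // With n voters a majority is n/2 + 1, so the entry at index n/2
        // (descending order) has been ACKed by a majority.
        long candidate = acked.get(voters.size() / 2);
        if (candidate > lastCommitted) {
            lastCommitted = candidate;
            return candidate;
        }
        return -1L;
    }
}
```

With three voters, an ACK for 5 from one server alone does not advance the commit point. Once a second server ACKs 3, the commit point becomes 3, because 3 is the highest ZXID flushed on a majority.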
By the same ordering guarantee, when a learner receives the COMMIT for txn N, all txns before N have been committed as well.
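On the learner, a batched COMMIT could be applied roughly as follows. This is a sketch with hypothetical names (`BatchCommitApplier`, `onProposal`, `onCommit`), assuming proposals arrive in increasing ZXID order and are held in a FIFO queue until committed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical helper: commits every pending txn up to the COMMIT zxid. */
final class BatchCommitApplier {
    // Pending proposals in arrival order (assumed to be increasing zxid order).
    private final Deque<Long> pending = new ArrayDeque<>();

    void onProposal(long zxid) {
        pending.addLast(zxid);
    }

    /** Returns the zxids committed by this COMMIT, in order. */
    List<Long> onCommit(long commitZxid) {
        List<Long> committed = new ArrayList<>();
        // COMMIT for txn N implies every earlier pending txn is committed too.
        while (!pending.isEmpty() && pending.peekFirst() <= commitZxid) {
            committed.add(pending.pollFirst());
        }
        return committed;
    }
}
```

A duplicate or older COMMIT simply finds nothing left to commit and returns an empty list.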
The main benefits of this feature are reduced memory pressure, GC, and quorum communication overhead on all servers, and reduced lock contention on the leader when processing ACK, COMMIT, etc.
Overall, this will improve the efficiency of ZK, and we expect it to support higher throughput for write traffic.
The main challenge of this work is keeping it backward compatible and safe for gradual rollout, while making sure it does not affect the correctness/durability of txns during dynamic reconfig.