We have delete bloom filters to avoid seeks to the beginning of the row to check for family delete markers. Those would no longer work, or in other words we'd need bloom filters for the undelete markers.
Hmm, I wasn't familiar with these, but, yes, sounds like we would unfortunately need another set of bloom filters for the undeletes. There would also need to be some change to the use of the delete bloom filters if operations were ordered by seqid, wouldn't there?
Will we ever need to undo an undelete?
Yes, I thought of this as well. What undoes the undeletes? In a sense this shifts the irreversible operation to the undelete. I think this is an inherent problem with the approach of sorting operations within the same timestamp by operation type. There is always going to be something that sorts first, and that operation effectively cannot be undone.
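To make the problem concrete, here is a toy sketch of type-ordered resolution within a single timestamp. It is not HBase's comparator: the `Op` tuple, the `TYPE_ORDER` precedence, and `read_type_ordered` are hypothetical names invented for illustration, and the model covers only one cell.

```python
from typing import Iterable, NamedTuple, Optional

# Assumed precedence within one timestamp: lower value sorts first.
TYPE_ORDER = {"undelete": 0, "delete": 1, "put": 2}


class Op(NamedTuple):
    kind: str                      # "put", "delete", or "undelete"
    ts: int                        # cell timestamp
    value: Optional[str] = None    # only meaningful for puts


def read_type_ordered(ops: Iterable[Op], ts: int) -> Optional[str]:
    """Return the visible value at ts when same-timestamp ops sort by type."""
    same_ts = sorted((op for op in ops if op.ts == ts),
                     key=lambda op: TYPE_ORDER[op.kind])
    undeleted = False
    deleted = False
    for op in same_ts:
        if op.kind == "undelete":
            undeleted = True
        elif op.kind == "delete" and not undeleted:
            deleted = True         # a delete only counts if no undelete masks it
        elif op.kind == "put":
            return None if deleted else op.value
    return None


log = [Op("put", 10, "v1"), Op("delete", 10)]
before = read_type_ordered(log, 10)        # None: the delete masks the put
log.append(Op("undelete", 10))
restored = read_type_ordered(log, 10)      # "v1": the undelete masks the delete
log.append(Op("delete", 10))               # try to undo the undelete
after = read_type_ordered(log, 10)         # still "v1": the new delete is masked too
```

In this model, arrival order never matters, so once an undelete exists at a timestamp, no later delete at that timestamp can take effect.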
how does this fit into the discussion about using sequence numbers to order operations?
I think the discussion about ordering operations by seqid is simply an alternate approach to some of the same underlying problems. For the issue mentioned above, sorting by seqid is conceptually simpler: there is no need for a separate undelete operation, and undoing a delete is possible by issuing a new put for the previous value at the same timestamp as the previous delete. Of course, that means that the final outcome depends on the server-observed ordering of the operations. But at the same time, there is no "irreversible" operation.
For the use case I'm considering (a single client rolling back its previously persisted changes at a given timestamp), ordering by seqid would also work. Since it's a single client rolling back its own operations, the operations can be ordered by the client, so there's no lack of determinism in server-side ordering. Rolling back a delete with seqid ordering would be slightly more complicated, though, since the client would have to perform a read to find the previous value prior to the delete, then issue a new put, instead of simply issuing an undelete with the same parameters as the prior delete.
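As a rough illustration of the seqid-ordered rollback, here is a toy model of one cell where the newest seqid wins within a timestamp. `ToyCell` and its methods are hypothetical, not HBase client API, and the sketch assumes the deleted value is still readable by the client and lives at the same timestamp as the delete.

```python
from typing import List, NamedTuple, Optional


class Op(NamedTuple):
    seqid: int                     # server-assigned order of arrival
    kind: str                      # "put" or "delete"
    ts: int                        # cell timestamp
    value: Optional[str] = None


class ToyCell:
    """Operation log for one cell; the newest seqid wins within a timestamp."""

    def __init__(self) -> None:
        self.ops: List[Op] = []
        self._next_seqid = 1

    def _append(self, kind: str, ts: int, value: Optional[str] = None) -> None:
        self.ops.append(Op(self._next_seqid, kind, ts, value))
        self._next_seqid += 1

    def put(self, ts: int, value: str) -> None:
        self._append("put", ts, value)

    def delete(self, ts: int) -> None:
        self._append("delete", ts)

    def read(self, ts: int) -> Optional[str]:
        same_ts = [op for op in self.ops if op.ts == ts]
        if not same_ts:
            return None
        latest = max(same_ts, key=lambda op: op.seqid)
        return latest.value if latest.kind == "put" else None

    def rollback_delete(self, ts: int) -> None:
        """Undo the newest delete at ts: one extra read, then a new put."""
        deletes = [op for op in self.ops if op.ts == ts and op.kind == "delete"]
        if not deletes:
            return
        last_delete = max(deletes, key=lambda op: op.seqid)
        # The extra read: find the newest put that the delete masked.
        masked = [op for op in self.ops
                  if op.ts == ts and op.kind == "put" and op.seqid < last_delete.seqid]
        if masked:
            self.put(ts, max(masked, key=lambda op: op.seqid).value)


cell = ToyCell()
cell.put(10, "v1")
cell.delete(10)
deleted_view = cell.read(10)       # None
cell.rollback_delete(10)
restored_view = cell.read(10)      # "v1"
cell.delete(10)                    # the rollback itself can be undone
final_view = cell.read(10)         # None
```

Nothing here is irreversible, but the rollback costs a read before the write, which is the extra complexity compared with a blind undelete.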
Or you could combine both approaches and order operations by seqid, but add an undelete operation as well. The undelete would then sort prior to the delete by virtue of being issued after it. It would still be a no-read operation. And the undelete could be undone by issuing a new delete. Maybe this combination ultimately winds up being best.
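A minimal sketch of how the combined approach might resolve a read, scanning one timestamp from newest to oldest seqid. The names are hypothetical, and it assumes an undelete masks every older delete at the same timestamp (the proposal does not pin this down).

```python
from typing import Iterable, NamedTuple, Optional


class Op(NamedTuple):
    seqid: int                     # server-assigned order of arrival
    kind: str                      # "put", "delete", or "undelete"
    ts: int                        # cell timestamp
    value: Optional[str] = None


def read_combined(ops: Iterable[Op], ts: int) -> Optional[str]:
    """Resolve the visible value at ts, walking newest seqid first."""
    skip_deletes = False
    newest_first = sorted((op for op in ops if op.ts == ts),
                          key=lambda op: op.seqid, reverse=True)
    for op in newest_first:
        if op.kind == "undelete":
            skip_deletes = True    # assumption: masks all older deletes at ts
        elif op.kind == "delete":
            if not skip_deletes:
                return None        # an unmasked delete hides everything older
        elif op.kind == "put":
            return op.value
    return None


log = [Op(1, "put", 10, "v1"), Op(2, "delete", 10)]
step1 = read_combined(log, 10)             # None
log.append(Op(3, "undelete", 10))          # no read needed to issue this
step2 = read_combined(log, 10)             # "v1"
log.append(Op(4, "delete", 10))            # undo the undelete
step3 = read_combined(log, 10)             # None
```

Because the newest operation always wins, both the delete and the undelete stay reversible, and the client never has to read the old value first.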
Thanks for the comments. I'm open to however we can most efficiently solve this with the least amount of added complexity.