Nick Wolf reported an error he hit on his cluster where transactions were assigned out-of-order timestamps. After looking at his WALs for a while, I realized we have this bug:
- the clock gets an update from another node a second in the future
- we generate a lot of timestamps locally, which causes the clock to increment its logical portion
- the logical portion exceeds the number of bits allocated to it (12 bits iirc). Given the way we return timestamps, this is OK on its own: we simply add the logical and physical portions, so the overflow "wraps" into the physical portion, moving it forward a few microseconds as we do thousands of writes.
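- however, our host clock eventually reaches a higher value than the last stored physical portion. This causes us to reset the logical portion back to 1, which may actually rewind the clock: the overflow had already pushed the returned timestamps a few microseconds past the stored physical value, so the first timestamp after the reset can be lower than ones we already handed out.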
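
To make the sequence above concrete, here is a minimal toy model of it in Python. It is only a sketch under assumptions, not the real clock code: `BuggyHybridClock`, `update`, `now`, and `demo` are hypothetical names, the 12-bit logical width is taken from the "iirc" above, and the host time is passed in by hand so the scenario is deterministic.

```python
LOGICAL_BITS = 12  # assumed width of the logical portion ("12 bits iirc")


class BuggyHybridClock:
    """Toy model of the behaviour described above (hypothetical, simplified)."""

    def __init__(self):
        self.last_physical = 0  # stored physical portion, in microseconds
        self.next_logical = 0   # logical counter, never bounds-checked

    def update(self, remote_physical):
        # Adopt a physical value from another node if it is ahead of ours.
        if remote_physical > self.last_physical:
            self.last_physical = remote_physical
            self.next_logical = 1

    def now(self, host_physical):
        if host_physical > self.last_physical:
            # Host clock has passed the stored physical portion:
            # reset the logical counter. This is the step that can rewind.
            self.last_physical = host_physical
            self.next_logical = 1
            return host_physical << LOGICAL_BITS
        # Otherwise hand out the next logical value. Nothing stops the counter
        # from exceeding LOGICAL_BITS, so the sum carries into the physical bits.
        ts = (self.last_physical << LOGICAL_BITS) + self.next_logical
        self.next_logical += 1
        return ts


def demo():
    clock = BuggyHybridClock()
    host = 1_000_000                 # arbitrary host time in microseconds
    clock.update(host + 1_000_000)   # another node is one second ahead
    # Many local timestamps while the host clock still lags the stored value.
    stamps = [clock.now(host) for _ in range(10_000)]
    # The host clock finally moves one microsecond past the stored value.
    after_reset = clock.now(host + 1_000_001)
    return stamps[-1], after_reset


if __name__ == "__main__":
    last_before, after_reset = demo()
    print(last_before >> LOGICAL_BITS, after_reset >> LOGICAL_BITS)
    print("went backwards:", after_reset < last_before)
```

In this model the 10,000 local timestamps overflow the 12-bit logical portion and carry two microseconds into the physical bits, so the timestamp issued right after the reset lands below the last one issued before it.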