Thanks for the insight, Jonathan. That was my intuition as well, and I observed my cluster periodically marking nodes as down for a second or two. I figured it was random network hiccups, since our network hardware is rather old. It would make sense that these periodic interruptions caused BMT to lose data.
While looking through the code, I did try to see if I could use BMT with the blocking MessagingService API (the way the Thrift API works unless ConsistencyLevel.ZERO is specified), but it looks like BMT is hardcoded to be asynchronous. It might be nice to have that option, but since this issue appears to affect only me (and I no longer need BMT for my purposes), it's a super-low-priority suggestion.