Let's assume we query partitions A and B and see that the results' timestamps don't match. We would pull the latest batchlog on the assumption that they belong to the same batch, but suppose they in fact do not. In that case we've wasted a lot of time, so my question is: should we only do this if the user supplies a new CL type?
If you set the same, unique (e.g., UUID) write timestamp for all writes in a batch, then you know that any results with different timestamps belong to different batches. So, given mismatched timestamps, should you check the batchlog for pending writes? One solution is to always check (as in RAMP-Small). This doesn't require any extra metadata but, as you point out, requires 2 RTTs. To cut down on these RTTs, you could instead attach a Bloom filter of the items in each batch and only check for possibly missing writes (as in RAMP-Hybrid). (I can go into more detail if you want.) However, I agree that you might not want to pay these costs on every read. Would a BATCH_READ or other modifier to CQL SELECT statements make sense?
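To make the RAMP-Hybrid idea concrete, here is a minimal sketch of the second-round decision, assuming each write carries its batch timestamp plus a small Bloom filter over its sibling keys. All names (`write_batch`, `ramp_hybrid_read`, the version-list store) are illustrative, not Cassandra APIs; a real implementation would fetch second-round versions from the batchlog rather than a local dict.

```python
import hashlib

class Bloom:
    """Tiny Bloom filter over the keys written in one batch (illustrative)."""
    def __init__(self, items, m=64, k=3):
        self.m, self.k, self.bits = m, k, 0
        for item in items:
            for b in self._hashes(item):
                self.bits |= 1 << b

    def _hashes(self, item):
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def maybe_contains(self, item):
        # False positives possible; false negatives are not.
        return all(self.bits >> b & 1 for b in self._hashes(item))

def write_batch(store, batch_ts, kv):
    """Append one version per key; every version carries the batch's
    timestamp and a Bloom filter of its sibling keys."""
    siblings = Bloom(kv.keys())
    for k, v in kv.items():
        store.setdefault(k, []).append((batch_ts, v, siblings))

def ramp_hybrid_read(first_round, fetch_version):
    """first_round: key -> (batch_ts, value, bloom) from round 1.
    A second-round fetch for key k happens only when some other key's
    version has a newer batch timestamp AND its Bloom filter says k was
    part of that newer batch."""
    result = {}
    for k, (ts, v, _) in first_round.items():
        needed = max((ots for ok, (ots, _, obloom) in first_round.items()
                      if ok != k and ots > ts and obloom.maybe_contains(k)),
                     default=ts)
        if needed > ts:
            v = fetch_version(k, needed)  # the extra RTT, taken only on demand
        result[k] = v
    return result

# Simulate a racy first round: the reader saw y's batch-2 version but
# only x's batch-1 version.
store = {}
write_batch(store, 1, {"x": "x1", "y": "y1"})
write_batch(store, 2, {"x": "x2", "y": "y2"})
first = {"x": store["x"][0], "y": store["y"][1]}

fetch = lambda k, ts: next(vv for vts, vv, _ in store[k] if vts == ts)
print(ramp_hybrid_read(first, fetch))  # x is repaired to its batch-2 value
```

With matching timestamps (or with no Bloom-filter hit), the read completes in one round trip; the second round is paid only when the metadata suggests a fractured batch, which is the cost/metadata trade-off between RAMP-Small and RAMP-Hybrid described above.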
In the case of a global index we plan on reading the data after reading the index. The data query might reveal that the indexed value is stale. We would need to apply the batchlog and fix the index; would we then restart the entire query, or maybe over-query on the assumption that some index values will be stale? Either way, this query looks different from the scenario above.
I think there are a few options. The easiest is to simply filter out the out-of-date rows, and then you are guaranteed to see a subset of the index entries. Alternatively, you could provide a "snapshot index read" where you read the older, overwritten values from the data node. If you want a "read latest and read snapshot" mode, there are some options I can describe, but they generally entail either more metadata or, otherwise, locks/blocking coordination, which I don't think you want.