ShortReadRowProtection.moreContents() expects that by the time we get to that method, the global post-reconciliation counter was already applied to the current partition. However, sometimes it won’t get applied, and the global counter continues counting with rowInCurrentPartition value not reset from previous partition, which in the most obvious case would trigger the assertion we are observing - assert !postReconciliationCounter.isDoneForPartition();. In other cases it’s possible because of this lack of reset to query a node for too few extra rows, causing unnecessary SRP data requests.
Why is the counter not always applied to the current partition?
The merged PartitionIterator returned from DataResolver.resolve() has two transformations applied to it, in the following order:
Filter - to purge non-live data from partitions, and to discard empty partitions altogether (except for Thrift)
Counter, to count and stop iteration
Problem is, Filter ’s applyToPartition() code that discards empty partitions (closeIfEmpty() method) would sometimes consume the iterator, triggering short read protection before Counter ’s applyToPartition() gets called and resets its rowInCurrentPartition sub-counter.
We should not be consuming iterators until all transformations are applied to them. For transformations it means that they cannot consume iterators unless they are the last transformation on the stack.
The linked branch fixes the problem by splitting Filter into two transformations. The original - Filter - that does filtering within partitions - and a separate EmptyPartitionsDiscarder, that discards empty partitions from PartitionIterators. Thus DataResolve.resolve(), when constructing its PartitionIterator, now does merge first, then applies Filter, then Counter, and only then, as its last (third) transformation - the EmptyPartitionsDiscarder. Being the last one applied, it’s legal for it to consume the iterator, and triggering moreContents() is now no longer a problem.