Uploaded image for project: 'Cassandra'
  1. Cassandra
  2. CASSANDRA-13747

Fix AssertionError in short read protection

    XMLWordPrintableJSON

    Details

    • Severity:
      Normal

      Description

      ShortReadRowProtection.moreContents() expects that by the time we get to that method, the global post-reconciliation counter was already applied to the current partition. However, sometimes it won’t get applied, and the global counter continues counting with rowInCurrentPartition value not reset from previous partition, which in the most obvious case would trigger the assertion we are observing - assert !postReconciliationCounter.isDoneForPartition();. In other cases it’s possible because of this lack of reset to query a node for too few extra rows, causing unnecessary SRP data requests.

      Why is the counter not always applied to the current partition?

      The merged PartitionIterator returned from DataResolver.resolve() has two transformations applied to it, in the following order:
      Filter - to purge non-live data from partitions, and to discard empty partitions altogether (except for Thrift)
      Counter, to count and stop iteration

      Problem is, Filter ’s applyToPartition() code that discards empty partitions (closeIfEmpty() method) would sometimes consume the iterator, triggering short read protection before Counter ’s applyToPartition() gets called and resets its rowInCurrentPartition sub-counter.

      We should not be consuming iterators until all transformations are applied to them. For transformations it means that they cannot consume iterators unless they are the last transformation on the stack.

      The linked branch fixes the problem by splitting Filter into two transformations. The original - Filter - that does filtering within partitions - and a separate EmptyPartitionsDiscarder, that discards empty partitions from PartitionIterators. Thus DataResolve.resolve(), when constructing its PartitionIterator, now does merge first, then applies Filter, then Counter, and only then, as its last (third) transformation - the EmptyPartitionsDiscarder. Being the last one applied, it’s legal for it to consume the iterator, and triggering moreContents() is now no longer a problem.

      Fixes: 3.0, 3.11, 4.0. dtest here.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                aleksey Aleksey Yeschenko
                Reporter:
                aleksey Aleksey Yeschenko
                Authors:
                Aleksey Yeschenko
                Reviewers:
                Benedict Elliott Smith
              • Votes:
                0 Vote for this issue
                Watchers:
                10 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: