Cassandra / CASSANDRA-13911

IllegalStateException thrown by UPI.Serializer.hasNext() for some SELECT queries

Details

    • Normal

    Description

      Certain combinations of rows, in the presence of a per-partition limit (set explicitly in 3.6+ or implicitly to 1 via DISTINCT), cause UnfilteredPartitionIterators.Serializer.hasNext() to throw IllegalStateException.

      Relevant code snippet:

      // We can't answer this until the previously returned iterator has been fully consumed,
      // so complain if that's not the case.
      if (next != null && next.hasNext())
          throw new IllegalStateException("Cannot call hasNext() until the previous iterator has been fully consumed");
      

      Since UnfilteredPartitionIterators.Serializer and UnfilteredRowIteratorSerializer.serializer deserialize partitions/rows lazily, the partition iterator can only operate correctly if the previous partition has been fully consumed, so that deserializing the next one starts from the correct position in the byte buffer. However, that condition won’t always be satisfied, as there are legitimate combinations of rows for which not every row in every partition gets consumed.
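
      To make the constraint concrete, here is a minimal sketch of a lazily deserializing partition iterator over a single byte buffer. It is a toy model, not Cassandra's code: the class name LazyPartitions, the encode helper, and the marker-byte wire format are all assumptions made for illustration. Only the guard in hasNext() mirrors the snippet quoted above.

```java
import java.nio.ByteBuffer;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical toy model; not Cassandra's classes or wire format.
final class LazyPartitions implements Iterator<Iterator<Integer>> {
    private final ByteBuffer in;
    private Iterator<Integer> current; // row iterator returned by the last next()

    LazyPartitions(ByteBuffer in) { this.in = in; }

    // Assumed layout: each partition is [1] rows... [0], each row is [1][int],
    // and a final [0] terminates the stream.
    static ByteBuffer encode(int[][] partitions) {
        int size = 1;
        for (int[] rows : partitions) size += 2 + rows.length * 5;
        ByteBuffer out = ByteBuffer.allocate(size);
        for (int[] rows : partitions) {
            out.put((byte) 1);
            for (int row : rows) { out.put((byte) 1); out.putInt(row); }
            out.put((byte) 0);
        }
        out.put((byte) 0);
        out.flip();
        return out;
    }

    @Override
    public boolean hasNext() {
        // Same guard as in the ticket: the buffer position only points at a
        // partition boundary once the previous rows have all been read.
        if (current != null && current.hasNext())
            throw new IllegalStateException("Cannot call hasNext() until the previous iterator has been fully consumed");
        return in.get(in.position()) == 1; // peek at the partition marker
    }

    @Override
    public Iterator<Integer> next() {
        if (!hasNext()) throw new NoSuchElementException();
        in.get(); // consume the partition marker
        current = new Iterator<Integer>() {
            private boolean done;

            @Override
            public boolean hasNext() {
                if (done) return false;
                if (in.get(in.position()) == 1) return true; // peek at the row marker
                in.get(); // consume the end-of-partition marker
                done = true;
                return false;
            }

            @Override
            public Integer next() {
                if (!hasNext()) throw new NoSuchElementException();
                in.get();           // consume the row marker
                return in.getInt(); // deserialize exactly one row, on demand
            }
        };
        return current;
    }
}
```

      In this sketch, reading one row of a three-row partition and then calling hasNext() on the outer iterator throws, because the shared buffer position is still in the middle of the first partition.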

      For example, look at this dtest.

      Suppose we end up with the following pattern of rows:

      node1, partition 0 | 0
      node2, partition 0 |   x x
      

      Here, the two x marks are row tombstones for rows 1 and 2. It’s sufficient for MergeIterator to look only at row 0 in the partition from node1 and at row tombstone 1 from node2 to satisfy the per partition limit of 1. The stopping merge result counter will stop iteration right there, leaving row tombstone 2 from node2 unvisited and not deserialized. Switching to the next partition will in turn trigger the IllegalStateException, because the previous partition from node2 has not been fully consumed.
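
      The scenario can be sketched with a simplified two-way merge that stops once a limit of live rows is reached. This is only an illustration under assumptions: LimitedMerge, Row, and mergeWithLimit are hypothetical names, and the real MergeIterator and counter are considerably more involved.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical toy model of the scenario; none of these names are Cassandra's.
final class LimitedMerge {
    // A row identified by its clustering value; a tombstone has live == false.
    static final class Row {
        final int clustering;
        final boolean live;
        Row(int clustering, boolean live) { this.clustering = clustering; this.live = live; }
    }

    // Merges two clustering-ordered replica streams and stops as soon as
    // `limit` live rows have been counted. Returns the number of live rows.
    static int mergeWithLimit(Iterator<Row> a, Iterator<Row> b, int limit) {
        // The merge has to look at the head of every source before it can emit anything.
        Row headA = a.hasNext() ? a.next() : null;
        Row headB = b.hasNext() ? b.next() : null;
        int live = 0;
        while (live < limit && (headA != null || headB != null)) {
            int next;
            if (headA == null) next = headB.clustering;
            else if (headB == null) next = headA.clustering;
            else next = Math.min(headA.clustering, headB.clustering);

            // Reconcile all versions of the smallest clustering; a tombstone wins.
            boolean isLive = true;
            if (headA != null && headA.clustering == next) {
                isLive &= headA.live;
                headA = a.hasNext() ? a.next() : null;
            }
            if (headB != null && headB.clustering == next) {
                isLive &= headB.live;
                headB = b.hasNext() ? b.next() : null;
            }
            if (isLive) live++;
        }
        return live;
    }

    // The ticket's pattern: node1 holds live row 0, node2 holds tombstones for rows 1 and 2.
    static boolean node2HasUnreadRows() {
        Iterator<Row> node1 = Arrays.asList(new Row(0, true)).iterator();
        Iterator<Row> node2 = Arrays.asList(new Row(1, false), new Row(2, false)).iterator();
        mergeWithLimit(node1, node2, 1);
        return node2.hasNext(); // tombstone 2 was never pulled from node2
    }
}
```

      With a limit of 1, the merge stops after emitting row 0, having pulled only tombstone 1 from node2, so node2HasUnreadRows() reports that node2's iterator is left partially consumed.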

      The stopping counter is behaving correctly, and so is the MergeIterator. I’ll note that simply removing that condition is not enough to fix the problem properly: it’d just cause us to deserialize garbage, trying to deserialize a new partition from a position in the byte buffer that precedes the remaining rows of the previous partition.
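
      A small sketch of why dropping the check would yield garbage. The layout used here ([int key][int rowCount][int row]...) and the names MisalignedRead and nextKeyAfterReading are assumptions for illustration only, not Cassandra's serialization format.

```java
import java.nio.ByteBuffer;

// Hypothetical toy layout: a partition is [int key][int rowCount][int row]...
final class MisalignedRead {
    static ByteBuffer twoPartitions() {
        ByteBuffer buf = ByteBuffer.allocate(32);
        buf.putInt(100).putInt(3).putInt(0).putInt(1).putInt(2); // partition 100: rows 0, 1, 2
        buf.putInt(200).putInt(1).putInt(7);                     // partition 200: row 7
        buf.flip();
        return buf;
    }

    // Reads the first partition's header plus `rowsToRead` rows, then naively
    // treats the current position as the start of the next partition.
    static int nextKeyAfterReading(int rowsToRead) {
        ByteBuffer buf = twoPartitions();
        buf.getInt(); // key 100
        buf.getInt(); // row count 3
        for (int i = 0; i < rowsToRead; i++) buf.getInt();
        return buf.getInt(); // interpreted as the next partition key
    }
}
```

      In this toy layout, nextKeyAfterReading(3) returns the real key 200, while nextKeyAfterReading(1) returns 1, which is row 1 of the first partition misread as a partition key.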

          People

            Aleksey Yeschenko
            Sam Tunnicliffe