Details
-
Bug
-
Status: Resolved
-
Minor
-
Resolution: Fixed
-
2.2.0
-
None
Description
In stream aggregation, last('attr) yields the last value from the first microbatch forever. I'm not sure if it's fair to call this a correctness bug, since last doesn't have strong correctness semantics, but ignoring all rows past the first microbatch is at least weird.
Simple repro in StreamingAggregationSuite:
val input = MemoryStream[Int]
val aggregated = input.toDF().agg(last('value))
testStream(aggregated, OutputMode.Complete())(
AddData(input, 1, 2, 3),
CheckLastBatch(3),
AddData(input, 4, 5, 6),
CheckLastBatch(6) // actually yields 3 again