-
Type:
Bug
-
Status: Resolved
-
Priority:
Minor
-
Resolution: Fixed
-
Affects Version/s: 2.2.0
-
Fix Version/s: 2.3.0
-
Component/s: Structured Streaming
-
Labels:None
In stream aggregation, last('attr) yields the last value from the first microbatch forever. I'm not sure if it's fair to call this a correctness bug, since last doesn't have strong correctness semantics, but ignoring all rows past the first microbatch is at least weird.
Simple repro in StreamingAggregationSuite:
val input = MemoryStream[Int]
val aggregated = input.toDF().agg(last('value))
testStream(aggregated, OutputMode.Complete())(
AddData(input, 1, 2, 3),
CheckLastBatch(3),
AddData(input, 4, 5, 6),
CheckLastBatch(6) // actually yields 3 again