Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- 1.13.1
Description
I tried to reimplement the Hourly Tips exercise from the DataStream training using Flink SQL. The objective of this exercise is to find the one taxi driver who earned the most in tips during each hour, and report that driver's driverId and the sum of their tips.
This can be expressed as a window Top-N query with n = 1:
SELECT *
FROM (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY window_start, window_end ORDER BY sumOfTips DESC) AS rownum
  FROM (
    SELECT driverId, window_start, window_end, SUM(tip) AS sumOfTips
    FROM TABLE(
      TUMBLE(TABLE fares, DESCRIPTOR(startTime), INTERVAL '1' HOUR))
    GROUP BY driverId, window_start, window_end
  )
) WHERE rownum = 1;
This fails because the WindowRankOperatorBuilder insists on rankEnd > 1. In other words, while it is possible to report the top 2 drivers, or the driver in 2nd place, it is not possible to report only the top driver.
This appears to be an off-by-one error in the range checking.
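To illustrate the suspected off-by-one, here is a minimal sketch that contrasts a strict check with an inclusive one. The class and method names are hypothetical and this is not the actual WindowRankOperatorBuilder code; it only assumes that the validation should accept any rankEnd of at least 1.

```java
// Hypothetical sketch of the range check; not taken from the Flink code base.
final class RankRangeCheck {

    // Assumed shape of the current check: rejects rankEnd == 1, so Top-1 is refused.
    public static boolean acceptsStrict(long rankEnd) {
        return rankEnd > 1;
    }

    // Assumed shape of a corrected check: any rankEnd of at least 1 is accepted.
    public static boolean acceptsInclusive(long rankEnd) {
        return rankEnd >= 1;
    }

    public static void main(String[] args) {
        // Print how each check treats rankEnd values around the boundary.
        for (long rankEnd = 0; rankEnd <= 2; rankEnd++) {
            System.out.printf(
                "rankEnd=%d strict=%b inclusive=%b%n",
                rankEnd, acceptsStrict(rankEnd), acceptsInclusive(rankEnd));
        }
    }
}
```

With the strict form, the query above (rownum = 1) is rejected while rownum <= 2 or rownum = 2 would pass; with the inclusive form, all three would be accepted.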