A session window is a collection of rows whose key values, when sorted, are each at most N away from the next key value in the window.
Q1. Should "at most" be "less than"?
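To make Q1 concrete, here is a minimal Python sketch of the definition above. It is only an illustration, not a proposed implementation; the function name `split_sessions` and the `inclusive` flag are hypothetical. The two readings differ only when two consecutive keys are exactly N apart.

```python
def split_sessions(keys, gap, inclusive=True):
    """Split key values into sessions (hypothetical helper, for illustration).

    inclusive=True  keeps consecutive keys together when their difference
                    is <= gap (the "at most" reading).
    inclusive=False keeps them together only when the difference is < gap
                    (the "less than" reading).
    """
    sessions = []
    for k in sorted(keys):
        if sessions:
            diff = k - sessions[-1][-1]  # needs only a minus operator on the key type
            same_session = diff <= gap if inclusive else diff < gap
            if same_session:
                sessions[-1].append(k)
                continue
        sessions.append([k])  # first key, or gap too large: start a new session
    return sessions


print(split_sessions([0, 3, 8, 20], gap=5, inclusive=True))   # [[0, 3, 8], [20]]
print(split_sessions([0, 3, 8, 20], gap=5, inclusive=False))  # [[0, 3], [8], [20]]
```

The keys 3 and 8 are exactly 5 apart, so they share a session under "at most" and are split under "less than".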
The key type can be any type that has a minus operator, that is, numeric and date-time.
I propose the following syntax: session(key [, ...]*, interval). For example, grouping by session(rowtime, productId, interval '5' second) would find bursts of orders for the same product where consecutive orders are no more than 5 seconds apart.
The first key column, rowtime, defines the session. It must be of numeric or date-time type, and must be monotonic (or similar) in order for the query to make progress. The other key columns (in this case productId) can be of any type. The last argument is the interval, and must be constant.
The session function returns the key value at the start of the window. Unlike the hop function, each row belongs to precisely one window. But session is not a true function, because its value depends on the records flowing in the stream.
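The behaviour described above can be sketched in Python as follows. This is a rough model under stated assumptions, not how an engine would implement it: the name `session_start` is hypothetical, rows are assumed to be (rowtime, productId) pairs, and the "at most" reading of the gap is assumed. Each row is labelled with the rowtime of the first row in its session, so every row lands in exactly one window.

```python
from datetime import datetime, timedelta


def session_start(rows, gap):
    """Label each (rowtime, product_id) row with the start of its session.

    Hypothetical helper for illustration. A new session starts for a product
    when the gap since that product's previous row exceeds `gap`.
    """
    last_seen = {}  # product_id -> rowtime of that product's most recent row
    start = {}      # product_id -> rowtime at which its current session began
    labelled = []
    # Sort on rowtime only: the other key columns may be of any type.
    for rowtime, product_id in sorted(rows, key=lambda r: r[0]):
        prev = last_seen.get(product_id)
        if prev is None or rowtime - prev > gap:
            start[product_id] = rowtime  # this row opens a new session
        last_seen[product_id] = rowtime
        labelled.append((rowtime, product_id, start[product_id]))
    return labelled


t0 = datetime(2016, 1, 1, 10, 0, 0)
orders = [
    (t0, "A"),
    (t0 + timedelta(seconds=3), "A"),
    (t0 + timedelta(seconds=4), "B"),
    (t0 + timedelta(seconds=7), "A"),
    (t0 + timedelta(seconds=20), "A"),
]
for rowtime, product_id, start in session_start(orders, timedelta(seconds=5)):
    # Print offsets from t0, in seconds, to keep the output short.
    print(product_id, (rowtime - t0).seconds, (start - t0).seconds)
```

The label of a row depends on which earlier rows arrived for the same product, which is why session cannot be computed from a single row's columns alone.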
Q2. If session is used, should we allow order-dependent aggregate functions such as first_value?
Q3. Should we allow session as a windowed aggregate function?