Details
Type: Sub-task
Status: Closed
Priority: Major
Resolution: Fixed
Description
Suggested Outline:
Windows

(0) Outline: The anatomy of a window operation

    stream
        [.keyBy(...)]            <- keyed versus non-keyed windows
         .window(...)            <- required: "assigner"
        [.trigger(...)]          <- optional: "trigger" (else default trigger)
        [.evictor(...)]          <- optional: "evictor" (else no evictor)
        [.allowedLateness()]     <- optional, else zero
         .reduce/fold/apply()    <- required: "function"

(1) Types of windows
    - tumble
    - slide
    - session
    - global

(2) Pre-defined windows
    timeWindow() (tumble, slide)
    countWindow() (tumble, slide)
        - mention that count windows inherently leak resources unless the key space is limited

(3) Window Functions
    - apply: most basic, iterates over elements in window
    - aggregating: reduce and fold, can be used with "apply()", which will then get one element
    - forward reference to state size section

(4) Advanced Windows
    - assigner
        - simple
        - merging
    - trigger
        - registering timers (processing time, event time)
        - state in triggers
    - life cycle of a window
        - create
        - state
        - cleanup
            - when are window contents purged
            - when is state dropped
            - when is metadata (like merging set) dropped

(5) Late data
    - picture
    - fire vs fire_and_purge: late accumulates vs late resurrects (cf. discarding mode)

(6) Evictors
    - TBD

(7) State size: How large will the state be?
    Basic rule: Each element has one copy per window it is assigned to
        --> num windows * num elements in window
        --> example: tumbling is one copy, sliding(n,m) is n/m copies
        --> per key
    Pre-aggregation:
        - if reduce or fold is set -> one element per window (rather than num elements in window)
        - evictor voids pre-aggregation from the perspective of state
    Special rules:
        - fold cannot pre-aggregate on session windows (and other merging windows)

(8) Non-keyed windows
    - all elements go through the same windows
    - currently not parallel
    - possibly parallel in the future when having pre-aggregation functions
    - inherently (by definition) produce a result stream with parallelism one
    - state similar to one key of keyed windows
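
A possible worked example for the state size section (7), written as a rough sketch rather than anything taken from Flink itself. The class and method names (WindowStateEstimate, copiesPerElement, stateElementsPerKey) are hypothetical and use no Flink API. The sketch assumes the window size is a whole multiple of the slide and simply turns the rules from the outline into arithmetic: n/m copies per element for sliding(n,m), one element per window with pre-aggregation, and an evictor voiding pre-aggregation.

```java
// Hypothetical helper, not part of Flink: back-of-the-envelope state size per key.
final class WindowStateEstimate {
    private WindowStateEstimate() {}

    /**
     * Number of windows each element is assigned to, i.e. copies kept in state.
     * Tumbling windows are the case size == slide, which yields one copy.
     * Assumption: size is a positive whole multiple of slide.
     */
    static long copiesPerElement(long size, long slide) {
        if (size <= 0 || slide <= 0 || size % slide != 0) {
            throw new IllegalArgumentException("size must be a positive multiple of slide");
        }
        return size / slide;
    }

    /**
     * Elements held in state for one key:
     * num windows * num elements in window, or one element per window
     * when reduce/fold pre-aggregation applies. An evictor voids pre-aggregation.
     */
    static long stateElementsPerKey(long elementsPerWindow, long size, long slide,
                                    boolean preAggregated, boolean hasEvictor) {
        long windows = copiesPerElement(size, slide);
        boolean effectivePreAggregation = preAggregated && !hasEvictor;
        return windows * (effectivePreAggregation ? 1 : elementsPerWindow);
    }

    public static void main(String[] args) {
        // sliding(10, 5) with 1000 elements per window, no pre-aggregation
        System.out.println(stateElementsPerKey(1000, 10, 5, false, false));
        // same window with reduce/fold pre-aggregation
        System.out.println(stateElementsPerKey(1000, 10, 5, true, false));
        // pre-aggregation requested, but an evictor is set
        System.out.println(stateElementsPerKey(1000, 10, 5, true, true));
    }
}
```

The special rule for fold on session (merging) windows is not modelled here; a caller would have to pass preAggregated = false for that case.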