Details

    • Sub-task
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • 1.2.0, 1.3.0
    • Documentation
    • None

    Description

      Suggested Outline:

      Windows
      
      (0) Outline: The anatomy of a window operation
      
        stream
           [.keyBy(...)]         <-  keyed versus non-keyed windows
            .window(...)         <-  required: "assigner"
           [.trigger(...)]       <-  optional: "trigger" (else default trigger)
           [.evictor(...)]       <-  optional: "evictor" (else no evictor)
           [.allowedLateness()]  <-  optional, else zero
            .reduce/fold/apply() <-  required: "function"
      
      (1) Types of windows
      
        - tumble
        - slide
        - session
        - global
      
      (2) Pre-defined windows
      
         timeWindow() (tumble, slide)
         countWindow() (tumble, slide)
           - mention that count windows are inherently
             resource leaky unless limited key space
      
      (3) Window Functions
      
        - apply: most basic, iterates over elements in window
        
        - aggregating: reduce and fold, can be used with "apply()" which will get one element
        
        - forward reference to state size section
      
      (4) Advanced Windows
      
        - assigner
          - simple
          - merging
        - trigger
          - registering timers (processing time, event time)
          - state in triggers
        - life cycle of a window
          - create
          - state
          - cleanup
            - when is window contents purged
            - when is state dropped
            - when is metadata (like merging set) dropped
      
      (5) Late data
        - picture
        - fire vs fire_and_purge: late accumulates vs late resurrects (cf discarding mode)
        
      (6) Evictors
        - TDB
        
      (7) State size: How large will the state be?
      
      Basic rule: Each element has one copy per window it is assigned to
        --> num windows * num elements in window
        --> example: tumbline is one copy, sliding(n,m) is n/m copies
        --> per key
      
      Pre-aggregation:
        - if reduce or fold is set -> one element per window (rather than num elements in window)
        - evictor voids pre-aggregation from the perspective of state
      
      Special rules:
        - fold cannot pre-aggregate on session windows (and other merging windows)
      
      
      (8) Non-keyed windows
        - all elements through the same windows
        - currently not parallel
        - possible parallel in the future when having pre-aggregation functions
        - inherently (by definition) produce a result stream with parallelism one
        - state similar to one key of keyed windows
      

      Attachments

        Activity

          People

            kkl0u Kostas Kloudas
            sewen Stephan Ewen
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: