Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.2.0, 1.3.0
    • Component/s: Documentation
    • Labels:
      None

      Description

      Suggested Outline:

      Windows
      
      (0) Outline: The anatomy of a window operation
      
        stream
           [.keyBy(...)]         <-  keyed versus non-keyed windows
            .window(...)         <-  required: "assigner"
           [.trigger(...)]       <-  optional: "trigger" (else default trigger)
           [.evictor(...)]       <-  optional: "evictor" (else no evictor)
           [.allowedLateness()]  <-  optional, else zero
            .reduce/fold/apply() <-  required: "function"
      
      (1) Types of windows
      
        - tumble
        - slide
        - session
        - global
      
      (2) Pre-defined windows
      
         timeWindow() (tumble, slide)
         countWindow() (tumble, slide)
           - mention that count windows are inherently
             resource leaky unless limited key space
      
      (3) Window Functions
      
        - apply: most basic, iterates over elements in window
        
        - aggregating: reduce and fold, can be used with "apply()" which will get one element
        
        - forward reference to state size section
      
      (4) Advanced Windows
      
        - assigner
          - simple
          - merging
        - trigger
          - registering timers (processing time, event time)
          - state in triggers
        - life cycle of a window
          - create
          - state
          - cleanup
            - when is window contents purged
            - when is state dropped
            - when is metadata (like merging set) dropped
      
      (5) Late data
        - picture
        - fire vs fire_and_purge: late accumulates vs late resurrects (cf discarding mode)
        
      (6) Evictors
        - TDB
        
      (7) State size: How large will the state be?
      
      Basic rule: Each element has one copy per window it is assigned to
        --> num windows * num elements in window
        --> example: tumbline is one copy, sliding(n,m) is n/m copies
        --> per key
      
      Pre-aggregation:
        - if reduce or fold is set -> one element per window (rather than num elements in window)
        - evictor voids pre-aggregation from the perspective of state
      
      Special rules:
        - fold cannot pre-aggregate on session windows (and other merging windows)
      
      
      (8) Non-keyed windows
        - all elements through the same windows
        - currently not parallel
        - possible parallel in the future when having pre-aggregation functions
        - inherently (by definition) produce a result stream with parallelism one
        - state similar to one key of keyed windows
      

        Attachments

          Activity

            People

            • Assignee:
              kkl0u Kostas Kloudas
              Reporter:
              StephanEwen Stephan Ewen
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: