Spark / SPARK-47698

Current doc section handling-late-data-and-watermarking is misleading after the support for multiple stateful operators

Details

    • Type: Documentation
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.4.0
    • Fix Version/s: None
    • Component/s: Documentation
    • Labels: None

    Description

      The handling-late-data-and-watermarking section of the doc, which uses diagrams to explain the nuances of handling late records, is now out of date and somewhat misleading. The mismatch dates from the introduction of support for multiple stateful operators in https://issues.apache.org/jira/browse/SPARK-40925

       

      Previously, the watermark for batch N was applied to filter out late inputs in batch N. With support for multiple stateful operators, the watermark for batch N-1 is applied to the inputs of batch N instead. The doc section above should reflect this new behavior to avoid confusion.
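
To make the difference concrete, here is a minimal toy simulation in plain Python. It is not Spark code and does not reproduce Spark's internals. The function name `simulate`, the `lag` parameter, the integer event times, and the rule "watermark = max event time seen so far minus the delay" are all assumptions made for illustration. It only contrasts filtering batch N with the watermark for batch N (`lag=0`) against filtering it with the watermark for batch N-1 (`lag=1`).

```python
from typing import List


def simulate(batches: List[List[int]], delay: int, lag: int) -> List[List[int]]:
    """Toy model of late-record filtering (hypothetical, not Spark's implementation).

    batches: event times per micro-batch.
    delay:   watermark delay threshold, in the same units as the event times.
    lag:     0 -> filter batch N with the watermark for batch N,
             1 -> filter batch N with the watermark for batch N-1.
    Returns the event times kept in each batch.
    """
    watermarks = [0]        # watermarks[n] is the watermark in effect for batch n
    max_event_time = 0
    kept = []
    for n, batch in enumerate(batches):
        # Pick which batch's watermark is used to drop late records.
        threshold = watermarks[max(n - lag, 0)]
        kept.append([t for t in batch if t >= threshold])
        # Assumption: the watermark for the next batch is derived from every
        # event time seen so far, and it never moves backwards.
        max_event_time = max([max_event_time] + batch)
        watermarks.append(max(watermarks[-1], max_event_time - delay))
    return kept


batches = [[10, 20], [30, 12], [14, 40]]
old_behavior = simulate(batches, delay=5, lag=0)  # [[10, 20], [30], [40]]
new_behavior = simulate(batches, delay=5, lag=1)  # [[10, 20], [30, 12], [40]]
```

In this toy model, the record with event time 12 in the second batch is dropped under the old behavior but kept under the new one. That one-batch shift is what the diagrams in the doc section do not show.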

      Attachments

        1. image-2024-04-02-15-04-34-287.png
          253 kB
          Lingeshwaran Radhakrishnan

          People

            Assignee: Unassigned
            Reporter: Lingeshwaran Radhakrishnan (lingesh)
            Votes: 0
            Watchers: 1
