Spark / SPARK-16454

Consider adding a per-batch transform for structured streaming

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Structured Streaming

    Description

      The new Structured Streaming API lacks the DStream transform functionality, which allowed one to mix in existing RDD transformation logic. It would be useful to be able to do per-batch processing as was done in the DStream API, even without any specific guarantees about the batch being complete, provided you eventually get called with the "catch up" records.

      This might be useful for implementing Streaming Machine Learning on Structured Streaming.
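
As a rough illustration of the per-batch semantics requested above, here is a minimal plain-Python sketch. It does not use Spark: the names `count_words`, `transform_per_batch`, `merge_counts`, and the sample batches are all hypothetical, chosen only to model the idea. Existing batch-only logic is applied to each micro-batch as it arrives, no batch is assumed to be complete, and a late record simply shows up in a later batch.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Batch = List[str]
Counts = Dict[str, int]


def count_words(batch: Batch) -> Counts:
    """Existing batch-only logic (a stand-in for reusable RDD code)."""
    counts: Counts = {}
    for line in batch:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


def transform_per_batch(
    batches: Iterable[Batch], fn: Callable[[Batch], Counts]
) -> Iterator[Counts]:
    """Apply fn to each micro-batch independently, one result per batch."""
    for batch in batches:
        yield fn(batch)


def merge_counts(per_batch: Iterable[Counts]) -> Counts:
    """Combine per-batch results; correct once late records have arrived."""
    totals: Counts = {}
    for counts in per_batch:
        for word, n in counts.items():
            totals[word] = totals.get(word, 0) + n
    return totals


# Hypothetical stream: the record "a" in the last batch is a late
# "catch up" record that logically belonged with the first batch.
batches = [["a b"], ["b c"], ["a"]]

results = list(transform_per_batch(batches, count_words))
totals = merge_counts(results)
```

The point of the sketch is that `count_words` knows nothing about streaming. Only the thin per-batch wrapper does, which is the property the DStream transform offered for RDD code.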

            People

              Assignee: Unassigned
              Reporter: Holden Karau (holden)
              Votes: 0
              Watchers: 6
