  1. Spark
  2. SPARK-25302

ReducedWindowedDStream not using checkpoints for reduced RDDs


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.0.0, 2.0.1, 2.0.2, 2.1.0, 2.1.1, 2.1.2, 2.1.3, 2.2.0, 2.2.1, 2.2.2, 2.3.0, 2.3.1
    • Fix Version/s: None
    • Component/s: DStreams
    • Labels: Important

    Description

      Calling reduceByKeyAndWindow() with an inverse reduce function eventually creates a ReducedWindowedDStream. This class creates a reducedDStream but only persists it and does not checkpoint it. As a result, it keeps relying on cached RDDs and never cuts the lineage back to the input DStream, so the input RDDs end up being cached for much longer than they are needed.
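
      To illustrate why the lineage matters, here is a minimal plain-Python sketch (not Spark code) of the incremental computation that an inverse reduce function makes possible. Each window is derived from the previous window by adding the batch that entered and subtracting the batch that left. The names `add`, `sub`, and `incremental_windows` are hypothetical, and the slide interval is assumed to be one batch. Because every result depends on the previous result plus per-batch reduced values, the chain of dependencies keeps reaching back toward earlier inputs unless something truncates it, which is the role a checkpoint would play.

```python
from collections import Counter


def add(a, b):
    """Reduce function: merge two per-key count maps."""
    out = Counter(a)
    out.update(b)
    return out


def sub(a, b):
    """Inverse reduce function: remove the counts of a batch that left the window."""
    out = Counter(a)
    out.subtract(b)
    # Drop keys whose count fell to zero so they do not linger in the result.
    return Counter({k: v for k, v in out.items() if v != 0})


def incremental_windows(batches, window):
    """Yield per-key counts for each sliding window (slide = 1 batch, an assumption)."""
    current = Counter()
    for i, batch in enumerate(batches):
        reduced = Counter(batch)          # per-batch reduce by key
        current = add(current, reduced)   # the new batch enters the window
        if i >= window:
            # The oldest batch leaves the window; its reduced value is still needed here.
            current = sub(current, Counter(batches[i - window]))
        yield dict(current)


batches = [["a", "b"], ["a"], ["b", "c"], ["c"]]
results = list(incremental_windows(batches, window=2))
```

      Note that the sketch needs the reduced value of the batch leaving the window, and each new window is built from the previous one. In a lineage-based system, that is why these intermediate results hold references to their inputs until the lineage is cut.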

          People

            Unassigned
            Nikunj Bansal (nikunj)
            Tathagata Das
            Votes: 0
            Watchers: 2