[FLINK-28964] Release Testing: Verify FLIP-205 Cache in DataStream for Batch Processing - ASF JIRA

The DataStream API provides the `cache` method, which caches the result of a DataStream so that later jobs can reuse it in batch execution mode.

I think we should verify:

Follow the documentation to write a Flink job that produces a cache and a job that consumes it, and submit them to a session cluster (standalone or YARN).
You can remove the source physically after the cache-producing job is finished to verify that the cache-consuming job is not reading from the source. For example, delete the file in the filesystem if you are using a file source.
You can restart the TaskManager after the cache-producing job is finished to verify that the cache-consuming job will re-compute the result.
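A minimal sketch of a test program for the steps above, assuming Flink 1.16 with the `flink-connector-files` dependency on the classpath. The class name, the input path argument, the job names, and the 60-second pause are hypothetical choices for illustration. The program is not taken from the documentation verbatim. Both jobs are submitted from the same `StreamExecutionEnvironment`, so the second `execute` call can reuse the cache produced by the first one.

```java
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.connector.file.src.reader.TextLineInputFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.CachedDataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

/** Hypothetical release-testing program: one cache-producing job, one cache-consuming job. */
public class CacheVerificationJob {

    public static void main(String[] args) throws Exception {
        // Hypothetical argument: path of a text file readable by the cluster.
        String inputPath = args[0];

        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Caching is only meant to be used with batch execution mode.
        env.setRuntimeMode(RuntimeExecutionMode.BATCH);

        FileSource<String> source =
                FileSource.forRecordStreamFormat(new TextLineInputFormat(), new Path(inputPath))
                        .build();

        // cache() marks the result of the map operator as cacheable.
        CachedDataStream<String> cached =
                env.fromSource(source, WatermarkStrategy.noWatermarks(), "file-source")
                        .map(String::toUpperCase)
                        .cache();

        // Job 1: computes the map result from the file and produces the cache.
        cached.print();
        env.execute("cache-producing-job");

        // Assumed pause so the tester can delete the input file
        // or restart the TaskManager before the second job runs.
        Thread.sleep(60_000L);

        // Job 2: consumes the cache, or re-computes the result if the cache was lost.
        cached.print();
        env.execute("cache-consuming-job");
    }
}
```

With this layout, deleting the input file during the pause should leave the second job unaffected if it reads from the cache, while restarting the TaskManager during the pause should force the second job to read the file again. In the restart scenario the file has to stay in place, otherwise the re-computation cannot succeed.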