Details
- Sub-task
- Status: Closed
- Blocker
- Resolution: Done
- None
Description
The newly introduced Python DataStream API chaining optimization allows multiple Python DataStream API operators to be chained together, which avoids serialization and deserialization between them and improves performance.
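As a rough illustration of why chaining helps, the toy model below counts serialization round trips for the same three functions run as separate operators and as one fused operator. It uses only the standard library; `run_unchained` and `run_chained` are hypothetical names for this sketch, and it is not how Flink implements the optimization.

```python
import pickle


def run_unchained(functions, value):
    """Apply each function as its own operator, pickling the value after each one."""
    round_trips = 0
    for fn in functions:
        value = fn(value)
        # Simulated hand-off between two separate operators.
        value = pickle.loads(pickle.dumps(value))
        round_trips += 1
    return value, round_trips


def run_chained(functions, value):
    """Apply all functions inside one fused operator, pickling only the final result."""
    for fn in functions:
        value = fn(value)
    # Only the output of the whole chain is serialized.
    value = pickle.loads(pickle.dumps(value))
    return value, 1


def add_bang(text):
    return text + "!"


functions = [str.strip, str.upper, add_bang]
unchained_result, unchained_trips = run_unchained(functions, " flink ")
chained_result, chained_trips = run_chained(functions, " flink ")
print(unchained_result, unchained_trips)
print(chained_result, chained_trips)
```

Both variants produce the same value; only the number of simulated round trips differs.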
To test this new feature, I recommend following the documentation: https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/python/datastream/operators/overview/#operator-chaining
Note that the documentation PR was only just merged, so the page may not be available yet. If so, you can look at the documentation PR instead: https://github.com/apache/flink/pull/16953
Testing should cover, but is not limited to, the following items:
- Chaining can be enabled/disabled as described in the documentation
- Chaining works well for operators with multiple inputs / multiple outputs
- Chaining works well both in pure Python DataStream API jobs and in jobs that mix the Python Table API and the Python DataStream API
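A starting point for the first item could look like the sketch below. It assumes a local PyFlink installation; `build_plan`, `to_upper`, `add_suffix`, and the `chaining` switch are hypothetical names for this sketch, and `start_new_chain()` / `disable_chaining()` are used as described in the linked documentation. Whether and how the printed plan reflects Python operator chaining should be checked against the actual output rather than assumed.

```python
def to_upper(value):
    """First Python map function in the pipeline."""
    return value.upper()


def add_suffix(value):
    """Second Python map function, a candidate for chaining with the first."""
    return value + "_done"


def build_plan(chaining="default"):
    """Build a small map -> map job and return its execution plan as JSON.

    `chaining` is a switch made up for this sketch:
    "default", "start_new_chain", or "disable_chaining".
    """
    # Imported lazily so the helper functions load without PyFlink installed.
    from pyflink.datastream import StreamExecutionEnvironment

    env = StreamExecutionEnvironment.get_execution_environment()
    env.set_parallelism(1)

    stream = env.from_collection(["a", "b", "c"])
    second = stream.map(to_upper).map(add_suffix)

    if chaining == "start_new_chain":
        # Break the chain between the two map operators.
        second = second.start_new_chain()
    elif chaining == "disable_chaining":
        # Keep the second map out of any chain.
        second = second.disable_chaining()

    second.print()
    return env.get_execution_plan()
```

Printing `build_plan("default")` next to `build_plan("disable_chaining")` and comparing the nodes in the two plans is one way to see whether the two map functions were merged; running the job with `env.execute()` and checking the Web UI is another.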
Issue Links
- is blocked by
- FLINK-23929 Chaining optimization doesn't handle properly for transformations with multiple outputs
- Closed