Description
This ticket aims to verify https://issues.apache.org/jira/browse/FLINK-31634.
The verification consists of two parts.
Part 1. Run without remote storage.
This part verifies that the new mode can use the Memory tier and the Disk tier dynamically when shuffling.
Set the shuffle mode to the new hybrid shuffle mode (execution.batch-shuffle-mode: ALL_EXCHANGES_HYBRID_SELECTIVE) and run a simple job, for example TPC-DS q1.sql. When resources are sufficient, the upstream and downstream tasks should be able to run at the same time.
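As a minimal sketch of the Part 1 setup, the fragment below shows the single setting named above. It assumes the option is set in the cluster configuration file (flink-conf.yaml in a standard distribution); the file name is an assumption, the key and value are taken from this ticket.

```yaml
# Part 1: enable the new hybrid shuffle mode, no remote storage configured.
execution.batch-shuffle-mode: ALL_EXCHANGES_HYBRID_SELECTIVE
```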
Part 2. Run with remote storage.
This part verifies that the new mode can use the Memory tier, the Disk tier, and the Remote tier dynamically when shuffling.
2.1 Set the shuffle mode to the new hybrid shuffle mode (execution.batch-shuffle-mode: ALL_EXCHANGES_HYBRID_SELECTIVE).
2.2 Set the remote storage path with the option taskmanager.network.hybrid-shuffle.remote.path, for example oss://flink-runtime/runtime/shuffle. Note that this path must already exist in OSS.
2.3 Change the default value TieredStorageConfiguration#DEFAULT_MIN_RESERVE_DISK_SPACE_FRACTION to 1, rebuild the package, then run a simple job, for example TPC-DS q1.sql. Check that the shuffle data is written to the remote storage under oss://flink-runtime/runtime/shuffle.
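The configuration part of the Part 2 steps can be sketched as follows. This again assumes the options go into the cluster configuration file (flink-conf.yaml in a standard distribution, which is an assumption); both keys and values come from this ticket. The OSS filesystem setup and credentials needed to reach the bucket are environment-specific and are not shown.

```yaml
# Part 2: new hybrid shuffle mode plus a remote storage path for the Remote tier.
execution.batch-shuffle-mode: ALL_EXCHANGES_HYBRID_SELECTIVE
# The path must already exist in OSS before the job is started.
taskmanager.network.hybrid-shuffle.remote.path: oss://flink-runtime/runtime/shuffle
```

The change to DEFAULT_MIN_RESERVE_DISK_SPACE_FRACTION in step 2.3 is a source code change followed by a rebuild, so it does not appear in this fragment.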