Details
Type: Sub-task
Status: Closed
Priority: Blocker
Resolution: Fixed
Version: 1.5.0
Description
Similar to FLINK-8973, we should run the general purpose job (FLINK-8971) on a YARN session cluster and simulate failures.
The job jar should be deliberately ill-packaged, meaning that the user jar bundles too many dependencies. It should include the Scala library, Hadoop and Flink itself, so that we can verify that there are no class loading issues.
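A quick way to confirm that the jar really is ill-packaged before submitting it is to check which package prefixes it contains. The following is only a minimal sketch: the prefixes in EXPECTED_PREFIXES and the function name bundled_prefixes are assumptions chosen for illustration, not part of the test plan, and should be adjusted to whatever the build actually bundles.

```python
import zipfile

# Assumed package prefixes for the Scala library, Hadoop and Flink itself.
# These are illustrative; adjust them to the dependencies the build bundles.
EXPECTED_PREFIXES = ("scala/", "org/apache/hadoop/", "org/apache/flink/runtime/")


def bundled_prefixes(jar, prefixes=EXPECTED_PREFIXES):
    """Map each prefix to True if the jar holds at least one entry under it.

    `jar` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(jar) as zf:
        names = zf.namelist()
    return {p: any(n.startswith(p) for n in names) for p in prefixes}
```

For this test the jar is only suitable if every value in the returned mapping is True, for example `all(bundled_prefixes("job.jar").values())`, where `job.jar` is a placeholder path.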
The general purpose job should run with misbehavior activated. Additionally, we should simulate at least the following failure scenarios:
- Kill Flink processes
- Cut the connection to the storage system used for checkpoints and jobs
- Simulate network partition
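The first scenario, killing Flink processes, could be scripted roughly as follows. This is a sketch under assumptions: it expects `jps`-style output (one "pid MainClass" pair per line), and the class-name fragments passed in are supplied by the caller, since the exact process names depend on the deployment. The helper names find_pids and kill_all are hypothetical.

```python
import os
import signal


def find_pids(jps_output, name_fragments):
    """Return PIDs from `jps`-style output whose main class contains any fragment."""
    pids = []
    for line in jps_output.splitlines():
        parts = line.split(None, 1)
        # Skip blank or malformed lines (for example a PID without a class name).
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        pid, main_class = int(parts[0]), parts[1].strip()
        if any(fragment in main_class for fragment in name_fragments):
            pids.append(pid)
    return pids


def kill_all(pids, sig=signal.SIGKILL, kill=os.kill):
    """Send `sig` to every PID; `kill` is injectable so the logic can be dry-run."""
    for pid in pids:
        kill(pid, sig)
    return len(pids)
```

On a worker node this might be driven by `find_pids(subprocess.check_output(["jps"], text=True), ["TaskExecutor"])`, where the fragment "TaskExecutor" is an assumed match for the TaskManager process. SIGKILL is used so that the process gets no chance to shut down cleanly, which is the failure mode of interest.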
We should run the test with at least the following state backend configuration: RocksDB with incremental, asynchronous checkpoints written to HDFS.
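The state backend setup could be captured as configuration entries generated by the test harness. This is a sketch only: the option keys below are my understanding of the Flink 1.5 configuration names and should be verified against the configuration documentation, the function name render_flink_conf is hypothetical, and the checkpoint directory is a placeholder supplied by the caller.

```python
def render_flink_conf(checkpoint_dir):
    """Render flink-conf.yaml style lines for RocksDB with incremental checkpoints.

    Option keys are assumed from the Flink 1.5 configuration docs; verify them
    before use. `checkpoint_dir` is expected to be an HDFS URI.
    """
    options = {
        "state.backend": "rocksdb",
        "state.backend.incremental": "true",
        "state.checkpoints.dir": checkpoint_dir,
    }
    return "\n".join(f"{key}: {value}" for key, value in options.items())
```

No separate switch for asynchronous snapshots appears here, on the assumption that the RocksDB backend snapshots asynchronously by default.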