[FLINK-9004] Cluster test: Run general purpose job with failures with Yarn session - ASF JIRA

Details

Type: Sub-task
Status: Closed
Priority: Blocker
Resolution: Fixed
Affects Version/s: 1.5.0
Fix Version/s: 1.6.0
Component/s: Tests
Labels:
- pull-request-available

Description

Similar to FLINK-8973, we should run the general purpose job (FLINK-8971) on a Yarn session cluster and simulate failures.

The job jar should be ill-packaged, meaning that we include too many dependencies in the user jar. We should include the Scala library, Hadoop and Flink itself to verify that there are no class loading issues.

The general purpose job should run with misbehavior activated. Additionally, we should simulate at least the following failure scenarios:

- Kill Flink processes
- Kill connection to storage system for checkpoints and jobs
- Simulate network partition
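
The first scenario, killing Flink processes, could be driven by a small helper along the following lines. This is only a minimal sketch, not the test implemented for this issue: it assumes the Flink JVMs are visible to `jps` on the node, and the main class names used in the usage note (`YarnTaskExecutorRunner`, `YarnSessionClusterEntrypoint`) are assumptions about how the processes show up there. The helper names `find_pids` and `kill_all` are hypothetical.

```python
import os
import signal
from typing import List


def find_pids(jps_output: str, main_class: str) -> List[int]:
    """Return PIDs from `jps`-style output whose main class equals main_class."""
    pids = []
    for line in jps_output.splitlines():
        parts = line.split()
        # A regular `jps` line is "<pid> <main class>"; skip anything else.
        if len(parts) == 2 and parts[0].isdigit() and parts[1] == main_class:
            pids.append(int(parts[0]))
    return pids


def kill_all(pids: List[int]) -> None:
    """Send SIGKILL to each PID to simulate an abrupt process failure."""
    for pid in pids:
        os.kill(pid, signal.SIGKILL)
```

A test driver might call `find_pids(subprocess.check_output(["jps"], text=True), "YarnTaskExecutorRunner")` and pass the result to `kill_all`, then check that the job recovers from its latest checkpoint. Parsing is kept separate from killing so the selection logic can be checked without touching real processes.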

We should run the test with at least the following state backend configuration: RocksDB with incremental, asynchronous snapshots, checkpointing to HDFS.

Issue Links

links to

GitHub Pull Request #6239

GitHub Pull Request #6240

People

Assignee: Gary Yao

Reporter: Till Rohrmann

Votes: 0

Watchers: 3

Dates

Created: 16/Mar/18 13:03

Updated: 16/Jul/18 09:30

Resolved: 16/Jul/18 09:30