  1. Flink
  2. FLINK-9000 Add automated cluster framework tests
  3. FLINK-9004

Cluster test: Run general purpose job with failures with Yarn session

    Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.5.0
    • Fix Version/s: 1.6.0
    • Component/s: Tests

      Description

      Similar to FLINK-8973, we should run the general purpose job (FLINK-8971) on a Yarn session cluster and simulate failures.

      The job jar should be ill-packaged, meaning that we include too many dependencies in the user jar. We should include the Scala library, Hadoop and Flink itself to verify that there are no class loading issues.
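
One way to confirm that the user jar really is ill-packaged before submitting it is to scan its entries for classes from the three dependency groups named above. The following is a minimal sketch, not part of the ticket: the function names and the package prefixes (in particular `org/apache/flink/runtime/` as a stand-in for "Flink itself", chosen so that the job's own classes under `org/apache/flink/` do not count) are assumptions.

```python
import io
import zipfile

# Assumed package prefixes for the dependencies that should be bundled on purpose.
# "org/apache/flink/runtime/" is used instead of "org/apache/flink/" so that the
# test job's own classes are not mistaken for a bundled Flink distribution.
BUNDLED_PREFIXES = {
    "scala": "scala/",
    "hadoop": "org/apache/hadoop/",
    "flink": "org/apache/flink/runtime/",
}


def bundled_dependencies(jar):
    """Return the sorted labels of dependency groups that have a .class entry in the jar.

    `jar` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(jar) as archive:
        class_entries = [n for n in archive.namelist() if n.endswith(".class")]
    return sorted(
        label
        for label, prefix in BUNDLED_PREFIXES.items()
        if any(entry.startswith(prefix) for entry in class_entries)
    )


def is_ill_packaged(jar):
    """True if the jar bundles Scala, Hadoop and Flink runtime classes."""
    return bundled_dependencies(jar) == sorted(BUNDLED_PREFIXES)
```

A test harness could call `is_ill_packaged("path/to/job.jar")` as a precondition, so that a correctly shaded jar does not silently weaken the class loading check.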

      The general purpose job should run with misbehavior activated. Additionally, we should simulate at least the following failure scenarios:

      • Kill Flink processes
      • Kill connection to storage system for checkpoints and jobs
      • Simulate network partition
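
The first scenario in the list, killing Flink processes, could be driven by a small helper like the one below. This is a sketch under stated assumptions rather than the test's actual implementation: it expects `jps`-style text (`<pid> <main class>` per line) as input, the main-class names in `FLINK_MAIN_CLASSES` are assumed and should be checked against the Flink version under test, and `SIGKILL` is only available on Unix-like systems. It does not cover the storage or network partition scenarios.

```python
import os
import random
import signal

# Assumed main-class names of the Flink JVMs in a Yarn session; adjust as needed.
FLINK_MAIN_CLASSES = ("YarnSessionClusterEntrypoint", "YarnTaskExecutorRunner")


def find_flink_pids(jps_output, main_classes=FLINK_MAIN_CLASSES):
    """Return the PIDs of JVMs whose (short or fully qualified) main class matches."""
    pids = []
    for line in jps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip blank or malformed lines
        pid, name = parts
        # Compare on the simple class name so `jps` and `jps -l` output both work.
        if name.strip().split(".")[-1] in main_classes:
            pids.append(int(pid))
    return pids


def kill_random_flink_process(jps_output, rng=random, kill=os.kill):
    """Send SIGKILL to one randomly chosen Flink JVM; return its PID, or None if none found."""
    pids = find_flink_pids(jps_output)
    if not pids:
        return None
    victim = rng.choice(pids)
    kill(victim, signal.SIGKILL)
    return victim
```

The `rng` and `kill` parameters are injectable so that the victim selection can be made reproducible and the helper can be exercised without terminating real processes.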

      We should run the test with at least the following state backend configuration: RocksDB with incremental, asynchronous snapshots, checkpointing to HDFS.
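
The state backend setup described above might be expressed as `flink-conf.yaml` entries generated by the test harness. The sketch below is an illustration only: the helper name and the example checkpoint path are hypothetical, and the configuration keys are the ones I understand Flink to use for this purpose, so they should be verified against the documentation of the version under test. No asynchronous option is set because the RocksDB backend takes its snapshots asynchronously.

```python
def render_flink_conf(checkpoint_dir):
    """Render flink-conf.yaml entries for RocksDB with incremental checkpoints on HDFS."""
    if not checkpoint_dir.startswith("hdfs://"):
        raise ValueError("checkpoint directory should be an hdfs:// URI")
    entries = {
        "state.backend": "rocksdb",
        "state.backend.incremental": "true",
        "state.checkpoints.dir": checkpoint_dir,
    }
    # One "key: value" pair per line, in insertion order.
    return "\n".join(f"{key}: {value}" for key, value in entries.items()) + "\n"
```

For example, `render_flink_conf("hdfs:///flink/checkpoints")` (a hypothetical path) yields three lines that can be appended to the configuration used to start the Yarn session.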

            People

            • Assignee: Gary Yao (gjy)
            • Reporter: Till Rohrmann (trohrmann)