[FLINK-14968] Kerberized YARN on Docker test (custom fs plugin) fails on Travis - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Blocker
Resolution: Duplicate
Affects Version/s: 1.10.0
Fix Version/s: None
Component/s: Runtime / Coordination, Tests
Labels:
- test-stability

Description

The following commit made the test flaky: https://github.com/apache/flink/commit/749965348170e4608ff2a23c9617f67b8c341df5. It changes the job to have two sources instead of one, which under normal circumstances requires more slots than are available, so the job fails.

The setup of this test is intricate. We configure YARN to have two NodeManagers (NMs) with 2500 MB of memory each: https://github.com/apache/flink/blob/413a77157caf25dbbfb8b0caaf2c9e12c7374d98/flink-end-to-end-tests/test-scripts/docker-hadoop-secure-cluster/config/yarn-site.xml#L39. We run the job with parallelism 3 and configure Flink to use 1000 MB of TaskManager (TM) memory and 1000 MB of JobManager memory. This means that the job fits into the YARN memory budget, but additional TaskManagers would not fit. We deliberately do not simply increase the YARN resources, because we want the Flink job to use TMs on different NMs: we had a bug where shipping the Kerberos config file did not work correctly, and the bug did not materialise when all TMs were on the same NM.
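The memory budget described above can be checked with a small sketch. It is only an illustration under stated assumptions, not how YARN actually schedules: it assumes one TaskManager per unit of parallelism, that each container requests exactly the configured 1000 MB (ignoring any rounding or overhead YARN applies), and a simple first-fit placement. The function name `fits` and the constants are hypothetical.

```python
# Values taken from the issue description.
NODE_MANAGERS = 2
NM_MEMORY_MB = 2500
JM_MEMORY_MB = 1000
TM_MEMORY_MB = 1000


def fits(num_task_managers: int) -> bool:
    """Return True if 1 JobManager plus the given number of TaskManagers
    can be placed on the NodeManagers.

    Assumption: first-fit placement and exact container sizes; real YARN
    scheduling and container sizing may differ.
    """
    free = [NM_MEMORY_MB] * NODE_MANAGERS
    requests = [JM_MEMORY_MB] + [TM_MEMORY_MB] * num_task_managers
    for request in requests:
        for i, available in enumerate(free):
            if available >= request:
                free[i] -= request
                break
        else:
            # No NodeManager has enough free memory for this container.
            return False
    return True


print(fits(3))  # True: JM + 3 TMs use 4000 MB across both NMs
print(fits(4))  # False: a fifth 1000 MB container does not fit
```

Under these assumptions each NM holds at most two 1000 MB containers, so three TMs cannot all land on the same NM, which is the property the test relies on.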

https://api.travis-ci.org/v3/job/612782888/log.txt

Attachments

- run-with-3-slots.txt (141 kB, added 28/Nov/19 17:34 by Aljoscha Krettek)
- run-with-4-slots.txt (153 kB, added 28/Nov/19 17:34 by Aljoscha Krettek)

Issue Links

- is a clone of: FLINK-14834 Kerberized YARN on Docker test fails on Travis (Closed)
- is caused by: FLINK-14382 Incorrect handling of FLINK_PLUGINS_DIR on Yarn (Closed)

People

Assignee: Unassigned
Reporter: Gary Yao
Votes: 0
Watchers: 5

Dates

Created: 27/Nov/19 08:18
Updated: 28/Nov/19 17:57
Resolved: 28/Nov/19 17:57