[FLINK-20654] Unaligned checkpoint recovery may lead to corrupted data stream - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Affects Version/s: 1.12.0, 1.12.1
Fix Version/s: 1.13.0, 1.12.3
Component/s: Runtime / Checkpointing
Labels:
- pull-request-available
- test-stability

Description

The fix for ~~FLINK-20433~~ revealed potential data corruption after recovery in all variations of UnalignedCheckpointITCase.

To reproduce, run UnalignedCheckpointITCase (UCITCase) a couple of hundred times. For me, the issue showed up in the following cases, in order of decreasing frequency:

- execute [Parallel union, p = 5]
- execute [Parallel union, p = 10]
- execute [Parallel cogroup, p = 5]
- execute [parallel pipeline with remote channels, p = 5]
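
Because the failure is intermittent, rerunning the test until it fails is easiest to automate. The following is a minimal sketch of such a loop, not part of the Flink code base: `run_until_failure` is a hypothetical helper, and the command that actually runs UnalignedCheckpointITCase in your build is left to the caller, since this issue does not state it.

```python
import subprocess
import sys


def run_until_failure(cmd, max_runs=200):
    """Run `cmd` up to `max_runs` times and stop at the first non-zero exit.

    Returns (attempt, combined_output) for the first failing run,
    or None if every run passed.
    """
    for attempt in range(1, max_runs + 1):
        # Capture output so the log of the failing run can be inspected later.
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            return attempt, result.stdout + result.stderr
        print(f"run {attempt}/{max_runs} passed", file=sys.stderr)
    return None
```

Pass the test command as a list of arguments. A return value of `None` after a couple of hundred runs only means the issue did not show up in that batch, not that it is absent.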

The issue manifests as one of the following:

- stream corrupted exception
- EOF exception
- assertion failure in NUM_LOST or NUM_OUT_OF_ORDER
- (for union) ArithmeticException caused by overflow, because a number that should be in [0;100000] has been mis-deserialized
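
To illustrate how a mis-deserialized number can fall far outside [0;100000], the sketch below reads fixed-width records from a byte stream starting at the wrong offset. It is an assumption-laden toy: the 8-byte big-endian encoding and the names `encode`, `decode`, and `out_of_range` are hypothetical and do not reflect Flink's actual serialization format or the test's code.

```python
import struct

MAX_VALUE = 100_000  # upper bound of the expected range quoted in the description


def encode(values):
    """Serialize each value as an 8-byte big-endian signed integer."""
    return b"".join(struct.pack(">q", v) for v in values)


def decode(buf, offset=0):
    """Read consecutive 8-byte values from `offset`, dropping a trailing partial record."""
    count = (len(buf) - offset) // 8
    return [struct.unpack_from(">q", buf, offset + 8 * i)[0] for i in range(count)]


def out_of_range(values):
    """Return the values that fall outside [0, MAX_VALUE]."""
    return [v for v in values if not 0 <= v <= MAX_VALUE]


stream = encode([42, 70_000, 100_000])
aligned = decode(stream)            # reader starts on a record boundary
shifted = decode(stream, offset=3)  # reader starts 3 bytes too late

print(out_of_range(aligned))  # nothing out of range
print(out_of_range(shifted))  # every value is far above MAX_VALUE
```

In this toy model, a reader that resumes at the wrong position produces values many orders of magnitude too large, which is one plausible way for downstream arithmetic to overflow. The leftover partial record at the end of the shifted read is the kind of condition a stream reader would typically report as an unexpected end of input.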

Issue Links

causes
- FLINK-21104 UnalignedCheckpointITCase.execute failed with "IllegalStateException" (Closed)

is duplicated by
- FLINK-20309 UnalignedCheckpointTestBase.execute is failed (Closed)
- FLINK-20662 UnalignedCheckpointITCase.execute failed with IndexOutOfBoundsException (Closed)
- FLINK-20744 org.apache.flink.test.checkpointing.UnalignedCheckpointITCase fails due to java.lang.ArrayIndexOutOfBoundsException (Closed)

relates to
- FLINK-20960 Add warning in 1.12 release notes about potential corrupt data stream with unaligned checkpoint (Closed)

links to
- GitHub Pull Request #14445
- GitHub Pull Request #14471
- GitHub Pull Request #14509
- GitHub Pull Request #14525
- GitHub Pull Request #14581
- GitHub Pull Request #14582
- GitHub Pull Request #14728
- GitHub Pull Request #14736
- GitHub Pull Request #14797
- GitHub Pull Request #14807
- GitHub Pull Request #14817
- GitHub Pull Request #15252
- GitHub Pull Request #15714

(15 links to)

People

Assignee: Piotr Nowojski
Reporter: Arvid Heise
Votes: 0
Watchers: 16

Dates

Created: 17/Dec/20 13:07
Updated: 22/Jun/21 14:06
Resolved: 22/Apr/21 06:05