FLINK-20433 shows potential corruption after recovery for all variations of UnalignedCheckpointITCase.
To reproduce, run UnalignedCheckpointITCase a couple hundred times. The issue showed up for me in:
- execute [Parallel union, p = 5]
- execute [Parallel union, p = 10]
- execute [Parallel cogroup, p = 5]
- execute [parallel pipeline with remote channels, p = 5]
The cases above are listed in order of decreasing frequency.
The issue manifests as one of the following failures:
- stream corrupted exception
- EOF exception
- assertion failure in NUM_LOST or NUM_OUT_OF_ORDER
- (for union) ArithmeticException overflow (because a value that should be in the range [0;100000] has been mis-deserialized)
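
To illustrate the last symptom, here is a minimal standalone sketch of how a small value can turn into an overflow once the byte stream is read from the wrong offset. It does not use any Flink code: the class name MisalignedReadSketch, its helper methods, and the use of Math.toIntExact are assumptions made for illustration only, and the actual serializer and the actual overflow check in the test may differ.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration, not Flink code: values are written as
// consecutive 8-byte big-endian longs and then read back at a byte offset.
public class MisalignedReadSketch {

    // Serialize the given values back to back, 8 bytes each.
    public static byte[] serialize(long... values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Long.BYTES);
        for (long v : values) {
            buf.putLong(v);
        }
        return buf.array();
    }

    // Read one long starting at an arbitrary byte offset.
    public static long readLongAt(byte[] bytes, int offset) {
        return ByteBuffer.wrap(bytes).getLong(offset);
    }

    // True if narrowing the value to int throws ArithmeticException.
    public static boolean overflowsInt(long value) {
        try {
            Math.toIntExact(value);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(42L, 99_999L);

        // Aligned read: the original value comes back.
        System.out.println("aligned:    " + readLongAt(bytes, 0));

        // Misaligned read: the reader starts 5 bytes into the first record,
        // so bytes of two different records are combined into one value.
        long garbage = readLongAt(bytes, 5);
        System.out.println("misaligned: " + garbage);
        System.out.println("overflows int: " + overflowsInt(garbage));
    }
}
```

Both inputs are within [0;100000], yet the misaligned read yields a value far outside that range, which is consistent with the overflow seen in the union variant. A read that runs past the end of the data would fail outright instead, which matches the EOF symptom.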