Details
-
Bug
-
Status: Closed
-
Critical
-
Resolution: Fixed
-
1.10.0
Description
Travis run https://api.travis-ci.org/v3/job/558897776/log.txt with the race condition.
The race condition consists of the following series of steps:
1) the finishTask of the source puts the POISON_PILL in the mailbox. This will be put as first letter in the queue because we call sendPriorityLetter() in the MailboxProcessor.
2) then in the cancel() of the source (called by the cancelTask()) leads the source out of its run-loop which throws the exception in the test source. The latter, calls again sendPriorityLetter() for the exception, which means that this exception letter may override the previously sent POISON_PILL (because it jumps the line),
3) if the POISON_PILL has already been executed, we are good, if not, then the test harness sets the exception in the StreamTaskTestHarness.TaskThread.error.
To fix that, I would suggest to follow a similar strategy for the root problem https://issues.apache.org/jira/browse/FLINK-13124 as in release-1.9.
Attachments
Issue Links
- links to