[FLINK-9113] Data loss in BucketingSink when writing to local filesystem - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Blocker
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.4.3, 1.5.0
Component/s: Connectors / Common
Labels: None

Description

On local filesystems, data is not guaranteed to be flushed to disk during checkpointing. When writing through org.apache.hadoop.fs.LocalFileSystem, a TaskManager failure can therefore cause data loss. The flush() method returns a written length, but the data has not necessarily been written into the file, so the valid length recorded for the file can be greater than the actual file size. hsync and hflush have no effect either.
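The mismatch between a tracked length and the real file size can be reproduced with plain java.io, as a rough analogy only. The sketch below does not use Hadoop or Flink; the class name ValidLengthMismatch and the method writeWithoutFlushing are hypothetical, and a BufferedOutputStream stands in for the buffering that the local filesystem stream is assumed to perform.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;

// Hypothetical illustration; not Flink or Hadoop code.
final class ValidLengthMismatch {

    /**
     * Writes numBytes into a buffered stream and, while the stream is still
     * open, returns {length tracked by the writer, file size seen on disk}.
     * numBytes must be smaller than the 8192-byte buffer to stay buffered.
     */
    static long[] writeWithoutFlushing(int numBytes) {
        try {
            File file = Files.createTempFile("part-", ".in-progress").toFile();
            file.deleteOnExit();
            long trackedLength = 0;
            try (OutputStream out =
                    new BufferedOutputStream(new FileOutputStream(file), 8192)) {
                out.write(new byte[numBytes]);
                trackedLength += numBytes;
                // The bytes are still in the in-memory buffer at this point.
                // A checkpoint that stored trackedLength now would record
                // more bytes than the file actually contains.
                long sizeOnDisk = file.length();
                return new long[] {trackedLength, sizeOnDisk};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] result = writeWithoutFlushing(100);
        System.out.println("tracked=" + result[0] + " onDisk=" + result[1]);
    }
}
```

If the process died before the buffer was handed to the operating system, a recovery step that trusts the tracked length would expect data that was never written.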

It seems that this behavior won't be fixed in the near future: https://issues.apache.org/jira/browse/HADOOP-7844

One solution would be to call close() on every checkpoint for local filesystems, even though this would decrease performance. If we don't fix this issue, we should at least document it properly.
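The close-on-checkpoint idea can be sketched in plain java.io as follows. This is a minimal illustration under stated assumptions, not the BucketingSink implementation and not necessarily what the linked pull requests do: the class CloseOnCheckpointSketch and its methods write, snapshotState, close and demo are hypothetical names, and a BufferedOutputStream again stands in for the local filesystem stream.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;

// Hypothetical illustration; not Flink or Hadoop code.
final class CloseOnCheckpointSketch {
    private final File file;
    private OutputStream out;
    private long validLength;

    CloseOnCheckpointSketch(File file) {
        this.file = file;
        this.out = open(file);
    }

    private static OutputStream open(File file) {
        try {
            // Append mode, so reopening after a checkpoint keeps earlier data.
            return new BufferedOutputStream(new FileOutputStream(file, true));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    void write(byte[] data) {
        try {
            out.write(data);
            validLength += data.length;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Closes the stream so buffered bytes reach the file, then reopens it. */
    long snapshotState() {
        close();
        out = open(file);
        return validLength;
    }

    void close() {
        try {
            out.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns {valid length at the checkpoint, file size right after it}. */
    static long[] demo(int numBytes) {
        try {
            File file = Files.createTempFile("part-", ".in-progress").toFile();
            file.deleteOnExit();
            CloseOnCheckpointSketch sink = new CloseOnCheckpointSketch(file);
            sink.write(new byte[numBytes]);
            long valid = sink.snapshotState();
            long sizeOnDisk = file.length();
            sink.close();
            return new long[] {valid, sizeOnDisk};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Closing and reopening the file on every checkpoint is the source of the performance cost mentioned above; in exchange, the recorded valid length no longer exceeds the file size at checkpoint time.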

Issue Links

links to

GitHub Pull Request #5811

GitHub Pull Request #5861

People

Assignee: Timo Walther
Reporter: Timo Walther
Votes: 0
Watchers: 4

Dates

Created: 29/Mar/18 16:13
Updated: 30/Jul/18 06:28
Resolved: 19/Apr/18 12:43