Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
The FSDataOutputStream.flush() API is supposed to flush all data to the underlying stream. For LocalFileSystem, however, flush() does not flush the last partial CRC chunk.
One solution is described in HADOOP-2657: change FSOutputStream to implement Seekable, with a default seek() implementation that throws an IOException, and use seek() in ChecksumFileSystem to rewind and overwrite the checksum. Callers would then fail only if they attempt to write more data after flushing on a ChecksumFileSystem that doesn't support seek. I don't think we will have any filesystems that both extend ChecksumFileSystem and can't support seek. Only LocalFileSystem currently extends ChecksumFileSystem, and it does support seek. So flush() shouldn't ever fail for existing FileSystems, but seek() will fail for most output streams (probably all except local).
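The rewind-and-overwrite idea can be illustrated with a small standalone sketch. This is not Hadoop code: SeekableSink, MemorySink, ChecksummedWriter and FlushSketchDemo are hypothetical names, and the 4-byte chunk size and the CRC32 checksum layout are assumptions chosen only to keep the example short. On flush() the writer emits a provisional checksum for the partial chunk; on the next write it seeks back so that the provisional value is overwritten once the chunk completes.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

/** Hypothetical sink; seek() throws by default, as the proposal suggests. */
abstract class SeekableSink {
    abstract void write(byte[] b, int off, int len) throws IOException;
    abstract long getPos();
    void seek(long pos) throws IOException {
        throw new IOException("seek not supported");
    }
}

/** In-memory sink that does support seek, standing in for a local file. */
class MemorySink extends SeekableSink {
    private byte[] buf = new byte[64];
    private int pos = 0;
    private int length = 0;

    @Override void write(byte[] b, int off, int len) {
        if (pos + len > buf.length) {
            buf = Arrays.copyOf(buf, Math.max(buf.length * 2, pos + len));
        }
        System.arraycopy(b, off, buf, pos, len);
        pos += len;
        length = Math.max(length, pos);
    }
    @Override long getPos() { return pos; }
    @Override void seek(long p) throws IOException {
        if (p < 0 || p > length) throw new IOException("seek out of range: " + p);
        pos = (int) p;
    }
    byte[] toByteArray() { return Arrays.copyOf(buf, length); }
}

/** Writes data plus one CRC32 per chunk; flush() also covers the partial chunk. */
class ChecksummedWriter {
    private final SeekableSink data;
    private final SeekableSink sums;
    private final int bytesPerSum;
    private final CRC32 crc = new CRC32();
    private int inChunk = 0;          // bytes accumulated in the current chunk
    private long provisionalPos = -1; // offset of a provisional checksum, or -1

    ChecksummedWriter(SeekableSink data, SeekableSink sums, int bytesPerSum) {
        this.data = data;
        this.sums = sums;
        this.bytesPerSum = bytesPerSum;
    }

    void write(byte[] b) throws IOException {
        for (byte x : b) {
            if (provisionalPos >= 0) {
                // Rewind so the provisional checksum gets overwritten.
                // On a sink without seek support, this is where writing fails.
                sums.seek(provisionalPos);
                provisionalPos = -1;
            }
            data.write(new byte[] {x}, 0, 1);
            crc.update(x);
            if (++inChunk == bytesPerSum) {
                writeSum();
                crc.reset();
                inChunk = 0;
            }
        }
    }

    void flush() throws IOException {
        if (inChunk > 0 && provisionalPos < 0) {
            provisionalPos = sums.getPos();
            writeSum(); // CRC state is kept so the chunk can continue
        }
    }

    private void writeSum() throws IOException {
        byte[] s = ByteBuffer.allocate(4).putInt((int) crc.getValue()).array();
        sums.write(s, 0, 4);
    }
}

class FlushSketchDemo {
    public static void main(String[] args) throws IOException {
        MemorySink data = new MemorySink();
        MemorySink sums = new MemorySink();
        ChecksummedWriter w = new ChecksummedWriter(data, sums, 4);
        w.write(new byte[] {'a', 'b', 'c'});
        w.flush(); // provisional checksum for the partial chunk "abc"
        w.write(new byte[] {'d', 'e'});
        w.flush(); // "abcd" checksum replaced the provisional one; "e" is provisional
        System.out.println(sums.toByteArray().length);
    }
}
```

With a sink that keeps the default seek(), the first flush() still succeeds and only a later write() throws, which matches the failure mode described in the proposal.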
Issue Links
- is related to: HADOOP-2657 Enhancements to DFSClient to support flushing data at any point in time (Closed)