Description
Checksumming is currently built into the base FileSystem class. It should instead be optional: each FileSystem implementation should be able to use the Hadoop-provided checksum system, disable checksumming, or implement its own custom checksum system.
To implement this, a ChecksumFileSystem can be provided that wraps another FileSystem and handles checksums the way Hadoop's current mandatory implementation does (i.e., as a separate crc file per file that is elided from directory listings). The 'raw' FileSystem methods would be removed. FSDataInputStream and FSDataOutputStream would be made interfaces.
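The wrapping approach can be illustrated with a minimal sketch. This is not Hadoop's actual API: `SimpleFs`, `InMemoryFs`, and `ChecksumFs` are hypothetical, pared-down stand-ins, and the sidecar naming (`.<name>.crc`) and the single whole-file CRC32 are assumptions made for illustration only.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Hypothetical stand-in for the FileSystem abstraction (not the real Hadoop class).
interface SimpleFs {
    void write(String path, byte[] data) throws IOException;
    byte[] read(String path) throws IOException;
    List<String> list();
}

// A file system that performs no checksumming of its own.
class InMemoryFs implements SimpleFs {
    private final Map<String, byte[]> files = new TreeMap<>();

    public void write(String path, byte[] data) {
        files.put(path, data.clone());
    }

    public byte[] read(String path) throws IOException {
        byte[] data = files.get(path);
        if (data == null) {
            throw new IOException("No such file: " + path);
        }
        return data.clone();
    }

    public List<String> list() {
        return new ArrayList<>(files.keySet());
    }
}

// Wrapper that adds checksums to any SimpleFs by keeping one crc file per data file.
class ChecksumFs implements SimpleFs {
    private final SimpleFs raw;

    ChecksumFs(SimpleFs raw) {
        this.raw = raw;
    }

    // Assumed sidecar naming: "/dir/name" -> "/dir/.name.crc".
    private static String crcPath(String path) {
        int slash = path.lastIndexOf('/');
        return path.substring(0, slash + 1) + "." + path.substring(slash + 1) + ".crc";
    }

    private static boolean isCrcFile(String path) {
        String name = path.substring(path.lastIndexOf('/') + 1);
        return name.startsWith(".") && name.endsWith(".crc");
    }

    // CRC32 over the whole file, stored as decimal text (an illustrative choice).
    private static byte[] crcOf(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return Long.toString(crc.getValue()).getBytes(StandardCharsets.UTF_8);
    }

    public void write(String path, byte[] data) throws IOException {
        raw.write(path, data);
        raw.write(crcPath(path), crcOf(data));
    }

    public byte[] read(String path) throws IOException {
        byte[] data = raw.read(path);
        byte[] expected = raw.read(crcPath(path));
        if (!Arrays.equals(expected, crcOf(data))) {
            throw new IOException("Checksum error: " + path);
        }
        return data;
    }

    // crc files are elided from listings, so callers only see data files.
    public List<String> list() {
        List<String> visible = new ArrayList<>();
        for (String path : raw.list()) {
            if (!isCrcFile(path)) {
                visible.add(path);
            }
        }
        return visible;
    }
}
```

In this sketch, a file system that wants Hadoop-style checksums is used as `new ChecksumFs(new InMemoryFs())`, one that wants none is used unwrapped, and one with its own scheme simply implements `SimpleFs` directly.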
Attachments
Issue Links
- incorporates: HADOOP-921 tail of file not checked for checksum errors (Closed)
- is related to: HADOOP-746 CRC computation and reading should move into a nested FileSystem (Closed)