Details
-
New Feature
-
Status: Closed
-
Major
-
Resolution: Fixed
-
0.15.1
-
None
-
None
-
Incompatible change, Reviewed
-
Description
Requests to be able to append to files in HDFS have been raised a couple of times on the list of late. For one example, see http://www.nabble.com/HDFS%2C-appending-writes-status-tf3848237.html#a10916193. Other mail describes folks' workarounds for the lack of this feature: e.g. http://www.nabble.com/Loading-data-into-HDFS-tf4200003.html#a12039480 (later in that thread, Jim Kellerman re-raises HBase's need for this feature). HADOOP-337 'DFS files should be appendable' mentions file append, but it was opened early in the life of HDFS, when the focus was on implementing the basics rather than adding new features, and interest fizzled. Because HADOOP-337 is also a bit of a grab-bag (it includes truncation and concurrent read/write), rather than try to breathe new life into it, here is a new issue focused on file append. Ultimately, doing as the Google GFS paper describes, with multiple concurrent clients making 'Atomic Record Append' to a single file, would be sweet, but at least for a first cut at this feature, IMO, a single client appending to a single HDFS file, with the application managing access, would be sufficient.
Attachments
Issue Links
- depends upon
-
HADOOP-3283 Need a mechanism for data nodes to update generation stamps.
- Closed
-
HADOOP-3310 Lease recovery for append
- Closed
- is blocked by
-
HADOOP-2565 DFSPath cache of FileStatus can become stale
- Resolved
-
HADOOP-2655 Copy on write for data and metadata files in the presence of snapshots
- Closed
-
HADOOP-2656 Support for upgrading existing cluster to facilitate appends to HDFS files
- Closed
-
HADOOP-3113 DFSOutputStream.flush() should flush data to real block file on DataNode.
- Closed
-
HADOOP-3176 Change lease record when a open-for-write-file gets renamed
- Closed
-
HADOOP-3503 Race condition when client and namenode start block recovery simultaneously
- Closed
-
HADOOP-3161 TestFileAppend fails on Mac since HADOOP-2655 was committed
- Closed
-
HADOOP-2658 Design and Implement a Test Plan to support appends to HDFS files
- Closed
-
HADOOP-3201 namenode should be able to retrieve block metadata from a datanode
- Closed
-
HADOOP-3250 Extend FileSystem API to allow appending to files
- Closed
-
HADOOP-1707 Remove the DFS Client disk-based cache
- Closed
-
HADOOP-2345 new transactions to support HDFS Appends
- Closed
-
HADOOP-3177 Expose DFSOutputStream.fsync API through the FileSystem interface
- Closed
-
HADOOP-3515 Protocol changes to allow appending to the last partial crc chunk of a file
- Closed
- is depended upon by
-
HADOOP-3790 Add more unit tests to test appending to files in HDFS
- Closed
- is related to
-
HDFS-200 In HDFS, sync() not yet guarantees data available to the new readers
- Closed
-
HADOOP-337 DFS files should be appendable
- Closed
- is superseded by
-
HDFS-265 Revisit append
- Closed
- relates to
-
HADOOP-3834 Checkin the design document for HDFS appends into source control repository
- Resolved
-
HADOOP-89 files are not visible until they are closed
- Closed
-
HADOOP-1497 Possibility of duplicate blockids if dead-datanodes come back up after corresponding files were deleted
- Closed
-
HADOOP-3241 DFSFileInfo should also have field to say if the file is under construction
- Closed
-
HADOOP-3329 DatanodeDescriptor objects stored in FSImage may be outdated.
- Closed
-
HADOOP-3832 Create more unit tests for testing HDFS appends
- Closed
-
HADOOP-2657 Enhancements to DFSClient to support flushing data at any point in time
- Closed
-
HDFS-2253 Create a benchmark to measure performance of "append" to HDFS files
- Resolved