Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
The DFS client currently uses a staging file on local disk to cache all user writes to a file. When the staging file accumulates one block's worth of data, its contents are flushed to an HDFS datanode. These two operations occur sequentially: the user cannot write more data while a block is being uploaded.
A simple optimization is to let the user write to a second staging file while the contents of the first staging file are uploaded to HDFS. Overlapping the local write with the upload will improve file-upload performance.
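The overlap described above can be sketched as a double-buffering scheme. This is only an illustration under stated assumptions, not the actual DFS client code: the class name `DoubleBufferedWriter`, the in-memory buffer standing in for the on-disk staging file, and the `uploader` callback standing in for the transfer to a datanode are all hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

/** Hypothetical sketch: fill one staging buffer while the previous one uploads. */
class DoubleBufferedWriter implements AutoCloseable {
    private final int blockSize;
    // Assumption: stands in for "send one block to an HDFS datanode".
    private final Consumer<byte[]> uploader;
    // A single worker thread keeps blocks uploading in the order they were filled.
    private final ExecutorService pool = Executors.newSingleThreadExecutor();
    // Assumption: an in-memory buffer replaces the staging file on local disk.
    private ByteArrayOutputStream staging = new ByteArrayOutputStream();
    private Future<?> inFlight; // upload of the previous staging buffer, if any

    DoubleBufferedWriter(int blockSize, Consumer<byte[]> uploader) {
        this.blockSize = blockSize;
        this.uploader = uploader;
    }

    /** Appends data, handing off each full block to the background uploader. */
    void write(byte[] data) {
        int off = 0;
        while (off < data.length) {
            int n = Math.min(blockSize - staging.size(), data.length - off);
            staging.write(data, off, n);
            off += n;
            if (staging.size() == blockSize) {
                flushBlock();
            }
        }
    }

    /** Swaps in a fresh staging buffer and uploads the full one asynchronously. */
    private void flushBlock() {
        awaitInFlight(); // allow at most one upload outstanding (two buffers total)
        byte[] full = staging.toByteArray();
        staging = new ByteArrayOutputStream();
        inFlight = pool.submit(() -> uploader.accept(full));
    }

    /** Blocks until the outstanding upload, if any, has finished. */
    private void awaitInFlight() {
        if (inFlight == null) {
            return;
        }
        try {
            inFlight.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for upload", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("upload failed", e.getCause());
        } finally {
            inFlight = null;
        }
    }

    /** Uploads any partial final block and waits for all uploads to finish. */
    @Override
    public void close() {
        try {
            if (staging.size() > 0) {
                flushBlock();
            }
            awaitInFlight();
        } finally {
            pool.shutdown();
        }
    }
}
```

In this sketch the caller only blocks when both buffers are busy, that is, when a second block fills before the first has finished uploading. With a sequential client the caller would block for the full duration of every upload.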
Attachments
Issue Links
- blocks: HADOOP-1700 Append to files in HDFS (Closed)
- depends upon: HADOOP-1927 a datanode in a write pipeline should report an error if the next datanode in the pipeline reported an error (Closed)
- incorporates: HDFS-67 /tmp/hadoop-${user}/dfs/tmp/tmp/client-${long}.tmp is not cleanup correctly (Resolved)
- is related to: HADOOP-445 Parallel data/socket writing for DFSOutputStream (Closed)