
File Attachments:
|
|
|
Issue Links:
| Reference | This issue relates to: | HADOOP-2154 Non-interleaved checksums would optimize block transfers. |
| dependent | This issue depends upon: | HADOOP-2758 Reduce memory copies when data is read from DFS |
| Hadoop Flags: | Reviewed, Incompatible change |
| Release Note: | Reduced buffer copies as data is written to HDFS. The order of sending data bytes and control information has changed, but this will not be observed by client applications. |
| Resolution Date: | 14/May/08 06:35 AM |
HADOOP-1649 adds extra buffering to improve write performance. The diagram reproduced in the change history below marks each buffer with a number in parentheses. Each extra buffer adds an extra copy, since most of our read()/write() calls transfer io.bytes.per.checksum bytes, which is much smaller than the buffer size.
Currently, the loops that read and write block data handle one checksum chunk at a time. By reading multiple chunks at a time, we can remove buffers (1), (2), (3), and (5).
Similarly, some copies can be avoided when clients read data from the DFS.
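The idea of handling several checksum chunks per call can be sketched as follows. This is only an illustration, not the code from the patch: the class `MultiChunkWriter`, the constant `BYTES_PER_CHECKSUM` (a stand-in for io.bytes.per.checksum), and the "all checksums, then all data" layout are assumptions made for the example. The point is that the checksums are computed in place over one large buffer, so the data reaches the stream in a single write() with no intermediate per-chunk buffer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

final class MultiChunkWriter {
    /** Hypothetical stand-in for io.bytes.per.checksum. */
    static final int BYTES_PER_CHECKSUM = 512;

    /**
     * Computes one CRC32 per chunk of data[0..len) without copying the data.
     * Returns the checksums packed as 4-byte big-endian integers.
     */
    static byte[] checksums(byte[] data, int len) {
        int chunks = (len + BYTES_PER_CHECKSUM - 1) / BYTES_PER_CHECKSUM;
        ByteBuffer sums = ByteBuffer.allocate(4 * chunks);
        CRC32 crc = new CRC32();
        for (int off = 0; off < len; off += BYTES_PER_CHECKSUM) {
            int n = Math.min(BYTES_PER_CHECKSUM, len - off); // last chunk may be short
            crc.reset();
            crc.update(data, off, n);
            sums.putInt((int) crc.getValue());
        }
        return sums.array();
    }

    /** Writes all checksums, then all data, using one write() call for each. */
    static void writeChunks(OutputStream out, byte[] data, int len) throws IOException {
        out.write(checksums(data, len));
        out.write(data, 0, len); // written straight from the caller's buffer
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[3 * BYTES_PER_CHECKSUM + 100]; // 3 full chunks + 1 partial
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeChunks(out, data, data.length);
        System.out.println(out.size() + " bytes written");
    }
}
```

A chunk-at-a-time loop would instead issue one small write() per 512-byte chunk, which is what forces a buffered stream (and therefore an extra copy) in front of each socket or file.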
|
|
made changes - 10/Oct/07 09:56 PM
| Field | Original Value | New Value |
| Fix Version/s | 0.15.0 [ 12312565 ] | |
made changes - 13/Feb/08 11:14 PM
| Fix Version/s | | 0.17.0 [ 12312913 ] |
Description
|
HADOOP-1649 adds extra buffering to improve write performance. The following diagram marks each buffer with a number in parentheses. Each extra buffer adds an extra copy, since most of our read()/write() calls transfer io.bytes.per.checksum bytes, which is much smaller than the buffer size.
{noformat}
       (1)                (2)          (3)                 (5)
   +---||----[ CLIENT ]---||----<>-----||---[ DATANODE ]---||--<>-> to Mirror
   | (buffer)                (socket)             |  (4)
   |                                              +--||--+
 =====                                                   |
 =====                                                 =====
 (disk)                                                =====
{noformat}
Currently, the loops that read and write block data handle one checksum chunk at a time. By reading multiple chunks at a time, we can remove buffers (1), (2), (3), and (5).
Similarly, some copies can be avoided when clients read data from the DFS.
|
|
made changes - 25/Mar/08 03:03 AM
| Fix Version/s | 0.17.0 [ 12312913 ] | |
made changes - 10/Apr/08 11:24 PM
| Fix Version/s | | 0.18.0 [ 12312972 ] |
made changes - 08/May/08 07:26 PM
| Hadoop Flags | | [Reviewed] |
| Status | Open [ 1 ] | Patch Available [ 10002 ] |
made changes - 08/May/08 07:33 PM
| Release Note | | Reduce buffer copies when data is written to DFS. DataNode takes 30% less CPU. As a result, the format of data DFSClient sends changed and is incompatible with previous clients. |
| Hadoop Flags | [Reviewed] | [Incompatible change, Reviewed] |
made changes - 09/May/08 09:32 PM
| Status | Patch Available [ 10002 ] | Open [ 1 ] |
made changes - 12/May/08 09:45 PM
| Hadoop Flags | [Reviewed, Incompatible change] | [Incompatible change, Reviewed] |
| Status | Open [ 1 ] | Patch Available [ 10002 ] |
made changes - 14/May/08 06:35 AM
| Hadoop Flags | [Reviewed, Incompatible change] | [Incompatible change, Reviewed] |
| Resolution | | Fixed [ 1 ] |
| Status | Patch Available [ 10002 ] | Resolved [ 5 ] |
made changes - 27/Jun/08 07:54 PM
| Release Note | Reduce buffer copies when data is written to DFS. DataNode takes 30% less CPU. As a result, the format of data DFSClient sends changed and is incompatible with previous clients. | Reduced buffer copies as data is written to HDFS. The order of sending data bytes and control information has changed, but this will not be observed by client applications. |
| Hadoop Flags | [Reviewed, Incompatible change] | [Incompatible change, Reviewed] |
made changes - 22/Aug/08 07:50 PM
| Status | Resolved [ 5 ] | Closed [ 6 ] |
made changes - 08/Jul/09 04:42 PM
| Component/s | dfs [ 12310710 ] | |