Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 0.14.0
- Component/s: None
- Labels: None
- Hadoop Flags: Incompatible change, Reviewed
- Release Note: Reduced buffer copies as data is written to HDFS. The order of sending data bytes and control information has changed, but this will not be observed by client applications.
Description
HADOOP-1649 adds extra buffering to improve write performance. The following diagram shows the buffers, labeled (1) through (5). Each extra buffer adds an extra copy, since most of our read()/write() calls match io.bytes.per.checksum, which is much smaller than the buffer size.
      (1)               (2)            (3)                (5)
  +---||----[ CLIENT ]---||----<>-----||---[ DATANODE ]---||--<>-> to Mirror
  |              (buffer)    (socket)            |  (4)
  |                                           +--||--+
=====                                         |
=====                                       =====  (disk)
                                            =====
Currently, the loops that read and write block data handle one checksum chunk at a time. By reading multiple chunks at a time, we can remove buffers (1), (2), (3), and (5).
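The copy-count difference this proposes can be illustrated with a minimal sketch. This is not Hadoop code; the class, method names, and the 64 KB batch size are hypothetical, and only the 512-byte io.bytes.per.checksum default comes from the issue's context. It counts how many copy operations are needed to move a block's worth of data chunk-at-a-time versus many chunks per copy:

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch (not Hadoop code): contrasts copying one checksum
// chunk at a time with batching many chunks per copy, as this issue proposes.
public class CopyCountDemo {
    static final int BYTES_PER_CHECKSUM = 512;   // io.bytes.per.checksum default
    static final int BATCH_SIZE = 64 * 1024;     // assumed larger stream buffer

    // Copies data one checksum chunk at a time; returns the number of copies.
    static int chunkAtATime(byte[] data, ByteArrayOutputStream out) {
        int copies = 0;
        for (int off = 0; off < data.length; off += BYTES_PER_CHECKSUM) {
            int len = Math.min(BYTES_PER_CHECKSUM, data.length - off);
            out.write(data, off, len);  // one copy per 512-byte chunk
            copies++;
        }
        return copies;
    }

    // Copies as many whole chunks as fit in the batch buffer per call.
    static int multiChunk(byte[] data, ByteArrayOutputStream out) {
        int copies = 0;
        for (int off = 0; off < data.length; off += BATCH_SIZE) {
            int len = Math.min(BATCH_SIZE, data.length - off);
            out.write(data, off, len);  // one copy per 64 KB batch
            copies++;
        }
        return copies;
    }

    public static void main(String[] args) {
        byte[] block = new byte[256 * 1024];
        int small = chunkAtATime(block, new ByteArrayOutputStream());
        int big = multiChunk(block, new ByteArrayOutputStream());
        // 256 KB / 512 B = 512 small copies; 256 KB / 64 KB = 4 batched copies
        System.out.println(small + " copies vs " + big + " copies");
    }
}
```

For 256 KB of data the chunk-at-a-time path performs 512 copy calls where the batched path performs 4, which is why removing the per-chunk buffers (1), (2), (3), and (5) pays off at every hop in the pipeline.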
Similarly, some copies can be reduced when clients read data from the DFS.
Attachments
Issue Links
- depends upon
  - HADOOP-2758 Reduce memory copies when data is read from DFS (Closed)
- relates to
  - HADOOP-2154 Non-interleaved checksums would optimize block transfers. (Resolved)