
Key: HADOOP-2559
Type: Improvement
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Lohit Vijayarenu
Reporter: Runping Qi
Votes: 0
Watchers: 2
Hadoop Common

DFS should place one replica per rack

Created: 09/Jan/08 04:02 PM   Updated: 08/Jul/09 04:42 PM
Component/s: None
Affects Version/s: None
Fix Version/s: 0.17.0

Time Tracking:
Not Specified

File Attachments (text files, all licensed for inclusion in ASF works):
File | Uploaded | Uploader | Size
HADOOP-2559-1-2.patch | 2008-03-13 11:57 PM | Lohit Vijayarenu | 7 kB
HADOOP-2559-1-3.patch | 2008-03-14 12:11 AM | Lohit Vijayarenu | 7 kB
HADOOP-2559-1-4.patch | 2008-03-17 05:58 PM | Lohit Vijayarenu | 7 kB
HADOOP-2559-1.patch | 2008-03-10 06:44 PM | Lohit Vijayarenu | 8 kB
HADOOP-2559-1.patch | 2008-02-21 03:59 PM | Lohit Vijayarenu | 8 kB
HADOOP-2559-2.patch | 2008-02-22 09:54 AM | Lohit Vijayarenu | 11 kB
Image Attachments:
1. Patch1_Block_Report.png.jpg (52 kB)
2. Patch1_Rack_Node_Mapping.jpg (34 kB)
3. Patch2 Block Report.jpg (56 kB)
4. Patch2_Rack_Node_Mapping.jpg (35 kB)
5. Trunk_Block_Report.png (30 kB)
6. Trunk_Rack_Node_Mapping.jpg (33 kB)
Issue Links:
dependent
 

Release Note: Change DFS block placement to allocate the first replica locally, the second off-rack, and the third on the same rack as the second.
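
The placement order in the release note can be sketched as follows. This is a minimal illustration in Python, not the code from the attached patches: the function name `choose_replicas`, the `topology` dictionary (rack name to list of datanode names), and the `rng` parameter are all assumptions made for this sketch. It also assumes at least one remote rack with two or more nodes exists and does not model any fallback behaviour.

```python
import random


def choose_replicas(writer, topology, rng=random):
    """Pick three datanodes: local, off-rack, then same rack as the second.

    `topology` is a hypothetical mapping of rack name -> list of datanode
    names; `writer` is the datanode issuing the write.
    """
    rack_of = {node: rack for rack, nodes in topology.items() for node in nodes}
    local_rack = rack_of[writer]

    # First replica: the writer's own datanode.
    first = writer

    # Second replica: a node on any rack other than the writer's.
    # Racks with fewer than two nodes are skipped so a third replica fits.
    remote_racks = [r for r, nodes in topology.items()
                    if r != local_rack and len(nodes) >= 2]
    second_rack = rng.choice(remote_racks)
    second = rng.choice(topology[second_rack])

    # Third replica: a different node on the same rack as the second.
    third = rng.choice([n for n in topology[second_rack] if n != second])

    return [first, second, third]


if __name__ == "__main__":
    topo = {
        "/rack1": ["dn1", "dn2"],
        "/rack2": ["dn3", "dn4"],
        "/rack3": ["dn5", "dn6"],
    }
    print(choose_replicas("dn1", topo, random.Random(0)))
```

With this order, only one replica of each block stays on the writer's rack, and the other two share a single remote rack.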
Resolution Date: 17/Mar/08 11:54 PM


 Description
Currently, when writing out a block, DFS places one copy on the local data node, one copy on a rack-local node,
and another on a remote node. This leads to a number of undesirable properties:

1. The block is rack-local to two racks instead of three, reducing the advantage of rack-locality-based scheduling by 1/3.

2. The blocks of a file (especially a large file) are unevenly distributed over the nodes: one third will be on the local node, and two thirds on nodes in the same rack. This may make some nodes fill up much faster than others,
increasing the need for rebalancing. Furthermore, it also makes some nodes "hot spots" if those big
files are popular and accessed by many applications.
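
The skew described in point 2 can be checked with a small simulation. This is a sketch under stated assumptions: it models the placement exactly as the description words it (local node, rack-local node, remote node), and the names `old_policy`, `replicas_per_rack`, and the `topology` dictionary are hypothetical, not taken from the Hadoop source.

```python
import random
from collections import Counter


def old_policy(writer, topology, rng):
    """Placement as worded in the description: local, rack-local, remote."""
    rack_of = {n: r for r, nodes in topology.items() for n in nodes}
    local_rack = rack_of[writer]
    # Second copy: another node on the writer's rack.
    rack_local = rng.choice([n for n in topology[local_rack] if n != writer])
    # Third copy: any node on a different rack.
    remote = rng.choice([n for r, nodes in topology.items()
                         if r != local_rack for n in nodes])
    return [writer, rack_local, remote]


def replicas_per_rack(writer, topology, blocks, seed=0):
    """Count replicas per rack after one writer writes `blocks` blocks."""
    rng = random.Random(seed)
    rack_of = {n: r for r, nodes in topology.items() for n in nodes}
    counts = Counter()
    for _ in range(blocks):
        for node in old_policy(writer, topology, rng):
            counts[rack_of[node]] += 1
    return counts


if __name__ == "__main__":
    topo = {
        "/rack1": ["dn1", "dn2", "dn3"],
        "/rack2": ["dn4", "dn5", "dn6"],
        "/rack3": ["dn7", "dn8", "dn9"],
    }
    print(replicas_per_rack("dn1", topo, blocks=300))
```

Whatever the seed, two of every three replicas land on the writer's rack in this model, so a single heavy writer fills its own rack twice as fast as all remote racks combined.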



 Change History
Robert Chansler made changes - 31/Jan/08 11:59 PM
Field Original Value New Value
Assignee lohit vijayarenu [ lohit ]
Lohit Vijayarenu made changes - 21/Feb/08 03:59 PM
Attachment HADOOP-2559-1.patch [ 12376130 ]
Lohit Vijayarenu made changes - 22/Feb/08 09:54 AM
Attachment HADOOP-2559-2.patch [ 12376211 ]
Lohit Vijayarenu made changes - 05/Mar/08 10:56 AM
Attachment Trunk_Block_Report.png [ 12377152 ]
Lohit Vijayarenu made changes - 05/Mar/08 10:59 AM
Attachment Patch1_Block_Report.png.jpg [ 12377153 ]
Lohit Vijayarenu made changes - 05/Mar/08 11:00 AM
Attachment Patch2 Block Report.jpg [ 12377154 ]
Lohit Vijayarenu made changes - 05/Mar/08 11:02 AM
Attachment Trunk_Rack_Node_Mapping.jpg [ 12377155 ]
Lohit Vijayarenu made changes - 05/Mar/08 11:03 AM
Attachment Patch1_Rack_Node_Mapping.jpg [ 12377156 ]
Lohit Vijayarenu made changes - 05/Mar/08 11:04 AM
Attachment Patch2_Rack_Node_Mapping.jpg [ 12377157 ]
Lohit Vijayarenu made changes - 10/Mar/08 06:44 PM
Attachment HADOOP-2559-1.patch [ 12377544 ]
Lohit Vijayarenu made changes - 10/Mar/08 06:45 PM
Status Open [ 1 ] Patch Available [ 10002 ]
Lohit Vijayarenu made changes - 13/Mar/08 11:57 PM
Attachment HADOOP-2559-1-2.patch [ 12377849 ]
Lohit Vijayarenu made changes - 14/Mar/08 12:11 AM
Attachment HADOOP-2559-1-3.patch [ 12377850 ]
dhruba borthakur made changes - 17/Mar/08 06:49 AM
Link This issue is depended upon by HADOOP-2094 [ HADOOP-2094 ]
Lohit Vijayarenu made changes - 17/Mar/08 05:57 PM
Status Patch Available [ 10002 ] Open [ 1 ]
Lohit Vijayarenu made changes - 17/Mar/08 05:58 PM
Attachment HADOOP-2559-1-4.patch [ 12378053 ]
Lohit Vijayarenu made changes - 17/Mar/08 06:05 PM
Status Open [ 1 ] Patch Available [ 10002 ]
Chris Douglas made changes - 17/Mar/08 11:54 PM
Resolution Fixed [ 1 ]
Fix Version/s 0.17.0 [ 12312913 ]
Status Patch Available [ 10002 ] Resolved [ 5 ]
Lohit Vijayarenu made changes - 17/Apr/08 05:10 AM
Description
Release Note Change DFS block placement to allocate the first replica locally, the second off-rack, and the third intra-rack from the second.
Nigel Daley made changes - 21/May/08 08:05 PM
Status Resolved [ 5 ] Closed [ 6 ]
Owen O'Malley made changes - 08/Jul/09 04:42 PM
Component/s dfs [ 12310710 ]