
Key: HADOOP-2559
Type: Improvement
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Lohit Vijayarenu
Reporter: Runping Qi
Votes: 0
Watchers: 2
Hadoop Common

DFS should place one replica per rack

Created: 09/Jan/08 04:02 PM   Updated: 08/Jul/09 04:42 PM
Component/s: None
Affects Version/s: None
Fix Version/s: 0.17.0

Time Tracking:
Not Specified

File Attachments (all text files, licensed for inclusion in ASF works):
File                     Date                  Uploaded by        Size
HADOOP-2559-1-2.patch    2008-03-13 11:57 PM   Lohit Vijayarenu   7 kB
HADOOP-2559-1-3.patch    2008-03-14 12:11 AM   Lohit Vijayarenu   7 kB
HADOOP-2559-1-4.patch    2008-03-17 05:58 PM   Lohit Vijayarenu   7 kB
HADOOP-2559-1.patch      2008-03-10 06:44 PM   Lohit Vijayarenu   8 kB
HADOOP-2559-1.patch      2008-02-21 03:59 PM   Lohit Vijayarenu   8 kB
HADOOP-2559-2.patch      2008-02-22 09:54 AM   Lohit Vijayarenu   11 kB
Image Attachments:

1. Patch1_Block_Report.png.jpg (52 kB)
2. Patch1_Rack_Node_Mapping.jpg (34 kB)
3. Patch2 Block Report.jpg (56 kB)
4. Patch2_Rack_Node_Mapping.jpg (35 kB)
5. Trunk_Block_Report.png (30 kB)
6. Trunk_Rack_Node_Mapping.jpg (33 kB)
Issue Links:
dependent
 

Release Note: Change DFS block placement to allocate the first replica locally, the second off-rack, and the third on the same rack as the second.
Resolution Date: 17/Mar/08 11:54 PM


 Description
Currently, when writing out a block, DFS places one copy on the local data node, one copy on a rack-local node, and another on a remote node. This leads to a number of undesirable properties:

1. The block will be rack-local to two racks instead of three, reducing the advantage of rack-locality-based scheduling by 1/3.

2. The blocks of a file (especially a large file) are unevenly distributed over the nodes: one third will be on the local node, and two thirds on nodes in the same rack. This may make some nodes fill up much faster than others, increasing the need for rebalancing. Furthermore, it also makes some nodes "hot spots" if those big files are popular and accessed by many applications.
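
The imbalance described above can be illustrated with a small simulation. The sketch below is not Hadoop code: the cluster layout, node names, and the functions make_cluster, place_old, place_new, and writer_rack_share are all hypothetical, chosen only to mimic the placement described in this issue (local, rack-local, remote) and the one in the release note (local, off-rack, same rack as the second). It assumes three replicas per block, at least two racks, and at least two nodes per rack.

```python
import random
from collections import Counter


def make_cluster(num_racks, nodes_per_rack):
    """Return a dict mapping node name -> rack name (hypothetical naming)."""
    return {
        f"rack{r}-node{n}": f"rack{r}"
        for r in range(num_racks)
        for n in range(nodes_per_rack)
    }


def place_old(cluster, writer, rng):
    """Placement described in this issue: local node, rack-local node, remote node."""
    local_rack = cluster[writer]
    same_rack = [n for n, r in cluster.items() if r == local_rack and n != writer]
    remote = [n for n, r in cluster.items() if r != local_rack]
    return [writer, rng.choice(same_rack), rng.choice(remote)]


def place_new(cluster, writer, rng):
    """Placement from the release note: local node, off-rack node,
    then a different node on the second replica's rack."""
    local_rack = cluster[writer]
    second = rng.choice([n for n, r in cluster.items() if r != local_rack])
    second_rack = cluster[second]
    third = rng.choice(
        [n for n, r in cluster.items() if r == second_rack and n != second]
    )
    return [writer, second, third]


def writer_rack_share(policy, cluster, writer, num_blocks, seed=0):
    """Fraction of all replicas that land on the writer's own rack."""
    rng = random.Random(seed)
    per_rack = Counter()
    for _ in range(num_blocks):
        for node in policy(cluster, writer, rng):
            per_rack[cluster[node]] += 1
    return per_rack[cluster[writer]] / (3 * num_blocks)


if __name__ == "__main__":
    cluster = make_cluster(num_racks=4, nodes_per_rack=5)
    writer = "rack0-node0"
    print("old policy:", writer_rack_share(place_old, cluster, writer, 100))
    print("new policy:", writer_rack_share(place_new, cluster, writer, 100))
```

In this model the writer's rack receives two of every three replicas under the old placement and one of every three under the release-note placement, whichever remote nodes the random choices pick. Both placements still put each block on exactly two racks.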


