Key: HDFS-325
Type: Improvement
Status: Reopened
Priority: Major
Assignee: dhruba borthakur
Reporter: Runping Qi
Votes: 0
Watchers: 1
Hadoop HDFS

DFS should not use a round-robin policy to determine on which volume (file system partition) to allocate the next block

Created: 23/Oct/07 05:53 PM   Updated: 27/Oct/09 07:30 PM
Component/s: None
Affects Version/s: None
Fix Version/s: None

Time Tracking:
Not Specified

File Attachments:
randomDatanodePartition.patch (text file, 0.7 kB, licensed for inclusion in ASF works), 2008-01-08 11:41 PM, dhruba borthakur
Issue Links:
Reference
dependent
 


Description

When multiple file system partitions are configured for the data storage of a data node,
the data node uses a strict round-robin policy to decide which partition to use for writing the next block.
This can produce anomalous cases in which the blocks of a file are not evenly distributed across
the partitions. For example, when we use distcp to copy files with 4 mappers running concurrently on each node,
those 4 mappers write to DFS at about the same rate, so it is possible that they write out
their blocks in an interleaved order. If 4 file system partitions are configured for the local data node, it is then possible that each mapper
keeps writing its blocks to the same file system partition.

A simple random placement policy would avoid such anomalous cases and has no obvious drawbacks.
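
The scenario described above can be sketched as a small simulation. This is only an illustration under stated assumptions, not the data node's actual volume-selection code: the class and method names (VolumeChoiceSketch, roundRobin, random, volumesUsed) are hypothetical, and the writers are assumed to interleave perfectly, each writing one block per round.

```java
import java.util.Random;

// Hypothetical sketch; not the actual data node implementation.
final class VolumeChoiceSketch {

    // Returns counts[w][v] = number of blocks that writer w placed on volume v.
    // Assumption: writers interleave perfectly, one block each per round.
    static int[][] roundRobin(int writers, int volumes, int blocksPerWriter) {
        int[][] counts = new int[writers][volumes];
        int next = 0; // node-wide round-robin cursor shared by all writers
        for (int b = 0; b < blocksPerWriter; b++) {
            for (int w = 0; w < writers; w++) {
                counts[w][next]++;
                next = (next + 1) % volumes;
            }
        }
        return counts;
    }

    // Same write pattern, but each block goes to a uniformly random volume.
    static int[][] random(int writers, int volumes, int blocksPerWriter, long seed) {
        Random rng = new Random(seed);
        int[][] counts = new int[writers][volumes];
        for (int b = 0; b < blocksPerWriter; b++) {
            for (int w = 0; w < writers; w++) {
                counts[w][rng.nextInt(volumes)]++;
            }
        }
        return counts;
    }

    // Number of distinct volumes that received at least one block.
    static int volumesUsed(int[] perVolume) {
        int used = 0;
        for (int c : perVolume) {
            if (c > 0) {
                used++;
            }
        }
        return used;
    }

    public static void main(String[] args) {
        int[][] rr = roundRobin(4, 4, 100);
        int[][] rnd = random(4, 4, 100, 42L);
        for (int w = 0; w < 4; w++) {
            System.out.println("writer " + w
                    + ": round robin uses " + volumesUsed(rr[w]) + " volume(s), "
                    + "random uses " + volumesUsed(rnd[w]) + " volume(s)");
        }
    }
}
```

With 4 writers and 4 volumes, the round-robin run places every block of writer w on volume w, because the shared cursor always returns to the same position when that writer's turn comes around. The random run is expected to spread each writer's blocks over several volumes regardless of how the writers interleave.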


