[HDFS-3566] Custom Replication Policy for Azure - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1-win
Component/s: namenode
Labels: None
Target Version/s: 1-win
Hadoop Flags: Reviewed

Description

Azure has the logical concepts of fault domains and upgrade domains. Each fault domain spans multiple upgrade domains, and each upgrade domain spans multiple fault domains. Machines are typically spread evenly across both fault and upgrade domains. Fault domain failures are typically catastrophic, unplanned failures, and the possibility of data loss is high. An upgrade domain can be taken down by Azure periodically for maintenance. Each time an upgrade domain is taken down, a small percentage of the machines in it (typically 1-2%) are replaced due to disk failures, losing their data. Assuming the default replication factor of 3, any three datanodes going down at the same time would mean potential data loss. So it is important to have a policy that spreads replicas across both fault and upgrade domains to ensure practically no data loss. The problem here is two-dimensional, while the default policy in Hadoop is one-dimensional. This policy would spread replicas across datanodes in at least two fault domains and three upgrade domains to prevent data loss.
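The two-dimensional constraint described above can be illustrated with a small, self-contained Java sketch. This is not the code from the attached patch: the class `DomainSpreadSketch`, the `Node` type, and the methods `isWellSpread` and `chooseTargets` are hypothetical names, and the greedy selection is only one assumed, best-effort way to reach at least two fault domains and three upgrade domains for a replication factor of 3.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration only; not the implementation in the attached patch. */
final class DomainSpreadSketch {

    /** A datanode tagged with assumed integer ids for its fault and upgrade domain. */
    static final class Node {
        final String name;
        final int faultDomain;
        final int upgradeDomain;

        Node(String name, int faultDomain, int upgradeDomain) {
            this.name = name;
            this.faultDomain = faultDomain;
            this.upgradeDomain = upgradeDomain;
        }
    }

    /** Returns true if the replicas cover at least minFault fault domains and minUpgrade upgrade domains. */
    static boolean isWellSpread(List<Node> replicas, int minFault, int minUpgrade) {
        Set<Integer> faults = new HashSet<>();
        Set<Integer> upgrades = new HashSet<>();
        for (Node n : replicas) {
            faults.add(n.faultDomain);
            upgrades.add(n.upgradeDomain);
        }
        return faults.size() >= minFault && upgrades.size() >= minUpgrade;
    }

    /**
     * Greedy, best-effort selection: each round picks the unchosen candidate that adds
     * the most new domains. A new upgrade domain is weighted higher (2) than a new
     * fault domain (1) because the target needs three upgrade domains but only two
     * fault domains. The result should still be verified with isWellSpread.
     */
    static List<Node> chooseTargets(List<Node> candidates, int replication) {
        List<Node> chosen = new ArrayList<>();
        Set<Integer> faults = new HashSet<>();
        Set<Integer> upgrades = new HashSet<>();
        while (chosen.size() < replication) {
            Node best = null;
            int bestScore = -1;
            for (Node n : candidates) {
                if (chosen.contains(n)) {
                    continue; // never place two replicas on the same node
                }
                int score = (upgrades.contains(n.upgradeDomain) ? 0 : 2)
                          + (faults.contains(n.faultDomain) ? 0 : 1);
                if (score > bestScore) {
                    best = n;
                    bestScore = score;
                }
            }
            if (best == null) {
                break; // fewer candidates than requested replicas
            }
            chosen.add(best);
            faults.add(best.faultDomain);
            upgrades.add(best.upgradeDomain);
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Assumed toy cluster: 2 fault domains x 3 upgrade domains, one node per cell.
        List<Node> cluster = new ArrayList<>();
        for (int fd = 0; fd < 2; fd++) {
            for (int ud = 0; ud < 3; ud++) {
                cluster.add(new Node("dn-fd" + fd + "-ud" + ud, fd, ud));
            }
        }
        List<Node> targets = chooseTargets(cluster, 3);
        for (Node n : targets) {
            System.out.println(n.name);
        }
        System.out.println("well spread: " + isWellSpread(targets, 2, 3));
    }
}
```

A one-dimensional check, such as counting only distinct fault domains, would accept three replicas that all sit in the same upgrade domain. Checking both sets is what makes the policy two-dimensional.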

Attachments


- azurepolicy-branch-1-win.patch, 56 kB, 14/Sep/12 07:04, Sumadhur Reddy Bolli
- AzureBlockPlacementPolicy.pdf, 49 kB, 14/Sep/12 07:12, Sumadhur Reddy Bolli

Issue Links

is blocked by

HDFS-3564 Design enhancements to the pluggable blockplacementpolicy (Resolved)

is related to

HADOOP-8079 Proposal for enhancements to Hadoop for Windows Server and Windows Azure development and runtime environments (Resolved)

HDFS-7541 Upgrade Domains in HDFS (Resolved)

People

Assignee: Sumadhur Reddy Bolli

Reporter: Sumadhur Reddy Bolli

Votes: 0

Watchers: 5

Dates

Created: 26/Jun/12 01:20

Updated: 03/Mar/15 19:47

Resolved: 21/Sep/12 02:51