Hadoop Map/Reduce / MAPREDUCE-2275

RaidNode should monitor and fix blocks that violate RAID block placement



    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: contrib/raid
    • Labels:


      When files are RAIDed, it is important to keep the blocks in each RAID stripe, together with the corresponding parity blocks, on as many different machines as possible. This minimizes the probability of data loss when data nodes fail.

      BlockPlacementPolicyRaid ensures that parity blocks are not located on the same machines as the source blocks. However, the placement of source blocks is not controlled directly in this way. Instead, source blocks are created using the default policy. After a source file is RAIDed, its replication is increased and then decreased, and BlockPlacementPolicyRaid tries to keep the source blocks well placed when the excess replicas are deleted. This does not guarantee correct block placement for RAID.
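
      The idea of keeping source blocks well placed while excess replicas are deleted can be illustrated with a small heuristic. This is only a sketch under stated assumptions, not the actual BlockPlacementPolicyRaid logic: the class ExcessReplicaChooser, its method, and the plain host-name inputs are all hypothetical. It picks the replica on the host that already holds the most other blocks of the same stripe.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical illustration; not part of contrib/raid. */
final class ExcessReplicaChooser {

    /**
     * @param replicaHosts      hosts currently holding a replica of the
     *                          over-replicated block
     * @param companionsPerHost for each host, the number of OTHER blocks from
     *                          the same stripe (source or parity) it holds;
     *                          hosts missing from the map count as zero
     * @return the host whose replica should be deleted: the one with the most
     *         companion blocks (the first such host in list order on a tie)
     */
    static String chooseReplicaToDelete(List<String> replicaHosts,
                                        Map<String, Integer> companionsPerHost) {
        if (replicaHosts.isEmpty()) {
            throw new IllegalArgumentException("no replicas to choose from");
        }
        String worst = replicaHosts.get(0);
        int worstCount = companionsPerHost.getOrDefault(worst, 0);
        for (String host : replicaHosts) {
            int count = companionsPerHost.getOrDefault(host, 0);
            if (count > worstCount) {
                worst = host;
                worstCount = count;
            }
        }
        return worst;
    }
}
```

      Because the choice is limited to hosts that already hold a replica, a heuristic like this can only reduce crowding, not eliminate it, which is consistent with the point that correct placement is not guaranteed.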

      Also, if the balancer moves blocks around, the block placement can be violated.

      We need periodic monitoring of the block placement of RAIDed files and their corresponding parity blocks.
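
      A periodic check of this kind could, for each stripe, count how many of the stripe's blocks (source and parity) each host holds and report hosts holding more than one. The following is a minimal sketch, assuming block locations are already available as lists of host names; the class StripePlacementCheck and its method are hypothetical and do not reflect the attached patch.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration; not part of contrib/raid. */
final class StripePlacementCheck {

    /**
     * @param blockLocations for each block of one stripe (source blocks plus
     *                       its parity blocks), the hosts holding a replica
     * @return hosts that hold replicas of more than one block of the stripe,
     *         mapped to the number of distinct stripe blocks they hold
     */
    static Map<String, Integer> findCrowdedHosts(List<List<String>> blockLocations) {
        Map<String, Integer> blocksPerHost = new HashMap<>();
        for (List<String> hosts : blockLocations) {
            // Count each host at most once per block, even if listed twice.
            for (String host : new HashSet<>(hosts)) {
                blocksPerHost.merge(host, 1, Integer::sum);
            }
        }
        // Keep only hosts that violate the "one stripe block per host" goal.
        Map<String, Integer> crowded = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : blocksPerHost.entrySet()) {
            if (entry.getValue() > 1) {
                crowded.put(entry.getKey(), entry.getValue());
            }
        }
        return crowded;
    }
}
```

      An empty result means every block of the stripe sits on a distinct host. A non-empty result identifies the hosts from which a block would need to be moved to restore the placement.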


        1. MAPREDUCE-2275.txt
          48 kB
          Scott Chen



            • Assignee:
              rvadali Ramkumar Vadali

