Hadoop Map/Reduce / MAPREDUCE-2275

RaidNode should monitor and fix blocks that violate RAID block placement

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Component/s: contrib/raid

    Description

      When files are RAIDed, it is important to keep the blocks in each RAID stripe and the corresponding parity blocks on as many different machines as possible. This minimizes the probability of data loss when data nodes fail.

      BlockPlacementPolicyRaid ensures that parity blocks are not located on the same machines as the source blocks. Source block placement, however, is not controlled directly in this way. Instead, source blocks are created using the default policy. After a source file is RAIDed, its replication is increased and then decreased, and BlockPlacementPolicyRaid tries to keep the source blocks well placed when the excess replicas are deleted. This approach does not guarantee correct block placement for RAID.

      In addition, if the balancer moves blocks around, the RAID block placement can be violated after the fact.

      We need periodic monitoring of block placement of RAIDed files and the corresponding parity blocks.
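
The check that such a monitor would run on each stripe can be sketched as follows. This is only an illustration of the idea, not the code in the attached patch: the class name `StripePlacementCheck`, the method `coLocatedHosts`, and the representation of a stripe as a list of host lists (one list per source or parity block) are all assumptions made for this example. It reports every host that holds replicas of more than one block from the same stripe.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of contrib/raid.
class StripePlacementCheck {

    /**
     * @param blockLocations one entry per block in the stripe (source blocks
     *                       and parity blocks alike); each entry lists the
     *                       hosts that hold a replica of that block
     * @return hosts that hold more than one distinct block of the stripe,
     *         mapped to the number of stripe blocks they hold
     */
    static Map<String, Integer> coLocatedHosts(List<List<String>> blockLocations) {
        Map<String, Integer> blocksPerHost = new LinkedHashMap<>();
        for (List<String> hosts : blockLocations) {
            // De-duplicate so a host counts at most once per block.
            for (String host : new HashSet<>(hosts)) {
                blocksPerHost.merge(host, 1, Integer::sum);
            }
        }

        Map<String, Integer> violations = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : blocksPerHost.entrySet()) {
            if (entry.getValue() > 1) {
                violations.put(entry.getKey(), entry.getValue());
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        // Three source blocks and one parity block; "dn1" holds two of them.
        List<List<String>> stripe = Arrays.asList(
                Arrays.asList("dn1"),
                Arrays.asList("dn2"),
                Arrays.asList("dn1"),
                Arrays.asList("dn3"));
        System.out.println(coLocatedHosts(stripe));
    }
}
```

A periodic monitor could run a check like this over every stripe of every RAIDed file and schedule block moves for the hosts it reports. How the locations are obtained and how the moves are issued is outside the scope of this sketch.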

      Attachments

        1. MAPREDUCE-2275.txt
          48 kB
          Scott Chen

          People

            rvadali Ramkumar Vadali
            Votes: 0
            Watchers: 5
