Hadoop Map/Reduce / MAPREDUCE-1819

RaidNode should be smarter in submitting Raid jobs


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.1
    • Fix Version/s: 0.22.0
    • Component/s: contrib/raid
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      The RaidNode currently computes parity files as follows:
      1. Using RaidNode.selectFiles() to figure out what files to raid for a policy
      2. Using #1 repeatedly for each configured policy to accumulate a list of files.
      3. Submitting a mapreduce job with the list of files from #2 using DistRaid.doDistRaid()

      This task addresses the fact that #2 and #3 happen sequentially: no job is submitted until the file lists for all policies have been accumulated. The proposal is to submit a separate MapReduce job for each policy's list of files and use another thread to track the progress of the submitted jobs. This will help reduce the time taken for files to be raided.
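      The proposed scheme can be sketched with plain java.util.concurrent primitives. This is an illustrative sketch only: Policy, RaidJob, and selectFiles are hypothetical stand-ins, not the actual contrib/raid classes, and job submission is modeled with an ExecutorService rather than the real JobClient.

```java
import java.util.*;
import java.util.concurrent.*;

// Sketch: one job per policy, submitted as soon as its file list is ready,
// with a tracker that waits on all submitted jobs afterwards.
public class PerPolicySubmitter {
    static class Policy {
        final String name;
        Policy(String n) { name = n; }
    }

    // Stand-in for one DistRaid job over a single policy's files.
    static class RaidJob implements Callable<String> {
        final Policy policy;
        final List<String> files;
        RaidJob(Policy p, List<String> f) { policy = p; files = f; }
        public String call() { return policy.name + ":" + files.size(); }
    }

    // Stand-in for RaidNode.selectFiles(policy).
    static List<String> selectFiles(Policy p) {
        return Arrays.asList(p.name + "/a", p.name + "/b");
    }

    public static List<String> run(List<Policy> policies) throws Exception {
        ExecutorService pool = Executors.newCachedThreadPool();
        List<Future<String>> jobs = new ArrayList<>();
        for (Policy p : policies) {
            // Submit a separate job per policy instead of accumulating
            // one global file list across all policies first.
            jobs.add(pool.submit(new RaidJob(p, selectFiles(p))));
        }
        // Tracker: collect results of all in-flight jobs.
        List<String> results = new ArrayList<>();
        for (Future<String> f : jobs) {
            results.add(f.get());
        }
        pool.shutdown();
        return results;
    }
}
```

      The key difference from the current flow is that selectFiles for policy N+1 no longer blocks on the job for policy N being built; each policy's work enters the cluster independently.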

      Attachments

        1. MAPREDUCE-1819.4.patch
          155 kB
          Ramkumar Vadali
        2. MAPREDUCE-1819.5.patch
          155 kB
          Ramkumar Vadali
        3. MAPREDUCE-1819.patch
          154 kB
          Ramkumar Vadali
        4. MAPREDUCE-1819.patch.2
          153 kB
          Ramkumar Vadali
        5. MAPREDUCE-1819.patch.3
          155 kB
          Ramkumar Vadali

        Activity


          People

            Assignee: rvadali (Ramkumar Vadali)
            Reporter: rvadali (Ramkumar Vadali)
            Votes: 1
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved:
