Hadoop Map/Reduce
MAPREDUCE-1819: RaidNode should be smarter in submitting Raid jobs

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.1
    • Fix Version/s: 0.22.0
    • Component/s: contrib/raid
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

    The RaidNode currently computes parity files as follows:
    1. Use RaidNode.selectFiles() to determine which files to raid for a policy.
    2. Repeat #1 for each configured policy, accumulating a single list of files.
    3. Submit one MapReduce job for the accumulated list using DistRaid.doDistRaid().

    This task addresses the fact that #2 and #3 happen sequentially. The proposal is to submit a separate MapReduce job for each policy's list of files and to use another thread to track the progress of the submitted jobs. This will help reduce the time taken for files to be raided.
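
    The shape of the proposed change is roughly the following. This is a minimal sketch only: RaidPolicy, RaidJob, selectFiles(), submitRaidJob() and isComplete() are hypothetical stand-ins for the real PolicyInfo/DistRaid classes in contrib/raid, not the API of the attached patch.

    import java.util.List;
    import java.util.Queue;
    import java.util.concurrent.ConcurrentLinkedQueue;

    /** Sketch of per-policy job submission with asynchronous progress tracking. */
    public class RaidJobSubmitter {

      /** Stand-in for a submitted raid MapReduce job (e.g. a DistRaid instance). */
      public interface RaidJob {
        boolean isComplete();    // would poll the JobTracker for job state
      }

      /** Stand-in for a configured raid policy (e.g. a PolicyInfo instance). */
      public interface RaidPolicy {
        List<String> selectFiles();                 // step 1: files to raid for this policy
        RaidJob submitRaidJob(List<String> files);  // submit a job without waiting for it
      }

      private final Queue<RaidJob> running = new ConcurrentLinkedQueue<RaidJob>();

      /** Submit one MapReduce job per policy instead of one combined job for all policies. */
      public void submitPerPolicy(List<RaidPolicy> policies) {
        for (RaidPolicy policy : policies) {
          List<String> files = policy.selectFiles();
          if (!files.isEmpty()) {
            running.add(policy.submitRaidJob(files));
          }
        }
      }

      /** Meant to run in a separate thread: poll submitted jobs until all have finished. */
      public void trackJobs(long pollMillis) throws InterruptedException {
        while (!running.isEmpty()) {
          for (RaidJob job : running) {
            if (job.isComplete()) {
              running.remove(job);   // removal during iteration is safe for this queue
            }
          }
          Thread.sleep(pollMillis);
        }
      }
    }

    The submitPerPolicy() call returns as soon as the jobs are submitted, so a slow or large policy no longer delays the others; trackJobs() can then be run from a dedicated thread to retire completed jobs.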

    Attachments

    1. MAPREDUCE-1819.patch   (154 kB, Ramkumar Vadali)
    2. MAPREDUCE-1819.patch.2 (153 kB, Ramkumar Vadali)
    3. MAPREDUCE-1819.patch.3 (155 kB, Ramkumar Vadali)
    4. MAPREDUCE-1819.4.patch (155 kB, Ramkumar Vadali)
    5. MAPREDUCE-1819.5.patch (155 kB, Ramkumar Vadali)

    People

    • Assignee: Ramkumar Vadali (rvadali)
    • Reporter: Ramkumar Vadali (rvadali)
    • Votes: 1
    • Watchers: 2
