  1. Hadoop Map/Reduce
  2. MAPREDUCE-6877

Assign map task preferentially to the data node where the split is on faster storage type


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      Using SSDs in HDFS improves read/write performance. However, SSDs cost more than HDDs, so HDFS offers the One_SSD storage policy as a tradeoff between performance and cost: one replica of each block is stored on SSD and the others on HDD. This raises the question of whether applications will actually read the replica on SSD. If they do not read it preferentially, the benefit of the SSD is not fully realized. MapReduce currently assigns map tasks based on data locality alone. The storage types of all replicas of each split should also be taken into account, so that a map task is assigned preferentially to a node where its split resides on a faster storage type.
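
      The proposal could be sketched roughly as follows. This is only an illustration of the idea, not the MapReduce scheduler or any Hadoop API: the class StorageAwareLocality, the Tier enum, the Replica type, and the methods preferredHosts and pickHost are all hypothetical names invented here. The sketch assumes each split knows the host and storage tier of every replica, and that the scheduler knows which hosts currently have a free slot.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these types exist in Hadoop.
class StorageAwareLocality {

    // Assumed storage tiers, declared fastest first so that
    // the enum's natural order doubles as the preference order.
    enum Tier { RAM_DISK, SSD, DISK, ARCHIVE }

    // One replica of a split: the host that stores it and the tier it sits on.
    static final class Replica {
        final String host;
        final Tier tier;

        Replica(String host, Tier tier) {
            this.host = host;
            this.tier = tier;
        }
    }

    // Returns the replica hosts ordered from fastest to slowest tier.
    // List.sort is stable, so hosts on the same tier keep their input order.
    static List<String> preferredHosts(List<Replica> replicas) {
        List<Replica> sorted = new ArrayList<>(replicas);
        sorted.sort(Comparator.comparing((Replica r) -> r.tier));
        List<String> hosts = new ArrayList<>();
        for (Replica r : sorted) {
            hosts.add(r.host);
        }
        return hosts;
    }

    // Picks the free host whose replica is on the fastest tier.
    // Returns null when no replica host is free, in which case a real
    // scheduler would fall back to its usual rack-local or off-rack choice.
    static String pickHost(List<Replica> replicas, Set<String> freeHosts) {
        for (String host : preferredHosts(replicas)) {
            if (freeHosts.contains(host)) {
                return host;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // A split stored under a One_SSD-like layout: one SSD replica, two on disk.
        List<Replica> replicas = Arrays.asList(
                new Replica("node1", Tier.DISK),
                new Replica("node2", Tier.SSD),
                new Replica("node3", Tier.DISK));
        Set<String> freeHosts = new HashSet<>(Arrays.asList("node1", "node2"));
        System.out.println(pickHost(replicas, freeHosts));
    }
}
```

      With locality alone, node1 and node2 would be equally good choices for this split. Ranking by tier first breaks the tie in favor of the SSD replica, while still falling back to a disk replica when the SSD host has no free slot.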

          People

            Assignee: Tim Yao (timmyyao)
            Reporter: Tim Yao (timmyyao)
            Votes: 0
            Watchers: 5

            Dates

              Created:
              Updated: