Hadoop Map/Reduce / MAPREDUCE-2636

Scheduling over disks horizontally


Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: job submission
    • Labels: None

    Description

      Based on this message: http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-user/201106.mbox/browser

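      The JobTracker (JT) schedules tasks on nodes based on the block-location metadata it gets from the NameNode (NN). The NameNode knows which nodes hold a block, but not which disk on a node the block resides on. As a result, on a node running 4 tasks, all of them might read from the same disk. This can degrade performance because the tasks contend for that one disk.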

      An optimization might be to schedule horizontally over disks instead of only over nodes, so that tasks running concurrently on a node read from different disks where possible. Any ideas?
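
      To make the idea concrete, here is a minimal sketch of a disk-aware selection step. It is not Hadoop code and not a proposed patch: it assumes the scheduler could somehow learn a disk identifier for each node-local block, which the description above says the NameNode does not currently provide. The function name `pick_tasks_across_disks`, the `(task_id, disk_id)` pairs, and the `busy_disks` parameter are all hypothetical.

```python
from collections import Counter, defaultdict


def pick_tasks_across_disks(candidates, free_slots, busy_disks=()):
    """Pick up to `free_slots` tasks for one node, spreading reads over disks.

    candidates: list of (task_id, disk_id) pairs whose input block is local
                to this node (disk_id is hypothetical metadata).
    busy_disks: disk ids already being read by tasks running on the node.
    """
    # Current number of readers per disk, seeded with the running tasks.
    load = Counter(busy_disks)

    # Group pending tasks by the disk that holds their input block.
    by_disk = defaultdict(list)
    for task_id, disk_id in candidates:
        by_disk[disk_id].append(task_id)

    chosen = []
    while len(chosen) < free_slots and by_disk:
        # Greedy step: take a task from the least-loaded disk
        # (ties broken by disk id so the result is deterministic).
        disk = min(by_disk, key=lambda d: (load[d], d))
        chosen.append(by_disk[disk].pop(0))
        load[disk] += 1
        if not by_disk[disk]:
            del by_disk[disk]
    return chosen


pending = [("t1", "d0"), ("t2", "d0"), ("t3", "d0"),
           ("t4", "d1"), ("t5", "d2"), ("t6", "d3")]
print(pick_tasks_across_disks(pending, free_slots=4))
```

      A scheduler that ignores disks could pick t1, t2 and t3 together, all reading from d0. The greedy choice above picks one task per disk first and only doubles up on a disk when no less-loaded disk has pending work.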


          People

            Assignee: Unassigned
            Reporter: Evert Lammerts (evertlammerts)
            Votes: 0
            Watchers: 9
