Hadoop Map/Reduce
MAPREDUCE-1764

FairScheduler locality delay may put heavy pressure on Jobtracker

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.22.0
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      The FairScheduler locality delay feature holds back the scheduling of a job's tasks until they can be placed with good locality.
      This greatly improves task locality and reduces the cost of network traffic.

      We have observed the following problem with the FairScheduler locality delay:
      Some of our machines hold only older data, and some newly added machines do not yet hold any important data.
      When these machines send a heartbeat, the JT scans tasks to find a job with the right locality.
      Often, for these machines the JT scans all of the tasks of all the jobs and still assigns nothing.
      Scanning all the tasks on the JT is very costly and makes the JT very slow.
      These machines also often do not get scheduled, which hurts cluster utilization.
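To make the reported cost concrete, here is a minimal sketch of a heartbeat-time linear scan. It is a simplified model, not the FairScheduler or JobTracker code: `ScanCostSketch`, `PendingTask`, `ScanResult`, and the host names are all hypothetical. It only illustrates that a host with no local data forces the scan to touch every pending task before giving up.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model; none of these are Hadoop classes.
final class ScanCostSketch {
    /** A pending task and the hosts that store its input split. */
    static final class PendingTask {
        final String id;
        final Set<String> splitHosts;

        PendingTask(String id, String... hosts) {
            this.id = id;
            this.splitHosts = new HashSet<>(Arrays.asList(hosts));
        }
    }

    /** Outcome of one heartbeat: the task found (or null) and how many tasks were examined. */
    static final class ScanResult {
        final PendingTask task;
        final int examined;

        ScanResult(PendingTask task, int examined) {
            this.task = task;
            this.examined = examined;
        }
    }

    /** Walks every pending task of every job until one has input on the given host. */
    static ScanResult scanForLocalTask(List<List<PendingTask>> jobs, String host) {
        int examined = 0;
        for (List<PendingTask> job : jobs) {
            for (PendingTask t : job) {
                examined++;
                if (t.splitHosts.contains(host)) {
                    return new ScanResult(t, examined);
                }
            }
        }
        // No local task: the whole pool was scanned for nothing.
        return new ScanResult(null, examined);
    }

    /** Two small jobs whose data lives only on the "old" hosts (made-up names). */
    static List<List<PendingTask>> exampleJobs() {
        return Arrays.asList(
            Arrays.asList(new PendingTask("j1_t1", "old1"), new PendingTask("j1_t2", "old2")),
            Arrays.asList(new PendingTask("j2_t1", "old1", "old2")));
    }
}
```

In this model a heartbeat from a host that holds none of the input examines all pending tasks and returns nothing, and the same work is repeated on its next heartbeat.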

      Any ideas?

        Activity

        Nigel Daley made changes -
        Fix Version/s 0.22.0 [ 12314184 ]
        Joydeep Sen Sarma added a comment -

        it seems better to find out why the index is not helping (assuming it's actually being used) rather than adding another cache on top ..

        Scott Chen added a comment -

        Joydeep:

        Matei and I had some discussion and we have also looked at the code.
        In JobInProgress, such a HashMap of node->[tasks] and rack->[tasks] already exists.
        It is not clear to me why this is so slow.

        I agree with your point that we should not leave the slots idle, especially when the cluster is full of jobs.
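For readers unfamiliar with the kind of index being discussed, the following is a rough sketch of a node->[tasks] and rack->[tasks] lookup. It is not the JobInProgress implementation; `LocalityIndexSketch` and its methods are hypothetical names, and tasks are reduced to plain string IDs. With such an index, a heartbeat from a host with no local data should return after a map lookup rather than a scan, which is why the reported slowness is surprising.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a locality index; not the JobInProgress data structure.
final class LocalityIndexSketch {
    private final Map<String, Deque<String>> byNode = new HashMap<>();
    private final Map<String, Deque<String>> byRack = new HashMap<>();
    private final Set<String> pending = new HashSet<>();

    /** Registers one replica location of a task; call once per replica host. */
    void addTask(String taskId, String node, String rack) {
        pending.add(taskId);
        byNode.computeIfAbsent(node, k -> new ArrayDeque<>()).add(taskId);
        byRack.computeIfAbsent(rack, k -> new ArrayDeque<>()).add(taskId);
    }

    /** Returns a node-local task if any, else a rack-local one, else null. */
    String poll(String node, String rack) {
        String t = pollPending(byNode.get(node));
        if (t == null) {
            t = pollPending(byRack.get(rack));
        }
        return t;
    }

    private String pollPending(Deque<String> queue) {
        if (queue == null) {
            return null; // nothing indexed for this node or rack
        }
        String t;
        while ((t = queue.poll()) != null) {
            // A task is queued once per replica, so skip entries already handed out.
            if (pending.remove(t)) {
                return t;
            }
        }
        return null;
    }
}
```

Stale entries are discarded lazily, so each queued entry is inspected at most once over the lifetime of the index.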

        Joydeep Sen Sarma added a comment -

        how expensive (memory wise) would it be to add indices from node->[tasks] and rack->[tasks] based on split information?

        leaving slots idle seems like a real bummer. it would seem better to be greedy and always grab something (especially if the fraction of non-local tasks is within tolerable limits)
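One possible reading of the "be greedy within tolerable limits" idea is sketched below. This is an illustration only, not a proposed patch: `GreedyFallbackSketch`, `maxNonLocalFraction`, and the accounting scheme are assumptions made for the example. An idle slot with no local task is allowed to take a non-local one as long as the running fraction of non-local launches stays under a configured cap.

```java
// Hypothetical sketch; the class, the cap, and the accounting are assumptions.
final class GreedyFallbackSketch {
    private final double maxNonLocalFraction;
    private int launched = 0;
    private int nonLocal = 0;

    GreedyFallbackSketch(double maxNonLocalFraction) {
        this.maxNonLocalFraction = maxNonLocalFraction;
    }

    /** True if accepting one more non-local task keeps the non-local fraction within the cap. */
    boolean mayLaunchNonLocal() {
        double fractionIfAccepted = (nonLocal + 1) / (double) (launched + 1);
        return fractionIfAccepted <= maxNonLocalFraction;
    }

    /** Records a launched task and whether it ran with local input. */
    void recordLaunch(boolean local) {
        launched++;
        if (!local) {
            nonLocal++;
        }
    }
}
```

With a cap of 0.25, for example, a job that has launched three local tasks may take one non-local task, after which further non-local launches are refused until more local tasks have run.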

        Scott Chen added a comment -

        One option is to cache the search result for each TT, so that next time we directly skip a TT that has no task at the allowed locality level.
        What do you think?
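A cache of this kind could look roughly like the sketch below. It is a guess at the shape of the idea, not code from any patch: `NoLocalWorkCacheSketch`, the version counter, and the method names are hypothetical. The main design question is invalidation, so the sketch ties each cached miss to a version of the pending-task pool and treats it as stale once the pool changes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical negative cache keyed by TaskTracker name; not Hadoop code.
final class NoLocalWorkCacheSketch {
    private final Map<String, Long> missAtVersion = new HashMap<>();
    private long poolVersion = 0;

    /**
     * Assumed to be called whenever the answer could change, e.g. a job is added,
     * a task is requeued, or a job's allowed locality level is relaxed.
     */
    void onPendingTasksChanged() {
        poolVersion++;
    }

    /** True if this tracker already missed against the current version of the pool. */
    boolean canSkipScan(String tracker) {
        Long version = missAtVersion.get(tracker);
        return version != null && version == poolVersion;
    }

    /** Remembers that a full search for this tracker found nothing. */
    void recordMiss(String tracker) {
        missAtVersion.put(tracker, poolVersion);
    }
}
```

On heartbeat the scheduler would check `canSkipScan` first, run the search only on a cache miss, and call `recordMiss` if the search comes back empty. A single global version is coarse; it trades some redundant rescans for very simple invalidation.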

        Scott Chen made changes -
        Assignee Scott Chen [ schen ] Dmytro Molkov [ dms ]
        Scott Chen made changes -
        Field Original Value New Value
        Fix Version/s 0.22.0 [ 12314184 ]
        Affects Version/s 0.22.0 [ 12314184 ]
        Scott Chen created issue -

          People

          • Assignee: Dmytro Molkov
          • Reporter: Scott Chen
          • Votes: 0
          • Watchers: 7
