Hadoop Map/Reduce
MAPREDUCE-1184

mapred.reduce.slowstart.completed.maps is too low by default

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.20.1, 0.20.2
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      By default, this value is set to 5%. I believe for most real-world situations the code isn't efficient enough to be set this low. This should be higher, probably around the 50% mark, especially given the predominance of non-FIFO schedulers.
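      For reference, the change under discussion amounts to shipping a different value for this property in mapred-site.xml. A sketch of the 0.20-era setting (0.50 is the value proposed above, not a committed default):

      ```xml
      <!-- mapred-site.xml: start reducers only after 50% of maps have completed.
           0.05 (5%) was the shipped default; 0.50 is the value proposed in this issue. -->
      <property>
        <name>mapred.reduce.slowstart.completed.maps</name>
        <value>0.50</value>
      </property>
      ```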

          Activity

          Allen Wittenauer added a comment -

          For the edge case where jobs benefit from the 0.05 default, they can continue to set this. The average case, from my experience, is that this is way, way too low. To reiterate: I'd like to change the default that Hadoop ships with to something more reasonable for the average case.

          Vinod Kumar Vavilapalli added a comment -

          As such, I am neutral on this change. But as I said before, this is site-specific and job-specific. And I make this comment only after seeing the differing opinions on this issue itself.

          Allen says in the description "This should be higher, probably around the 50% mark".
          Hong says here "Another case, for tiny jobs that require fast turn around time, it would be better if we set the percentage to be 0."

          Either there is cross-talk, or I am missing something, or these are already different use-cases. If we can agree on a default that is acceptable to everyone here, I am fine.

          Allen Wittenauer added a comment -

          >Why not let it be and change site-specific, job-specific configuration?

          In my experience, users don't set this until they've been around the Hadoop block for a while, and even then, this one is easy to miss.

          The other reality is that few users only run "one" job. It is much more typical to run a series of jobs as part of a workflow. Doing specific, low-level tuning of every knob for every job is asking too much. Those users who do want to do that will eventually hit this and tune appropriately. But that doesn't mean we shouldn't ship a 'reasonable' default until they get around to setting it themselves.

          >I think Allen's point is that the default 5% may be too low from the utilization perspective.

          ... and that's exactly my point. Inexperienced users wonder why all their reduce slots are not being utilized to get the max throughput of the grid. They have one big job that has taken all the reduce slots, sometimes for hours at a time, while a smaller job has all of its maps finished and just needs a handful of reduces to go. By setting this to a reasonable default, chances are this very common case will disappear out-of-the-box.

          While I think it would be great to see this tunable go away, that's not where we are at today. So let's just set this to something reasonable and then look at the bigger problem at some later date. There are bigger fish to fry.

          Hong Tang added a comment -

          I think Allen's point is that the default 5% may be too low from the utilization perspective.

          My point (which may be shared with Matei) is that this really could be adaptively tuned by the MR framework (thus eliminating the need for a configuration knob). Finally, back to my comment on turnaround time, I think users should specify high-level optimization objectives, such as whether they care more about response time or throughput, and the MR framework should adjust related parameters automatically. Granted, this is probably beyond the scope of this jira.

          Vinod Kumar Vavilapalli added a comment -

          This is a per-job configuration. And all the issues quoted above seem to be characteristic of the job in question. As such, no default value will ever cater to all the job characteristics. Why not let it be and change site-specific, job-specific configuration?
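          As a sketch of that per-job route (the job jar and class names here are placeholders, and this assumes the job goes through ToolRunner/GenericOptionsParser so -D properties are picked up):

          ```shell
          # Hypothetical per-job override: keep the cluster default, but start
          # this job's reducers only after 50% of its maps have completed.
          hadoop jar my-job.jar MyJob \
            -Dmapred.reduce.slowstart.completed.maps=0.50 \
            /input /output
          ```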

          Matei Zaharia added a comment -

          Yeah, actually the 5% setting can be a source of latency for small jobs in my experience, because the maps will finish at roughly the same time, and you then need to wait a few seconds for a reducer to start up and to get the map completion events from the JobTracker. For these jobs, it might make sense to look at the rate at which maps are reporting progress and launch the reducers when it looks like the maps will finish in the next 5 seconds. There are many other things that could be done to decrease the latency for small jobs, however.
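          That finish-soon heuristic can be sketched roughly as follows. This is an illustration of the idea, not JobTracker code; the function name and the observed-completion-rate input are assumptions.

          ```python
          def should_start_reducers(completed_maps, total_maps, maps_per_second,
                                    horizon_seconds=5.0):
              """Return True when the remaining maps are projected to finish
              within horizon_seconds, based on the observed completion rate."""
              remaining = total_maps - completed_maps
              if remaining <= 0:
                  return True   # all maps done: reducers can definitely start
              if maps_per_second <= 0:
                  return False  # no progress observed yet: keep waiting
              return remaining / maps_per_second <= horizon_seconds
          ```

          For example, with 100 maps, 96 complete, and maps finishing at 2 per second, the remaining 4 maps are projected to take 2 seconds, so reducers would be launched.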

          Hong Tang added a comment -

          Another case, for tiny jobs that require fast turn around time, it would be better if we set the percentage to be 0.

          Matei Zaharia added a comment -

          This is a good idea. Ideally though, we might actually want slow start to depend on the amount of map output data and the rate at which data can be copied. If you have a job with only a few MB of map output per reducer, setting slow start as high as 95% isn't going to impact your response time too much. On the other hand, if you have a job where the maps "explode" the output and you know that the bulk of your time will be spent in the shuffle phase, you might want to set it lower.


            People

            • Assignee: Unassigned
            • Reporter: Allen Wittenauer
            • Votes: 0
            • Watchers: 11
