Hadoop Map/Reduce / MAPREDUCE-1521

Protection against incorrectly configured reduces

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 0.22.1
    • Component/s: jobtracker
    • Labels: None

      Description

      We've seen a fair number of instances where naive users process huge data-sets (>10TB) with badly mis-configured #reduces, e.g. a single reduce.

      This is a significant problem on large clusters, since each attempt of the reduce takes a long time to shuffle and then runs into problems such as local disk space; the job only fails after 4 such attempts.

      Proposal: Come up with heuristics/configs to fail such jobs early.

      Thoughts?
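
      For concreteness, the misconfiguration in question is trivial to produce. The sketch below is a minimal org.apache.hadoop.mapreduce driver (the paths and job name are illustrative, and the default identity map/reduce classes are assumed) that pushes a huge input through a single reduce:

      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.fs.Path;
      import org.apache.hadoop.mapreduce.Job;
      import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
      import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

      public class NaiveJob {
        public static void main(String[] args) throws Exception {
          Job job = new Job(new Configuration(), "naive-aggregation");
          job.setJarByClass(NaiveJob.class);
          // The misconfiguration: >10TB of input, but a single reduce, so
          // the entire map output shuffles to one node's local disks.
          job.setNumReduceTasks(1);
          FileInputFormat.addInputPath(job, new Path("/data/huge-dataset"));
          FileOutputFormat.setOutputPath(job, new Path("/out/naive"));
          System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
      }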

      Attachments

      1. resourcestimator-overflow.txt
        1 kB
        Todd Lipcon
      2. resourceestimator-threshold.txt
        2 kB
        Todd Lipcon
      3. MAPREDUCE-1521-trunk.patch
        13 kB
        Mahadev konar
      4. MAPREDUCE-1521-0.20-yahoo.patch
        12 kB
        Mahadev konar
      5. MAPREDUCE-1521-0.20-yahoo.patch
        11 kB
        Mahadev konar
      6. MAPREDUCE-1521-0.20-yahoo.patch
        11 kB
        Mahadev konar
      7. MAPREDUCE-1521-0.20-yahoo.patch
        9 kB
        Mahadev konar
      8. MAPREDUCE-1521-0.20-yahoo.patch
        3 kB
        Mahadev konar

          Activity

          Tom White added a comment -

          Patch no longer applies.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12471049/resourcestimator-overflow.txt
          against trunk revision 1075216.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/86//console

          This message is automatically generated.

          Todd Lipcon added a comment -

          The 0.20.100 branch contains these two other fixes as part of this patch.

          Mahadev konar added a comment -

          This patch is for trunk.

          Mahadev konar added a comment -

          Modified the patch so that the limit is a per-job limit rather than a cluster-wide limit on the jobtracker. Minor changes to the code.

          Mahadev konar added a comment -

          Minor change to the test case.

          Mahadev konar added a comment -

          This patch adds a test case.

          Mahadev konar added a comment -

          This patch adds some diagnostic information for users on why the job failed. I am still adding a JUnit test for this patch. Again, this patch is for Yahoo! 0.20. Will upload a patch for trunk soon.

          Alex Loddengaard added a comment -

          I didn't review the patch, but I'm a strong +1 for the idea. Anything that gives admins more control is probably a good thing. I expect Allen will agree with me.

          Alex

          Mahadev konar added a comment -

          This patch adds a limit on the amount of data that a reduce receives. If the estimated input data to the reduce is greater than the value configured at the jobtracker, the jobtracker fails the job. The configuration value is set to 0 (switched off) by default.

          This patch is for the Yahoo! Hadoop security branch. I will upload the patch for trunk soon.
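
          A rough sketch of the guard this comment describes, written as a self-contained helper (the config key name and the shape of the check are assumptions based on this description, not quotes from the patch):

          import org.apache.hadoop.conf.Configuration;

          /** Sketch of the guard described above; the key name is an assumption. */
          public final class ReduceInputLimitCheck {
            public static final String LIMIT_KEY = "mapreduce.reduce.input.limit";

            /**
             * @param estimatedReduceInputBytes bytes a single reduce is
             *        estimated to shuffle, e.g. extrapolated from completed maps
             * @return null if the job may proceed, otherwise a diagnostic the
             *         jobtracker can attach when failing the job early
             */
            public static String check(Configuration conf, long estimatedReduceInputBytes) {
              long limit = conf.getLong(LIMIT_KEY, 0L); // 0 = switched off by default
              if (limit > 0 && estimatedReduceInputBytes > limit) {
                return "Estimated reduce input of " + estimatedReduceInputBytes
                    + " bytes exceeds " + LIMIT_KEY + "=" + limit
                    + "; failing the job early";
              }
              return null; // within the limit, or the check is disabled
            }
          }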

          Mahadev konar added a comment -

          I think Hong's idea seems reasonable, and Todd's point about disabling this feature by default is also valid.

          Hong Tang added a comment -

          We need to limit the number of map output bytes or reduce input bytes, as they must fit on a single physical node. We could then fail the job if a particular map or reduce violates such limits - assuming that re-executing the map/reduce would lead to the same kind of violation.
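
          A sketch of that check as a predicate (the per-node disk budget and all names here are assumptions, not part of any patch):

          /**
           * Sketch of the per-node fit check suggested above. A map's output
           * and a reduce's input must each fit on one physical node, so compare
           * deterministic estimates against a per-node disk budget. Because the
           * estimates derive from the job's own data, a violation would recur
           * on re-execution, so failing the job (rather than retrying) is safe.
           */
          public final class SingleNodeFitCheck {
            private final long perNodeDiskBudgetBytes;

            public SingleNodeFitCheck(long perNodeDiskBudgetBytes) {
              this.perNodeDiskBudgetBytes = perNodeDiskBudgetBytes;
            }

            public boolean violates(long estimatedMapOutputBytes,
                                    long estimatedReduceInputBytes) {
              return estimatedMapOutputBytes > perNodeDiskBudgetBytes
                  || estimatedReduceInputBytes > perNodeDiskBudgetBytes;
            }
          }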

          Todd Lipcon added a comment -

          This makes me just a little bit nervous - the case I'm worried about is when a job is working fine in production and then starts failing at 3am some morning because the data volume increased just a little bit over the threshold.

          Could we default this behavior off and only turn it on for clusters where the operators prefer it?

          Allen Wittenauer added a comment -

          +10000

          A common conversation:

          Ops: "You have too many/too few reduces."

          User: "How many should I have?"

          Ops: "Uhhh...."

          Arun C Murthy added a comment -

          Naive Pig scripts, i.e. ones written without understanding Pig's PARALLEL feature, are a common culprit.

          Luke Lu added a comment -

          2a. Track output bytes per map and compute an extrapolated mapOutputBytes from a few randomly selected map outputs, so only a few maps need to complete. We can have an expert mode to bypass the heuristic.
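
          A sketch of the extrapolation in 2a (names are illustrative; the ResourceEstimator patches attached above work from completed-map statistics in a similar spirit):

          /**
           * Sketch of the extrapolation in 2a: track the output/input byte
           * ratio of a sample of completed maps and scale it up to the job's
           * full input. All names here are illustrative.
           */
          public final class MapOutputExtrapolator {
            private long sampledInputBytes;
            private long sampledOutputBytes;

            /** Record the counters of one completed (randomly selected) map. */
            public synchronized void addCompletedMap(long inputBytes, long outputBytes) {
              sampledInputBytes += inputBytes;
              sampledOutputBytes += outputBytes;
            }

            /** Extrapolated total map output bytes for the whole job. */
            public synchronized long estimateTotalMapOutputBytes(long totalInputBytes) {
              if (sampledInputBytes == 0) {
                return totalInputBytes; // no sample yet; assume a 1:1 blow-up
              }
              double blowUp = (double) sampledOutputBytes / sampledInputBytes;
              return (long) (blowUp * totalInputBytes);
            }
          }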

          Arun C Murthy added a comment -

          Some thoughts:

          1. Naive approach: a limit on inputBytes/#reduces - this doesn't handle blow-up or filtering of inputs.
          2. Track #mapOutputBytes at the JT and a limit on mapOutputBytes/#reduces - this still doesn't fail early, i.e. lots of maps have to complete.

          Maybe a hybrid? Possibly option 1 with a flag which says '#reduces is small, and I really, really want it so'?
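
          For example, option 1 plus that flag could look roughly like the following (both config key names are hypothetical):

          import org.apache.hadoop.conf.Configuration;

          /** Rough sketch of option 1 above; both key names are hypothetical. */
          public final class NaiveReduceCheck {
            static boolean shouldFailEarly(Configuration conf,
                                           long inputBytes, int numReduces) {
              // The escape hatch: "#reduces is small, and I really want it so".
              if (conf.getBoolean("mapred.job.reduce.limit.override", false)) {
                return false;
              }
              long maxBytesPerReduce =
                  conf.getLong("mapred.job.max.bytes.per.reduce", 0L); // 0 = disabled
              return maxBytesPerReduce > 0 && numReduces > 0
                  && inputBytes / numReduces > maxBytesPerReduce;
            }
          }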


            People

            • Assignee: Mahadev konar
            • Reporter: Arun C Murthy
            • Votes: 0
            • Watchers: 13
