Hadoop Map/Reduce
MAPREDUCE-5856

Counter limits always use defaults even if JobClient is given a different Configuration

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.3.0, 2.4.0
    • Fix Version/s: None
    • Component/s: client
    • Labels:
      None

      Description

      If a job has more than the default number of counters (i.e. more than 120), and you create a JobClient with a Configuration in which the limit is raised (e.g. to 500), JobClient still throws this exception:

      org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many counters: 121 max=120
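
      For reference, the per-job counter limit is controlled by the mapreduce.job.counters.max property (default 120). A client hitting this exception would typically try to raise it with a mapred-site.xml fragment like the sketch below; as this issue describes, prior to the fix the setting only takes effect when that file is on the client's classpath:

```xml
<!-- mapred-site.xml fragment: raises the per-job counter limit.
     Without the fix in this JIRA, Limits only picks this up from a
     mapred-site.xml found on the classpath, not from a Configuration
     object passed to JobClient. -->
<property>
  <name>mapreduce.job.counters.max</name>
  <value>500</value>
</property>
```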
      

        Issue Links

          Activity

          Robert Kanter added a comment -

          The problem is that Limits is looking for the mapred-site.xml in the classpath, or failing that, the defaults. This is a problem for clients such as Oozie where we don't have mapred-site.xml on the classpath, and instead create a JobClient by passing it a Configuration object to use instead (where the counters limit is increased).

          Robert Kanter added a comment -

          The fix is very simple; we just need to call Limits.init(...) from JobClient.init(...).

          I couldn't add a unit test because the Configuration used by Limits is loaded only once per JVM, but I verified manually that the change fixes the problem, and the fix is pretty trivial.
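
          The pattern behind the fix can be illustrated with a self-contained sketch. The classes below are simplified stand-ins for Hadoop's Limits and JobClient, not the real sources (the real Limits reads a Hadoop Configuration, not a Map); the point is that JobClient.init must forward its configuration to Limits before the once-only static initialization falls back to defaults:

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for org.apache.hadoop.mapreduce.counters.Limits:
// the max counter count lives in a static field that is set only once.
class Limits {
    private static int maxCounters = 120;   // classpath/default value
    private static boolean initialized = false;

    // Applies the caller-supplied configuration exactly once per JVM.
    static synchronized void init(Map<String, String> conf) {
        if (!initialized) {
            String v = conf.get("mapreduce.job.counters.max");
            if (v != null) {
                maxCounters = Integer.parseInt(v);
            }
            initialized = true;
        }
    }

    static int getMaxCounters() {
        return maxCounters;
    }
}

// Simplified stand-in for JobClient, showing the one-line fix:
// forward the client's configuration to Limits during init.
class JobClient {
    JobClient(Map<String, String> conf) {
        init(conf);
    }

    private void init(Map<String, String> conf) {
        Limits.init(conf);   // without this call, the default (120) sticks
    }
}

public class LimitsInitSketch {
    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("mapreduce.job.counters.max", "500");
        new JobClient(conf);
        System.out.println(Limits.getMaxCounters()); // prints 500
    }
}
```

          Because the static field is set only on first use, the same once-only behavior is also why a conventional unit test is awkward: a second test in the same JVM cannot re-initialize Limits with a different value.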

          Jason Lowe added a comment -

          One related issue with allowing jobs to increase the default is that it can blow out the memory on the history server, which caches recent jobs. In other words, a few jobs with a huge number of counters (and correspondingly huge AM heaps to handle them) might run OK but then later cause an OOM on the history server as it tries to handle all those jobs.

          Robert Kanter added a comment -

          This won't remove the limit checking; it's still enforced. The patch just makes it so that someone using a mapred-site.xml not on the classpath is still able to change the counter limit. Without this, clients such as Oozie have no way of using a different limit.

          I imagine that the history server loads its mapred-site.xml from the classpath anyway, so this patch won't affect it at all. Even without the patch, if the counter limit is set too high, it could get an OOM; but that's the user's fault.

          Robert Kanter added a comment -

          Another way of looking at this: if I write the following code in my program:

          Configuration conf = new Configuration();
          conf.set(...);  // Set a bunch of properties or load from some files
          JobClient client = new JobClient(conf);
          

          The expected behavior is that everything client does will use the properties I set in conf. If it doesn't, then that's a bug.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12641599/MAPREDUCE-5856.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4552//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4552//console

          This message is automatically generated.

          Karthik Kambatla (Inactive) added a comment -

          One related issue with allowing jobs to increase the default is that it can blow the memory on the history server which caches recent jobs.

          Fair point. To address this in a compatible way, we can add a global max for the number of counters that the admin can set. We can do this in a separate JIRA.
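
          Such a cluster-wide cap might look like the mapred-site.xml fragment below. The property name used here is purely hypothetical, for illustration only; the actual setting would be defined in the separate JIRA:

```xml
<!-- Hypothetical admin-side ceiling on the per-job counter limit.
     The property name mapreduce.job.counters.admin.max is invented for
     this sketch; jobs could raise mapreduce.job.counters.max only up to
     this value. -->
<property>
  <name>mapreduce.job.counters.admin.max</name>
  <value>500</value>
  <final>true</final>
</property>
```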

          Robert Kanter added a comment -

          MAPREDUCE-5875 fixes this use case, as well as some others.


            People

            • Assignee:
              Robert Kanter
              Reporter:
              Robert Kanter
            • Votes:
              0
              Watchers:
              5
