Uploaded image for project: 'Apache Tez'
  1. Apache Tez
  2. TEZ-2629

LimitExceededException in Tez client when DAG has exceeds the default max counters

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 0.5.0, 0.6.0, 0.7.0
    • 0.8.0-alpha, 0.7.1, 0.5.5, 0.6.3
    • None
    • None

    Description

      Original issue was HIVE-11303, seeing LimitExceededException when the client tries to get the counters for a completed job:

      2015-07-17 18:18:11,830 INFO  [main]: counters.Limits (Limits.java:ensureInitialized(59)) - Counter limits initialized with parameters:  GROUP_NAME_MAX=256, MAX_GROUPS=500, COUNTER_NAME_MAX=64, MAX_COUNTERS=1200
      2015-07-17 18:18:11,841 ERROR [main]: exec.Task (TezTask.java:execute(189)) - Failed to execute tez graph.
      org.apache.tez.common.counters.LimitExceededException: Too many counters: 1201 max=1200
              at org.apache.tez.common.counters.Limits.checkCounters(Limits.java:87)
              at org.apache.tez.common.counters.Limits.incrCounters(Limits.java:94)
              at org.apache.tez.common.counters.AbstractCounterGroup.addCounter(AbstractCounterGroup.java:76)
              at org.apache.tez.common.counters.AbstractCounterGroup.addCounterImpl(AbstractCounterGroup.java:93)
              at org.apache.tez.common.counters.AbstractCounterGroup.findCounter(AbstractCounterGroup.java:104)
              at org.apache.tez.dag.api.DagTypeConverters.convertTezCountersFromProto(DagTypeConverters.java:567)
              at org.apache.tez.dag.api.client.DAGStatus.getDAGCounters(DAGStatus.java:148)
              at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:175)
              at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
              at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
              at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1673)
              at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1432)
              at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1213)
              at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1064)
              at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1054)
              at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
              at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
              at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
              at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
              at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:409)
              at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:425)
              at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:714)
              at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
              at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:497)
              at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
              at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
      

      It looks like Limits.ensureInitialized() is defaulting to an empty configuration, resulting in COUNTERS_MAX being set to the default of 1200 (even though Hive's configuration specified tez.counters.max=16000).

      Per sseth:

      I think the Tez client does need to make this call to setup the Configuration correctly. We do this for the AM and the executing task - which is why it works. Could you please open a Tez jira for this ?
      Also, Limits is making use of Configuration instead of TezConfiguration for default initialization, which implies changes to tez-site on the local node won't be picked up.

      Attachments

        1. TEZ-2629_branch05_1.txt
          10 kB
          Siddharth Seth
        2. TEZ-2629_branch06_1.txt
          10 kB
          Siddharth Seth
        3. TEZ-2629_branch07_1.txt
          10 kB
          Siddharth Seth
        4. TEZ-2629.1.txt
          10 kB
          Siddharth Seth

        Issue Links

          Activity

            People

              sseth Siddharth Seth
              jdere Jason Dere
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: