Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-19647

Logging cleanups; emit regionname when RegionTooBusyException inside RetriesExhausted... make netty connect/disconnect TRACE-level

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 2.0.0-beta-1, 2.0.0
    • None
    • None

    Description

      In MR failures, i see this:

      Error: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 365 actions: RegionTooBusyException: 365 times, servers with issues: ve0534.halxg.cloudera.com,16020,1514392912363 at org.apache.hadoop.hbase.client.BatchErrors.makeException(BatchErrors.java:54) at org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:491) at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:268) at org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:225) at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator$GeneratorMapper.persist(IntegrationTestBigLinkedList.java:541) at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator$GeneratorMapper.map(IntegrationTestBigLinkedList.java:464) at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator$GeneratorMapper.map(IntegrationTestBigLinkedList.java:399) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) [2017-12-27 09:09:11.570]Container killed by the ApplicationMaster. [2017-12-27 09:09:11.586]Container killed on request. Exit code is 143 [2017-12-27 09:09:11.600]Container exited with a non-zero exit code 143.

      Its missing the region name which is in the root exception, RegionTooBusyException – we just skip it.

      Attachments

        1. addendum.patch
          7 kB
          Michael Stack
        2. 19647.patch
          19 kB
          Michael Stack

        Issue Links

          Activity

            People

              stack Michael Stack
              stack Michael Stack
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: