Uploaded image for project: 'ZooKeeper'
  1. ZooKeeper
  2. ZOOKEEPER-2315

Change client connect zk service timeout log level from Info to Warn level

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Closed
    • Minor
    • Resolution: Fixed
    • 3.4.6
    • 3.4.7, 3.5.2, 3.6.0
    • java client
    • None

    Description

      Recently my the resourmanager of my hadoop cluster is fail suddenly,so I look into the rsourcemanager log.But the log is not helpful for me to direct find the reson until I found the zk timeout info log record.

      2015-11-06 06:34:11,257 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned container container_1446016482901_292094_01_000140 of capacity <memory:1024, vCores:1> on host mofa2089:41361, which has 30 containers, <memory:31744, vCores:30> used and <memory:9216, vCores:10> available after allocation
      2015-11-06 06:34:11,266 INFO org.apache.zookeeper.ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x24f4fd5118e5c6e has expired, closing socket connection
      2015-11-06 06:34:11,271 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1446016482901_292094_01_000105 Container Transitioned from RUNNING to COMPLETED
      2015-11-06 06:34:11,271 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: Completed container: container_1446016482901_292094_01_000105 in state: COMPLETED event:FINISHED
      2015-11-06 06:34:11,271 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=dongwei  OPERATION=AM Released Container TARGET=SchedulerApp     RESULT=SUCCESS  APPID=application_1446016482901_292094  CONTAINERID=container_1446016482901_292094_01_000105
      2015-11-06 06:34:11,271 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Released container container_1446016482901_292094_01_000105 of capacity <memory:1024, vCores:1> on host mofa010079:50991, which currently has 29 containers, <memory:33792, vCores:29> used and <memory:7168, vCores:11> available, release resources=true
      2015-11-06 06:34:11,271 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Application attempt appattempt_1446016482901_292094_000001 released container container_1446016482901_292094_01_000105 on node: host: mofa010079:50991 #containers=29 available=<memory:7168, vCores:11> used=<memory:33792, vCores:29> with event: FINISHED
      2015-11-06 06:34:11,272 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1446016482901_292094_01_000141 Container Transitioned from NEW to ALLOCATED
      2015-11-06 06:34:11,272 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=dongwei  OPERATION=AM Allocated Container        TARGET=SchedulerApp     RESULT=SUCCESS  APPID=application_1446016482901_292094  CONTAINERID=container_1446016482901_292094_01_000141
      2015-11-06 06:34:11,272 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned container container_1446016482901_292094_01_000141 of capacity <memory:1024, vCores:1> on host mofa010079:50991, which has 30 containers, <memory:34816, vCores:30> used and <memory:6144, vCores:10> available after allocation
      2015-11-06 06:34:11,295 WARN org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher: org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher$LauncherThread interrupted. Returning.
      2015-11-06 06:34:11,296 INFO org.apache.hadoop.ipc.Server: Stopping server on 8032
      2015-11-06 06:34:11,297 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
      2015-11-06 06:34:11,297 INFO org.apache.hadoop.ipc.Server: Stopping server on 8030
      2015-11-06 06:34:11,297 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8032
      2015-11-06 06:34:11,298 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
      2015-11-06 06:34:11,298 INFO org.apache.hadoop.ipc.Server: Stopping server on 8031
      2015-11-06 06:34:11,298 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8030
      2015-11-06 06:34:11,300 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 80312015-11-06 06:34:11,300 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
      

      The problem is solved,but it's too difficult to find the connect zk service time out info from so many info log records.And we will easily to ignore these records.So we should chang these zk seesion timeout log level form info level to warn.

      Attachments

        1. ZOOKEEPER-2315.001.patch
          1 kB
          Yiqun Lin
        2. ZOOKEEPER-2315.002.patch
          2 kB
          Yiqun Lin

        Activity

          People

            linyiqun Yiqun Lin
            linyiqun Yiqun Lin
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: