Stratos / STRATOS-706

member terminate event should log reason

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0.0
    • Fix Version/s: FUTURE
    • Component/s: Autoscaler
    • Labels: None

      Description

      When Stratos terminates a member, it must log the reason. Ideally the logging should be systematic enough that one can grep by severity, member, event type, or some other useful categorization.
      The justification for this defect is that it would greatly improve debugging and troubleshooting. Without such logging, it is very difficult to debug member terminations.

      For example, consider this sequence in the stratos log file:

      ===================
      TID: [0] [STRATOS] [2014-07-15 09:58:48,654] DEBUG {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl} - Received an instance spawn request : MemberContext [memberId=null, nodeId=null, clusterId=cisco-gilan-appmgr-1.cisco-gil, cartridgeType=null, privateIpAddress=null, publicIpAddress=null, allocatedIpAddress=null, initTime=1405418328649, lbClusterId=null, networkPartitionId=OAM1] {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl}

      TID: [0] [STRATOS] [2014-07-15 09:58:48,654] DEBUG {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl} - Payload: SERVICE_NAME=cisco-gilan-appmgr,HOST_NAME=cisco-gilan-appmgr-1.qmog.cisco.com,MULTITENANT=false,TENANT_ID=-1234,TENANT_RANGE=-1234,CARTRIDGE_ALIAS=cisco-gilan-appmgr-1,CLUSTER_ID=cisco-gilan-appmgr-1.cisco-gil,CARTRIDGE_KEY=o1jbiPPmPWBgyNVM,DEPLOYMENT=default,REPO_URL=null,PORTS=9482,PUPPET_IP=PUPPET_IP,PUPPET_HOSTNAME=PUPPET_HOSTNAME,PUPPET_ENV=PUPPET_ENV,HEARTBEAT_AUTHKEY=20c9629a87f53ecdb5278d2ddb5a9d42,TRUSTSTORE_PASSWORD=wso2carbon,CEP_PORT=7611,MONITORING_SERVER_SECURE_PORT=0,MB_PORT=61616,OPENSTACK_COMPUTE_DNS=10.58.10.82,MB_IP=octl-01.qmog.cisco.com,QSB_PUPPET_ENVIR=,CEP_IP=octl-01.qmog.cisco.com,VSM_USER=admin,VEM_IP=192.168.66.43,ENABLE_DATA_PUBLISHER=false,MONITORING_SERVER_ADMIN_PASSWORD=xxxx,MONITORING_SERVER_IP=octl-01.qmog.cisco.com,VEM_USER=ubuntu,VEM_PWD=ubuntu,COMMIT_ENABLED=false,MONITORING_SERVER_ADMIN_USERNAME=xxxx,CERT_TRUSTSTORE=/opt/apache-stratos-cartridge-agent/security/client-truststore.jks,VSM_PWD=Starent123!,VSM_IP=192.168.66.2,MONITORING_SERVER_PORT=0,APPMGR_GITREPO=ssh://jenapper@10.58.10.189/home/jenapper/code/eccentrica.git,MEMBER_ID=cisco-gilan-appmgr-1.cisco-gil7ef7327f-2bb2-4768-820f-d064de29aa59,LB_CLUSTER_ID=null,NETWORK_PARTITION_ID=OAM1,PARTITION_ID=RegionOne-AZ-1 {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl}

      TID: [0] [STRATOS] [2014-07-15 09:58:55,888] INFO {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl} - Member is terminated: MemberContext [memberId=cisco-gilan-appmgr-1.cisco-gil407f5bdc-aad2-4234-80fc-6cdf17be6192, nodeId=RegionOne/89433818-21ed-48d4-bd8f-c396ab30f6d2, clusterId=cisco-gilan-appmgr-1.cisco-gil, cartridgeType=cisco-gilan-appmgr, privateIpAddress=192.168.66.1, publicIpAddress=null, allocatedIpAddress=null, initTime=1405417410736, lbClusterId=null, networkPartitionId=OAM1] {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl}

      ===================

      The problem is that Stratos gives no indication of why it is terminating the member [1]. Stratos should be enhanced so that the "Member is terminated" message above indicates why the member is being terminated (loss of heartbeats, timeout on port knocking, etc.). This is needed as Apache Stratos expands its user base.
      This issue has high priority because it affects troubleshooting efficiency and system stability.
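
To make the request concrete, here is a minimal sketch of what a grep-friendly termination message could look like. It is not taken from the Stratos code base: the class `TerminationLogSketch`, the enum `TerminationReason`, its values, and the `key=value` layout are all hypothetical names chosen for illustration. The member and cluster IDs in `main` are placeholders, and the reason shown there is invented, not the cause of the termination in the log above.

```java
import java.util.Locale;

/** Hypothetical sketch; none of these names exist in Stratos. */
public class TerminationLogSketch {

    /** Illustrative reason codes (assumed, not an actual Stratos enum). */
    enum TerminationReason {
        HEARTBEAT_LOST,      // loss of heartbeats
        PORT_CHECK_TIMEOUT,  // timeout on port knocking
        SCALE_DOWN,
        USER_REQUEST,
        UNKNOWN
    }

    /**
     * Builds a single log line with fixed key=value fields so that it can be
     * filtered by event type, member, cluster, or reason using grep.
     */
    static String formatTermination(String memberId, String clusterId,
                                    TerminationReason reason, String detail) {
        // Fall back to UNKNOWN so the reason field is always present.
        TerminationReason r = (reason == null) ? TerminationReason.UNKNOWN : reason;
        // Replace double quotes so the quoted detail field stays parseable.
        String d = (detail == null) ? "" : detail.replace('"', '\'');
        return String.format(Locale.ROOT,
                "event=MEMBER_TERMINATED memberId=%s clusterId=%s reason=%s detail=\"%s\"",
                memberId, clusterId, r, d);
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only.
        System.out.println(formatTermination(
                "example-member-1",
                "example-cluster",
                TerminationReason.HEARTBEAT_LOST,
                "no heartbeat received within the configured window"));
    }
}
```

With a layout like this, `grep 'event=MEMBER_TERMINATED' | grep 'reason=HEARTBEAT_LOST'` would isolate one class of terminations, and adding `memberId=...` to the pattern would narrow it to a single member. The existing log level (INFO in the excerpt above) would continue to cover filtering by severity.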

            People

            • Assignee: Vishanth (vishanth)
            • Reporter: Martin Eppel (meppel)