Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
4.0.0, 4.1.0 Alpha
None
None
Environment: OpenStack Icehouse, Stratos 4.0.0 GA
Description
On my setup with OpenStack Icehouse and Stratos 4.0.0 GA, I observed one particular cartridge with one running instance and multiple instances in ERROR state. Upon checking wso2carbon.log, I found several occurrences of the exception below. It looks like when Stratos launched the cartridge, the instance didn't reach the running state, so Stratos tried to launch another instance. This kept going until eventually one instance of the cartridge reached running status.
We need to make sure that when this condition occurs, Stratos removes the instances that are in ERROR state before attempting to re-launch. Instances in ERROR state can exhaust resources on the underlying IaaS cluster (OpenStack in this case).
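To make the requested behaviour concrete, here is a minimal sketch of a launch loop that destroys a node that failed to reach the running state before trying again. It is only an illustration under stated assumptions: the `Iaas` interface, `launchWithCleanup`, and `FakeIaas` are hypothetical names made up for this example and are not Stratos or jclouds APIs.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; none of these types exist in Stratos or jclouds. */
class ErrorNodeCleanup {

    /** Assumed minimal view of an IaaS; a real one would wrap the provider API. */
    interface Iaas {
        /** Starts a node and returns its id; the node may still end up in ERROR. */
        String startNode();

        /** Returns true if the node reached the running state. */
        boolean isRunning(String nodeId);

        /** Terminates the node and frees its resources. */
        void destroyNode(String nodeId);
    }

    /**
     * Tries up to maxAttempts launches. A node that does not reach the running
     * state is destroyed before the next attempt, so failed nodes do not pile up.
     * Returns the id of the running node, or null if every attempt failed.
     */
    static String launchWithCleanup(Iaas iaas, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String nodeId = iaas.startNode();
            if (iaas.isRunning(nodeId)) {
                return nodeId;
            }
            iaas.destroyNode(nodeId); // clean up before re-launching
        }
        return null;
    }

    /** In-memory fake for the demo: the first `failures` nodes never reach running. */
    static class FakeIaas implements Iaas {
        final List<String> live = new ArrayList<>();
        private final int failures;
        private int started = 0;

        FakeIaas(int failures) {
            this.failures = failures;
        }

        public String startNode() {
            String id = "node-" + (++started);
            live.add(id);
            return id;
        }

        public boolean isRunning(String nodeId) {
            int n = Integer.parseInt(nodeId.substring("node-".length()));
            return n > failures;
        }

        public void destroyNode(String nodeId) {
            live.remove(nodeId);
        }
    }

    public static void main(String[] args) {
        FakeIaas iaas = new FakeIaas(2);
        String running = launchWithCleanup(iaas, 5);
        // Only the running node should remain on the fake IaaS.
        System.out.println(running + " " + iaas.live);
    }
}
```

With the fake above, two launches fail and are destroyed, the third succeeds, and only that node remains allocated. Without the `destroyNode` call the two failed nodes would stay behind, which is the resource leak described in this issue.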
TID: [0] [STRATOS] [2015-03-24 06:11:34,509] ERROR {org.apache.stratos.autoscaler.monitor.ClusterMonitor} - Cluster monitor: Monitor failed.ClusterMonitor [clusterId=lb01.lb01.domain, serviceId=lb01, deploymentPolicy=Deployment Policy [id]static-1-Core [partitions] [org.apache.stratos.cloud.controller.stub.deployment.partition.Partition@fb6144], autoscalePolicy=ASPolicy [id=economyPolicy, displayName=null, description=null], lbReferenceType=null, hasPrimary=false ] {org.apache.stratos.autoscaler.monitor.ClusterMonitor}
Exception executing consequence for rule "Minimum Rule" in org.apache.stratos.autoscaler.rule: java.lang.RuntimeException: cannot invoke method: delegateSpawn
    at org.drools.runtime.rule.impl.DefaultConsequenceExceptionHandler.handleException(DefaultConsequenceExceptionHandler.java:39)
    at org.drools.common.DefaultAgenda.fireActivation(DefaultAgenda.java:1297)
    at org.drools.common.DefaultAgenda.fireNextItem(DefaultAgenda.java:1221)
    at org.drools.common.DefaultAgenda.fireAllRules(DefaultAgenda.java:1456)
    at org.drools.common.AbstractWorkingMemory.fireAllRules(AbstractWorkingMemory.java:710)
    at org.drools.common.AbstractWorkingMemory.fireAllRules(AbstractWorkingMemory.java:674)
    at org.drools.impl.StatefulKnowledgeSessionImpl.fireAllRules(StatefulKnowledgeSessionImpl.java:230)
    at org.apache.stratos.autoscaler.rule.AutoscalerRuleEvaluator.evaluateMinCheck(AutoscalerRuleEvaluator.java:94)
    at org.apache.stratos.autoscaler.monitor.ClusterMonitor.monitor(ClusterMonitor.java:157)
    at org.apache.stratos.autoscaler.monitor.ClusterMonitor.run(ClusterMonitor.java:86)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: cannot invoke method: delegateSpawn
    at org.mvel2.optimizers.impl.refl.nodes.MethodAccessor.getValue(MethodAccessor.java:63)
    at org.mvel2.optimizers.impl.refl.nodes.VariableAccessor.getValue(VariableAccessor.java:37)
    at org.mvel2.ast.ASTNode.getReducedValueAccelerated(ASTNode.java:108)
    at org.mvel2.MVELRuntime.execute(MVELRuntime.java:85)
    at org.mvel2.compiler.CompiledExpression.getDirectValue(CompiledExpression.java:123)
    at org.mvel2.compiler.CompiledExpression.getValue(CompiledExpression.java:119)
    at org.mvel2.MVEL.executeExpression(MVEL.java:930)
    at org.drools.base.mvel.MVELConsequence.evaluate(MVELConsequence.java:104)
    at org.drools.common.DefaultAgenda.fireActivation(DefaultAgenda.java:1287)
    ... 9 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.mvel2.optimizers.impl.refl.nodes.MethodAccessor.getValue(MethodAccessor.java:48)
    ... 17 more
Caused by: java.lang.RuntimeException: Cannot spawn an instance
    at org.apache.stratos.autoscaler.rule.RuleTasksDelegator.delegateSpawn(RuleTasksDelegator.java:107)
    ... 22 more
Caused by: org.apache.stratos.autoscaler.exception.SpawningException: Failed to start an instance. MemberContext [memberId=lb01.lb01.domaincde68af9-d82f-42c3-9c01-4fd8da565867, nodeId=null, clusterId=lb01.lb01.domain, cartridgeType=lb01, privateIpAddresses=null, publicIpAddresses=null, allocatedIpAddress=null, initTime=1427177492261, lbClusterId=null, networkPartitionId=N1] Cause: error running 1 node group(lb01lb01) location(RegionOne) image(0299be4d-a743-4424-ae28-f40bd4faa669) size(2e2e2b47-9f40-4bd8-9777-83e802f5f1cd) options({inboundPorts=[], autoAssignFloatingIp=false, securityGroupNames=[default], keyPairName=phoenix, userData=[B@14edd531, configDrive=false, novaNetworks=[Network{networkUuid=42c4a88d-0d59-4fbb-90f0-9b9806f9c17c, portUuid=null, fixedIp=172.16.2.201}, Network{networkUuid=6a2615e4-760c-4c93-895d-b4b16e550193, portUuid=null, fixedIp=10.81.69.201}, Network{networkUuid=670550f0-67fc-48ff-a33c-e184a7908247, portUuid=null, fixedIp=10.13.5.81}]})
Execution failures:
0 error[s]
Node failures:
1) IllegalStateException on node RegionOne/887351d5-2c16-48b9-927e-0ca5f13fcc10:
java.lang.IllegalStateException: node(RegionOne/887351d5-2c16-48b9-927e-0ca5f13fcc10) didn't achieve the status running; aborting after 1 seconds with final status: ERROR
    at org.jclouds.compute.functions.PollNodeRunning.apply(PollNodeRunning.java:72)
    at org.jclouds.compute.functions.PollNodeRunning.apply(PollNodeRunning.java:45)
    at org.jclouds.compute.strategy.CustomizeNodeAndAddToGoodMapOrPutExceptionIntoBadMap.call(CustomizeNodeAndAddToGoodMapOrPutExceptionIntoBadMap.java:121)
    at org.jclouds.compute.strategy.CustomizeNodeAndAddToGoodMapOrPutExceptionIntoBadMap.apply(CustomizeNodeAndAddToGoodMapOrPutExceptionIntoBadMap.java:146)
    at org.jclouds.compute.strategy.CustomizeNodeAndAddToGoodMapOrPutExceptionIntoBadMap.apply(CustomizeNodeAndAddToGoodMapOrPutExceptionIntoBadMap.java:53)
    at com.google.common.util.concurrent.Futures$1.apply(Futures.java:711)
    at com.google.common.util.concurrent.Futures$ChainingListenableFuture.run(Futures.java:849)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
1 error[s]
    at org.apache.stratos.autoscaler.client.cloud.controller.CloudControllerClient.spawnAnInstance(CloudControllerClient.java:174)
    at org.apache.stratos.autoscaler.rule.RuleTasksDelegator.delegateSpawn(RuleTasksDelegator.java:87)
    ... 22 more
Caused by: org.apache.axis2.AxisFault: Failed to start an instance. MemberContext [memberId=lb01.lb01.domaincde68af9-d82f-42c3-9c01-4fd8da565867, nodeId=null, clusterId=lb01.lb01.domain, cartridgeType=lb01, privateIpAddresses=null, publicIpAddresses=null, allocatedIpAddress=null, initTime=1427177492261, lbClusterId=null, networkPartitionId=N1] Cause: error running 1 node group(lb01lb01) location(RegionOne) image(0299be4d-a743-4424-ae28-f40bd4faa669) size(2e2e2b47-9f40-4bd8-9777-83e802f5f1cd) options({inboundPorts=[], autoAssignFloatingIp=false, securityGroupNames=[default], keyPairName=phoenix, userData=[B@14edd531, configDrive=false, novaNetworks=[Network{networkUuid=42c4a88d-0d59-4fbb-90f0-9b9806f9c17c, portUuid=null, fixedIp=172.16.2.201}, Network{networkUuid=6a2615e4-760c-4c93-895d-b4b16e550193, portUuid=null, fixedIp=10.81.69.201}, Network{networkUuid=670550f0-67fc-48ff-a33c-e184a7908247, portUuid=null, fixedIp=10.13.5.81}]})
Execution failures:
0 error[s]
Node failures:
1) IllegalStateException on node RegionOne/887351d5-2c16-48b9-927e-0ca5f13fcc10:
java.lang.IllegalStateException: node(RegionOne/887351d5-2c16-48b9-927e-0ca5f13fcc10) didn't achieve the status running; aborting after 1 seconds with final status: ERROR
    at org.jclouds.compute.functions.PollNodeRunning.apply(PollNodeRunning.java:72)
    at org.jclouds.compute.functions.PollNodeRunning.apply(PollNodeRunning.java:45)
    at org.jclouds.compute.strategy.CustomizeNodeAndAddToGoodMapOrPutExceptionIntoBadMap.call(CustomizeNodeAndAddToGoodMapOrPutExceptionIntoBadMap.java:121)
    at org.jclouds.compute.strategy.CustomizeNodeAndAddToGoodMapOrPutExceptionIntoBadMap.apply(CustomizeNodeAndAddToGoodMapOrPutExceptionIntoBadMap.java:146)
    at org.jclouds.compute.strategy.CustomizeNodeAndAddToGoodMapOrPutExceptionIntoBadMap.apply(CustomizeNodeAndAddToGoodMapOrPutExceptionIntoBadMap.java:53)
    at com.google.common.util.concurrent.Futures$1.apply(Futures.java:711)
    at com.google.common.util.concurrent.Futures$ChainingListenableFuture.run(Futures.java:849)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
1 error[s]
    at org.apache.axis2.util.Utils.getInboundFaultFromMessageContext(Utils.java:531)
    at org.apache.axis2.description.OutInAxisOperationClient.handleResponse(OutInAxisOperation.java:370)
    at org.apache.axis2.description.OutInAxisOperationClient.send(OutInAxisOperation.java:445)
    at org.apache.axis2.description.OutInAxisOperationClient.executeImpl(OutInAxisOperation.java:225)
    at org.apache.axis2.client.OperationClient.execute(OperationClient.java:149)
    at org.apache.stratos.cloud.controller.stub.CloudControllerServiceStub.startInstance(CloudControllerServiceStub.java:1407)
    at org.apache.stratos.autoscaler.client.cloud.controller.CloudControllerClient.spawnAnInstance(CloudControllerClient.java:162)
    ... 23 more
TID: [0] [STRATOS] [2015-03-24 06:13:04,511] INFO {org.apache.stratos.autoscaler.client.cloud.controller.CloudControllerClient} - Trying to spawn an instance via cloud controller: [cluster] lb01.lb01.domain [partition] RegionOne-Core [lb-cluster] null [network-partition-id] N1 {org.apache.stratos.autoscaler.client.cloud.controller.CloudControllerClient}
TID: [0] [STRATOS] [2015-03-24 06:13:09,114] INFO {org.wso2.carbon.databridge.core.DataBridge} - admin connected {org.wso2.carbon.databridge.core.DataBridge}
TID: [0] [STRATOS] [2015-03-24 06:13:16,902] INFO {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl} - Instance is successfully starting up. MemberContext [memberId=lb01.lb01.domain64df2bd7-ee48-4ca2-9dec-362b72543d86, nodeId=RegionOne/aa1a0e56-a722-444a-99f5-080ef844fb2d, clusterId=lb01.lb01.domain, cartridgeType=lb01, privateIpAddresses=null, publicIpAddresses=null, allocatedIpAddress=null, initTime=1427177584511, lbClusterId=null, networkPartitionId=N1] {org.apache.stratos.cloud.controller.impl.CloudControllerServiceImpl}