Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
Using Brooklyn 0.10.0-SNAPSHOT (e.g. brooklyn-server commit 66b9b1c)...
When rebinding to existing persisted state, Brooklyn failed to add the enricher ServiceStateLogic$ComputeServiceState, throwing the NPE shown below. Rebind then continued, so the failure is otherwise benign:
2016-09-09 09:31:03,850 WARN o.a.b.c.m.r.RebindExceptionHandlerImpl [brooklyn-execmanager-VtZheMDn-0]: problem adding enricher cdoy70m1hv (ServiceFailureDetector{name=org.apache.brooklyn.policy.ha.ServiceFailureDetector, uniqueTag=service.state.actual, running=true, entity=VanillaSoftwareProcessImpl{id=pwz8z4pbyp}, id=cdoy70m1hv}) to entity pwz8z4pbyp (VanillaSoftwareProcessImpl{id=pwz8z4pbyp}); continuing
java.lang.NullPointerException: null
    at org.apache.brooklyn.policy.ha.ServiceFailureDetector.setActualState(ServiceFailureDetector.java:188) ~[brooklyn-policy-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
    at org.apache.brooklyn.core.entity.lifecycle.ServiceStateLogic$ComputeServiceState.onEvent(ServiceStateLogic.java:288) ~[brooklyn-core-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
    at org.apache.brooklyn.policy.ha.ServiceFailureDetector.onEvent(ServiceFailureDetector.java:159) ~[brooklyn-policy-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
    at org.apache.brooklyn.core.entity.lifecycle.ServiceStateLogic$ComputeServiceState.setEntity(ServiceStateLogic.java:274) ~[brooklyn-core-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
    at org.apache.brooklyn.core.entity.AbstractEntity$BasicEnricherSupport.add(AbstractEntity.java:1788) ~[brooklyn-core-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
    at org.apache.brooklyn.core.mgmt.rebind.BasicEntityRebindSupport.addEnrichers(BasicEntityRebindSupport.java:145) [brooklyn-core-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
    at org.apache.brooklyn.core.mgmt.rebind.BasicEntityRebindSupport.addEnrichers(BasicEntityRebindSupport.java:47) [brooklyn-core-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
    at org.apache.brooklyn.core.mgmt.rebind.RebindIteration.associateAdjunctsWithEntities(RebindIteration.java:650) [brooklyn-core-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
    at org.apache.brooklyn.core.mgmt.rebind.RebindIteration.doRun(RebindIteration.java:244) [brooklyn-core-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
    at org.apache.brooklyn.core.mgmt.rebind.InitialFullRebindIteration.doRun(InitialFullRebindIteration.java:69) [brooklyn-core-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
    at org.apache.brooklyn.core.mgmt.rebind.RebindIteration.run(RebindIteration.java:266) [brooklyn-core-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
    at org.apache.brooklyn.core.mgmt.rebind.RebindManagerImpl.rebindImpl(RebindManagerImpl.java:558) [brooklyn-core-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
    at org.apache.brooklyn.core.mgmt.rebind.RebindManagerImpl$3.call(RebindManagerImpl.java:508) [brooklyn-core-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
    at org.apache.brooklyn.core.mgmt.rebind.RebindManagerImpl$3.call(RebindManagerImpl.java:506) [brooklyn-core-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
    at org.apache.brooklyn.util.core.task.BasicExecutionManager$SubmissionCallable.call(BasicExecutionManager.java:519) [brooklyn-core-0.10.0-20160907.0931.jar:0.10.0-20160907.0931]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_95]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_95]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_95]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_95]
The line throwing the NPE is executing setEntityOnFireTime = now + getConfig(SERVICE_ON_FIRE_STABILIZATION_DELAY).toMilliseconds(). That suggests we got null for the config value, even though there is a default value of zero for it.
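One way a lookup could return null despite a declared default is sketched below. This is only an illustration of the hypothesis, not Brooklyn's actual config resolution: the class ConfigLookupSketch, its lookup method and the use of java.time.Duration and java.util.function.Supplier are all stand-ins invented for the example. The idea is that a default is normally consulted only when nothing is stored for the key, so a stored supplier that resolves to null would bypass the default of zero.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a config bag; NOT Brooklyn's real API.
class ConfigLookupSketch {

    // Returns the default only when no value is stored for the key.
    // A stored Supplier is resolved, and its result is returned even if it is null.
    static Duration lookup(Map<String, Object> stored, String key, Duration defaultValue) {
        if (!stored.containsKey(key)) {
            return defaultValue;
        }
        Object raw = stored.get(key);
        if (raw instanceof Supplier) {
            return (Duration) ((Supplier<?>) raw).get();
        }
        return (Duration) raw;
    }

    // Mirrors the shape of the failing expression: now + delay.toMilliseconds().
    static boolean computeThrowsNpe(Duration delay) {
        try {
            long ignored = 1000L + delay.toMillis();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new HashMap<>();
        String key = "serviceOnFire.stabilizationDelay";

        // Nothing stored: the declared default (zero) is used.
        System.out.println(lookup(stored, key, Duration.ZERO));

        // A stored supplier that cannot resolve (assumed here to yield null).
        stored.put(key, (Supplier<Duration>) () -> null);
        Duration delay = lookup(stored, key, Duration.ZERO);
        System.out.println(delay);
        System.out.println(computeThrowsNpe(delay));
    }
}
```

Whether the DslConfigSupplier really resolved to null during rebind is not confirmed by the log; the sketch only shows that a default of zero on the key would not by itself prevent the NPE.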
The enricher persisted state contains:
<serviceOnFire.stabilizationDelay>
  <org.apache.brooklyn.camp.brooklyn.spi.dsl.methods.DslComponent_-DslConfigSupplier>
    <component>
      <componentId></componentId>
      <scope>THIS</scope>
    </component>
    <keyName>swarm.recovery.stabilizationDelay</keyName>
  </org.apache.brooklyn.camp.brooklyn.spi.dsl.methods.DslComponent_-DslConfigSupplier>
</serviceOnFire.stabilizationDelay>
And the owning entity has:
brooklyn.parameters:
  - name: swarm.recovery.stabilizationDelay
    label: Stabilization Delay
    description: |
      Time period for which the service must be consistently in the same state to trigger an action
    # A restart shouldn't trigger failure
    type: org.apache.brooklyn.util.time.Duration
    default: 5m
There are a few things we should think about fixing:
1. Guard against the NPE in ServiceFailureDetector.setActualState (i.e. handle the case where the config returns null), though it is unclear what value it should then default to.
2. Avoid calling this code on rebind. For example, instead of ComputeServiceState.setEntity immediately calling onEvent(null), it could subscribe with "notifyOfInitialValue" so that it gets a callback (in the right thread, at the right time).
3. Investigate further why the config lookup returned null, e.g. is it because the entity wasn't fully initialised, or because the DSL didn't find the default value defined in brooklyn.parameters?
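Option 1 could look roughly like the sketch below. It is a minimal illustration, not the actual fix: the class StabilizationDelayGuard and its methods are hypothetical, java.time.Duration stands in for Brooklyn's own Duration type, and falling back to zero is an assumption that simply mirrors the key's declared default (as noted above, the right fallback is unclear).

```java
import java.time.Duration;

// Hypothetical helper; names and the zero fallback are assumptions for illustration.
class StabilizationDelayGuard {

    // Uses the fallback when the configured value is missing or unresolved.
    static long delayMillis(Duration configured, Duration fallback) {
        Duration effective = (configured != null) ? configured : fallback;
        return effective.toMillis();
    }

    // Null-safe equivalent of: now + getConfig(...).toMilliseconds()
    static long onFireTime(long now, Duration configured) {
        return now + delayMillis(configured, Duration.ZERO);
    }

    public static void main(String[] args) {
        System.out.println(onFireTime(1_000L, null));                  // prints 1000
        System.out.println(onFireTime(1_000L, Duration.ofMinutes(5))); // prints 301000
    }
}
```

A guard like this would only hide the symptom. If the lookup returns null because the entity is not yet initialised during rebind, options 2 and 3 are still needed to make the configured 5m delay take effect.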