Details
Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Description
Scenario:
The application had 7 operators. Operators 2 and 3 are connected using THREAD_LOCAL.
Observations:
- After launch, the physical plan shows operator ids 1, 2 and 4, 5, 6, 7, 8.
- The operator that has id=3 in its deploy request is not recognized by stram. Stram shows id=8 for this operator.
- When stram receives a heartbeat from operator 3, it cannot recognize the operator and logs the following message:
INFO com.datatorrent.stram.StreamingContainerManager: Heartbeat for unknown operator 3
- The operator with id=3 then receives an undeploy request signal.
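The symptom in the observations above can be modelled with a small sketch. This is not the actual StreamingContainerManager code: the class PlanRegistry and its methods are hypothetical, and the reassign step is only an assumed way for the id to change, since the report does not state the cause. It shows why a heartbeat carrying the old id no longer matches once the plan lists the operator under a new id.

```python
# Hypothetical model of the symptom; NOT the real stram implementation.
class PlanRegistry:
    def __init__(self):
        # operator id -> operator name, as the manager's plan sees it
        self.operators = {}

    def register(self, op_id, name):
        self.operators[op_id] = name

    def reassign(self, old_id, new_id):
        # Assumption: the plan ends up listing the same operator under a
        # new id after the deploy request already went out with the old id.
        self.operators[new_id] = self.operators.pop(old_id)

    def on_heartbeat(self, op_id):
        name = self.operators.get(op_id)
        if name is None:
            return f"Heartbeat for unknown operator {op_id}"
        return f"Heartbeat from {name} (id={op_id})"


registry = PlanRegistry()
registry.register(2, "BlockReader")
registry.register(3, "BlockWriter")
registry.reassign(3, 8)          # plan now shows id=8 for BlockWriter
print(registry.on_heartbeat(3))  # the container still reports id=3
```

The container keeps reporting the id it was deployed with, so the lookup fails even though the operator itself is running.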
Relevant logs from Stram:
2015-10-09 04:31:01,765 INFO com.datatorrent.stram.StreamingContainerParent: child msg: [container_1443694550865_0150_01_000007] Entering heartbeat loop.. context: PTContainer[id=2(container_1443694550865_0150_01_000007),state=ALLOCATED,operators=[PTOperator[id=2,name=BlockReader], PTOperator[id=3,name=BlockWriter]]]
2015-10-09 04:31:02,779 INFO com.datatorrent.stram.StreamingContainerManager: Container container_1443694550865_0150_01_000007 buffer server: node32.morado.com:38536
2015-10-09 04:31:06,749 INFO com.datatorrent.stram.StreamingContainerManager: Heartbeat for unknown operator 3 (container container_1443694550865_0150_01_000007) context: PTContainer[id=2(container_1443694550865_0150_01_000007),state=ACTIVE,operators=[PTOperator[id=2,name=BlockReader], PTOperator[id=8,name=BlockWriter]]]
2015-10-09 04:31:41,091 INFO com.datatorrent.stram.StreamingAppMasterService: Completed containerId=container_1443694550865_0150_01_000007, state=COMPLETE, exitStatus=1, diagnostics=Exception from container-launch.
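The id change is visible by comparing the two PTContainer contexts in the logs. A minimal sketch for spotting it automatically is below; the helper names operators_by_name and changed_ids are made up for illustration, and the regular expression assumes the PTOperator[id=...,name=...] format seen in these log lines.

```python
import re

# Matches entries such as PTOperator[id=3,name=BlockWriter]
OPERATOR_RE = re.compile(r"PTOperator\[id=(\d+),name=(\w+)\]")


def operators_by_name(log_line):
    """Map operator name -> id for every PTOperator entry in a log line."""
    return {name: int(op_id) for op_id, name in OPERATOR_RE.findall(log_line)}


def changed_ids(before_line, after_line):
    """Return {name: (old_id, new_id)} for operators whose id differs."""
    before = operators_by_name(before_line)
    after = operators_by_name(after_line)
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }


# Context fragments copied from the two log entries above.
allocated = "operators=[PTOperator[id=2,name=BlockReader], PTOperator[id=3,name=BlockWriter]]]"
active = "operators=[PTOperator[id=2,name=BlockReader], PTOperator[id=8,name=BlockWriter]]]"

print(changed_ids(allocated, active))  # {'BlockWriter': (3, 8)}
```

Matching by operator name is a simplification that works here because each name appears once per container.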