Description
Currently, none of the experiments complete. Jobs on Comet fail with error [1]. Experiments that launch jobs on BR2 appear to be stuck in EXECUTING; no significant errors were found in the logs, apart from the errors in [2]. Stampede experiments also fail with an error similar to [1].
[1]
org.apache.airavata.gfac.core.GFacException: Error: userFriendly msg :Error while executing DATA_STAGING task, actual msg :expId: SLM1-CP2K-Stampede_9513859a-4b52-4d2c-8ae0-254abce28690, processId: PROCESS_dd45a568-9b3a-40b2-bd12-1a42b61c31a0, taskId: TASK_144e156d-5073-445f-abb7-74a9a57bd826, type: DATA_STAGING :- DATA_STAGING failed. Reason: Failed update experiment and process inputs and outputs
    at org.apache.airavata.gfac.impl.GFacEngineImpl.checkFailures(GFacEngineImpl.java:605)
    at org.apache.airavata.gfac.impl.GFacEngineImpl.inputDataStaging(GFacEngineImpl.java:586)
    at org.apache.airavata.gfac.impl.GFacEngineImpl.executeTaskListFrom(GFacEngineImpl.java:324)
    at org.apache.airavata.gfac.impl.GFacEngineImpl.executeProcess(GFacEngineImpl.java:263)
    at org.apache.airavata.gfac.impl.GFacWorker.executeProcess(GFacWorker.java:227)
    at org.apache.airavata.gfac.impl.GFacWorker.run(GFacWorker.java:86)
    at org.apache.airavata.common.logging.MDCUtil.lambda$wrapWithMDC$0(MDCUtil.java:21)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
[2]
2016-12-22 11:55:13 [pool-3-thread-4] ERROR o.a.a.gfac.server.GfacServerHandler - Error creating zookeeper nodes
2016-12-22 11:55:13 [pool-3-thread-4] INFO o.a.a.m.core.impl.ProcessConsumer - Message Received with message id 'LAUNCH.PROCESS-2b7ebd56-c012-44b3-bb7b-e16350164000 and with message type:LAUNCHPROCESS, for processId:PROCESS_a645ca16-df8b-4cbf-b054-233aef831f15, expId:SLM1-AmberSander-BR2_5b841cb6-9d82-4868-a322-4e1547205ed8
2016-12-22 11:55:13 [pool-3-thread-4] INFO o.a.a.gfac.server.GfacServerHandler - Message Received with message id LAUNCHPROCESS and with message type: {}LAUNCH.PROCESS-2b7ebd56-c012-44b3-bb7b-e16350164000
2016-12-22 11:55:14 [pool-3-thread-4] INFO o.a.a.gfac.impl.GFacEngineImpl - expId: SLM1-AmberSander-BR2_5b841cb6-9d82-4868-a322-4e1547205ed8, processId: PROCESS_a645ca16-df8b-4cbf-b054-233aef831f15, get process cancel data from zookeeper node /experiments/SLM1-AmberSander-BR2_5b841cb6-9d82-4868-a322-4e1547205ed8/PROCESS_a645ca16-df8b-4cbf-b054-233aef831f15/cancelListener
2016-12-22 11:55:15 [pool-3-thread-4] ERROR o.a.a.gfac.server.GfacServerHandler - Error creating zookeeper nodes