Uploaded image for project: 'Airavata'
  1. Airavata
  2. AIRAVATA-1651

Zookeeper connection lost error; Experiment failed

    XMLWordPrintableJSON

Details

    Description

      Two experiment has the same error message in log
      One experiment got FAILED at experiment level and no job status recorded.
      Other Experiment failed but the job got COMPLETE. Randomely occurs. was unable to recreate

      error messages retrived from log;
      2015-03-26 09:33:34,693 [main-SendThread(gw127.iu.xsede.org:9181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server gw127.iu.xsede.org/149.165.228.125:9181
      ...skipping...
      org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /gfac-experiments/gfac-node0/SLM-WRF-Stampede_c0697813-a8f4-4d8a-b0f3-6808f8538b18+IDontNeedaNode_a3b6133f-f8af-435d-9b2a-76838db535f6/org.apache.airavata.gfac.gsissh.handler.GSISSHInputHandler/state
      at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
      at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
      at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1228)
      at org.apache.airavata.gfac.core.utils.GFacUtils.updatePluginState(GFacUtils.java:1013)
      at org.apache.airavata.gfac.core.cpi.BetterGfacImpl.invokeInFlowHandlers(BetterGfacImpl.java:902)
      at org.apache.airavata.gfac.core.cpi.BetterGfacImpl.launch(BetterGfacImpl.java:690)
      at org.apache.airavata.gfac.core.cpi.BetterGfacImpl.submitJob(BetterGfacImpl.java:481)
      at org.apache.airavata.gfac.core.cpi.BetterGfacImpl.submitJob(BetterGfacImpl.java:210)
      at org.apache.airavata.gfac.core.utils.InputHandlerWorker.call(InputHandlerWorker.java:49)
      at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:745)

      and

      aused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /gfac-experiments/gfac-node0/SLM-Trinity-Stampede_0bd73a38-6931-498f-af7b-d700dc177c43+IDontNeedaNode_db287294-796d-43c1-896d-e3b412b4c8a7/org.apache.airavata.gfac.ssh.handler.AdvancedSCPOutputHandler
      at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
      at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
      at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1003)
      at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1031)
      at org.apache.airavata.gfac.core.utils.GFacUtils.createPluginZnode(GFacUtils.java:935)
      at org.apache.airavata.gfac.core.cpi.BetterGfacImpl.invokeOutFlowHandlers(BetterGfacImpl.java:939)

      Attachments

        Activity

          People

            shameera Shameera
            eroma_a Eroma
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: