  1. Airavata
  2. AIRAVATA-2639

Experiment in EXECUTING state without progressing due to a connection loss exception in GfacServerHandler


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.18
    • Fix Version/s: None
    • Component/s: GFac
    • Labels: None

    Description

      In a sequential submission of 50 test jobs, one experiment stays in EXECUTING without moving forward.
      Messages in the log indicate a ZooKeeper connection loss.
      Log messages for this experiment from the gfac log:

      2018-01-10 16:36:18,610 [pool-3-thread-5] INFO o.a.a.m.c.impl.ProcessConsumer - Message Received with message id 'LAUNCH.PROCESS-fcc07285-e325-4c08-b31c-7347f4103efb and with message type:LAUNCHPROCESS, for processId:PROCESS_3eb71ab2-9c86-4e54-a529-9bbc51eecd5e, expId:SLM003-QEspresso-JS20_05520f07-54c6-4695-b164-a6e7a987777f
      2018-01-10 16:36:45,557 [pool-3-thread-5] ERROR o.a.a.g.s.GfacServerHandler experiment_id=SLM003-QEspresso-JS20_05520f07-54c6-4695-b164-a6e7a987777f, gateway_id=seagrid - KeeperErrorCode = ConnectionLoss for /experiments/SLM003-QEspresso-JS20_05520f07-54c6-4695-b164-a6e7a987777f/PROCESS_3eb71ab2-9c86-4e54-a529-9bbc51eecd5e
      org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /experiments/SLM003-QEspresso-JS20_05520f07-54c6-4695-b164-a6e7a987777f/PROCESS_3eb71ab2-9c86-4e54-a529-9bbc51eecd5e
      at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
      at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
      at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:778)
      at org.apache.curator.utils.ZKPaths.mkdirs(ZKPaths.java:232)
      at org.apache.curator.utils.ZKPaths.mkdirs(ZKPaths.java:164)
      at org.apache.airavata.gfac.server.GfacServerHandler.createProcessZKNode(GfacServerHandler.java:350)
      at org.apache.airavata.gfac.server.GfacServerHandler.access$500(GfacServerHandler.java:76)
      at org.apache.airavata.gfac.server.GfacServerHandler$ProcessLaunchMessageHandler.onMessage(GfacServerHandler.java:233)
      at org.apache.airavata.messaging.core.impl.ProcessConsumer.handleDelivery(ProcessConsumer.java:81)
      at com.rabbitmq.client.impl.ConsumerDispatcher$5.run(ConsumerDispatcher.java:144)
      at com.rabbitmq.client.impl.ConsumerWorkService$WorkPoolRunnable.run(ConsumerWorkService.java:99)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:748)
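
The trace above shows `createProcessZKNode` reaching `ZooKeeper.create` directly through `ZKPaths.mkdirs`, so a single transient ConnectionLoss is enough to abort the launch. One possible mitigation, offered only as a sketch and not as the project's actual fix, is to retry the node creation a bounded number of times with backoff. The class `ConnectionLossRetry`, the method `retryOnConnectionLoss`, and the exception `TransientConnectionLoss` are hypothetical names introduced for illustration; the exception stands in for `KeeperException.ConnectionLossException` so the example compiles without ZooKeeper on the classpath.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Airavata, Curator, or ZooKeeper.
final class ConnectionLossRetry {

    // Stand-in (assumption) for KeeperException.ConnectionLossException.
    static class TransientConnectionLoss extends Exception {
        TransientConnectionLoss(String message) {
            super(message);
        }
    }

    // Runs op, retrying only when it fails with TransientConnectionLoss.
    // Sleeps baseSleepMs, 2*baseSleepMs, 4*baseSleepMs, ... between attempts.
    // Any other exception propagates immediately; after maxAttempts failures
    // the last TransientConnectionLoss is rethrown.
    static <T> T retryOnConnectionLoss(Callable<T> op, int maxAttempts, long baseSleepMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        TransientConnectionLoss last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (TransientConnectionLoss e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Exponential backoff before the next attempt.
                    Thread.sleep(baseSleepMs << (attempt - 1));
                }
            }
        }
        throw last;
    }
}
```

In the handler, the operation passed in would be the ZooKeeper node creation, and a failure after the last attempt would still need to be reported so the experiment does not remain in EXECUTING. Curator also ships retry policies such as `ExponentialBackoffRetry` for calls made through its own client API, which may be a better fit than a hand-written loop.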

          People

            shameera Shameera
            eroma_a Eroma