Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
0.18
None
None
Description
Issue: The Gromacs working directory was not archived on Comet.
A user ran several jobs that each ran for a week or more (two ran for 8 days and the other two for 7 days). At the end of each job, all the defined output files were transferred, but the working directory was not archived and transferred back. GFAC log for the archive task: [1]
Clone_of_E-258-98-46-Graphene-35a_31c61f89-b3c6-4d20-b7cb-d16b214bd225
The same application, in a run that lasted 3 days, did transfer the ARCHIVE. GFAC log for the archive task: [2]
E-295-110-70-27_6e955784-8506-4050-8488-058ecf510342
It seems that GFAC did not execute the archive command in the first scenario.
[1]
2017-09-10 03:53:52,262 [pool-11-thread-68] INFO o.a.a.g.c.context.TaskContext - expId: Clone_of_E-258-98-46-Graphene-35a_31c61f89-b3c6-4d20-b7cb-d16b214bd225, processId: PROCESS_ed7717f6-c6b5-4853-825f-46ff47d11286, taskId: TASK_cdbdf58d-1fc4-48c2-a9e7-bd0b2fe37d98, type: DATA_STAGING : Task status changed EXECUTING -> COMPLETED
2017-09-10 03:53:52,266 [pool-11-thread-68] INFO o.a.a.g.c.c.ProcessContext - expId: Clone_of_E-258-98-46-Graphene-35a_31c61f89-b3c6-4d20-b7cb-d16b214bd225, processId: PROCESS_ed7717f6-c6b5-4853-825f-46ff47d11286 :- Process status changed OUTPUT_DATA_STAGING -> OUTPUT_DATA_STAGING
2017-09-10 03:53:52,271 [pool-11-thread-68] INFO o.a.a.g.c.context.TaskContext - expId: Clone_of_E-258-98-46-Graphene-35a_31c61f89-b3c6-4d20-b7cb-d16b214bd225, processId: PROCESS_ed7717f6-c6b5-4853-825f-46ff47d11286, taskId: TASK_eeb81f24-3843-4e63-a19c-7ecef8ad8627, type: DATA_STAGING : Task status changed CREATED -> EXECUTING
2017-09-10 03:53:52,275 [pool-11-thread-68] INFO o.a.a.g.c.context.TaskContext - expId: Clone_of_E-258-98-46-Graphene-35a_31c61f89-b3c6-4d20-b7cb-d16b214bd225, processId: PROCESS_ed7717f6-c6b5-4853-825f-46ff47d11286, taskId: TASK_eeb81f24-3843-4e63-a19c-7ecef8ad8627, type: DATA_STAGING : Task status changed EXECUTING -> COMPLETED
2017-09-10 03:53:52,279 [pool-11-thread-68] INFO o.a.a.g.c.c.ProcessContext - expId: Clone_of_E-258-98-46-Graphene-35a_31c61f89-b3c6-4d20-b7cb-d16b214bd225, processId: PROCESS_ed7717f6-c6b5-4853-825f-46ff47d11286 :- Process status changed OUTPUT_DATA_STAGING -> COMPLETED
[2]
2017-09-01 16:49:27,673 [pool-11-thread-104] INFO o.a.a.g.c.c.ProcessContext - expId: E-295-110-70-27_6e955784-8506-4050-8488-058ecf510342, processId: PROCESS_e87e6fe9-9b66-474b-b791-af56f86bd7bc :- Process status changed OUTPUT_DATA_STAGING -> OUTPUT_DATA_STAGING
2017-09-01 16:49:27,686 [pool-11-thread-104] INFO o.a.a.g.c.context.TaskContext - expId: E-295-110-70-27_6e955784-8506-4050-8488-058ecf510342, processId: PROCESS_e87e6fe9-9b66-474b-b791-af56f86bd7bc, taskId: TASK_ae760886-9358-4375-ae76-d9075dfb3804, type: DATA_STAGING : Task status changed CREATED -> EXECUTING
2017-09-01 16:49:27,697 [pool-11-thread-104] INFO o.a.airavata.gfac.impl.Factory - SSH Session validation succeeded, key :gridchem_comet.sdsc.edu_22_3d65bf6d-2c9f-4166-a51b-e76e0022bd3b
2017-09-01 16:49:28,220 [pool-11-thread-104] INFO o.a.airavata.gfac.impl.Factory - Channel creation test succeeded, key :gridchem_comet.sdsc.edu_22_3d65bf6d-2c9f-4166-a51b-e76e0022bd3b
2017-09-01 16:49:28,220 [pool-11-thread-104] INFO o.a.airavata.gfac.impl.Factory - Reuse SSH session for :gridchem_comet.sdsc.edu_22_3d65bf6d-2c9f-4166-a51b-e76e0022bd3b
2017-09-01 16:49:28,223 [pool-11-thread-104] INFO o.a.airavata.gfac.impl.Factory - SSH Session validation succeeded, key :pga_gf4.ucs.indiana.edu_22_3d65bf6d-2c9f-4166-a51b-e76e0022bd3b
2017-09-01 16:49:28,236 [pool-11-thread-104] INFO o.a.airavata.gfac.impl.Factory - Channel creation test succeeded, key :pga_gf4.ucs.indiana.edu_22_3d65bf6d-2c9f-4166-a51b-e76e0022bd3b
2017-09-01 16:49:28,236 [pool-11-thread-104] INFO o.a.airavata.gfac.impl.Factory - Reuse SSH session for :pga_gf4.ucs.indiana.edu_22_3d65bf6d-2c9f-4166-a51b-e76e0022bd3b
2017-09-01 16:49:28,236 [pool-11-thread-104] INFO o.a.airavata.gfac.impl.Factory - SSH Session validation succeeded, key :gridchem_comet.sdsc.edu_22_3d65bf6d-2c9f-4166-a51b-e76e0022bd3b
2017-09-01 16:49:28,565 [pool-11-thread-104] INFO o.a.airavata.gfac.impl.Factory - Channel creation test succeeded, key :gridchem_comet.sdsc.edu_22_3d65bf6d-2c9f-4166-a51b-e76e0022bd3b
2017-09-01 16:49:28,565 [pool-11-thread-104] INFO o.a.airavata.gfac.impl.Factory - Reuse SSH session for :gridchem_comet.sdsc.edu_22_3d65bf6d-2c9f-4166-a51b-e76e0022bd3b
2017-09-01 16:49:28,617 [pool-11-thread-104] INFO o.a.a.g.impl.HPCRemoteCluster - Executing command cd /oasis/scratch/comet/gridchem/temp_project/seagrid_workdirs/PROCESS_e87e6fe9-9b66-474b-b791-af56f86bd7bc && tar -cvf /oasis/scratch/comet/gridchem/temp_project/seagrid_workdirs/PROCESS_e87e6fe9-9b66-474b-b791-af56f86bd7bc/archive.tar ./*
2017-09-01 17:08:13,771 [pool-11-thread-104] INFO o.a.a.g.c.context.TaskContext - expId: E-295-110-70-27_6e955784-8506-4050-8488-058ecf510342, processId: PROCESS_e87e6fe9-9b66-474b-b791-af56f86bd7bc, taskId: TASK_ae760886-9358-4375-ae76-d9075dfb3804, type: DATA_STAGING : Task status changed EXECUTING -> COMPLETED
2017-09-01 17:08:13,801 [pool-11-thread-104] INFO o.a.a.g.c.c.ProcessContext - expId: E-295-110-70-27_6e955784-8506-4050-8488-058ecf510342, processId: PROCESS_e87e6fe9-9b66-474b-b791-af56f86bd7bc :- Process status changed OUTPUT_DATA_STAGING -> COMPLETED
2017-09-01 17:08:13,818 [pool-11-thread-104] INFO o.a.a.gfac.impl.GFacWorker - expId: E-295-110-70-27_6e955784-8506-4050-8488-058ecf510342, processId: PROCESS_e87e6fe9-9b66-474b-b791-af56f86bd7bc :- Sent ack for deliveryTag 130
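The difference between logs [1] and [2] can be checked mechanically. The following is a minimal Python sketch, not part of Airavata: the function name find_tasks_without_commands and the regular expressions are assumptions based only on the log lines quoted in this ticket. It flags DATA_STAGING tasks that go from CREATED to EXECUTING to COMPLETED on one thread with no "Executing command" line in between. Other staging tasks may legitimately log no command, so a flagged task is only a candidate for a skipped archive.

```python
import re

# Patterns derived only from the log excerpts in this ticket (assumed format).
THREAD_RE = re.compile(r"^\S+ \S+ \[([^\]]+)\] ")
STATUS_RE = re.compile(
    r"taskId: (TASK_[0-9a-f-]+), type: (\w+) : Task status changed (\w+) -> (\w+)"
)
COMMAND_MARKER = "Executing command "


def find_tasks_without_commands(log_lines):
    """Return taskIds of DATA_STAGING tasks that completed without any
    'Executing command' line logged on the same thread while they ran."""
    running = {}     # thread name -> [task id, number of commands seen]
    suspicious = []  # task ids that completed with zero commands

    for line in log_lines:
        thread_match = THREAD_RE.match(line)
        if not thread_match:
            continue
        thread = thread_match.group(1)

        status = STATUS_RE.search(line)
        if status:
            task_id, task_type, old, new = status.groups()
            if task_type != "DATA_STAGING":
                continue
            if (old, new) == ("CREATED", "EXECUTING"):
                running[thread] = [task_id, 0]
            elif (old, new) == ("EXECUTING", "COMPLETED"):
                # Tasks whose start is outside the excerpt are ignored.
                entry = running.pop(thread, None)
                if entry and entry[0] == task_id and entry[1] == 0:
                    suspicious.append(task_id)
        elif COMMAND_MARKER in line and thread in running:
            running[thread][1] += 1

    return suspicious
```

Run against the lines of log [1], this sketch would report TASK_eeb81f24-3843-4e63-a19c-7ecef8ad8627, which moved from EXECUTING to COMPLETED within 4 ms. Run against log [2], it would report nothing, because the tar command was logged while TASK_ae760886-9358-4375-ae76-d9075dfb3804 was executing.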
Attachments
Issue Links
- is related to AIRAVATA-1904 ARCHIVE did not happen in recovery (Open)