CloudStack / CLOUDSTACK-5499

VMware: When NFS was down for about 12 hours and then brought back up, snapshots are not attempted for some of the volumes that have snapshots in the "CreatedOnPrimary" state.

Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 4.3.0
    • Fix Version/s: Future
    • Component/s: Management Server
    • Security Level: Public (Anyone can view this level - this is the default.)
    • Labels: None
    • Environment: Build from 4.3

    Description

      VMware: When NFS was down for about 12 hours and then brought back up, snapshots are not attempted for some of the volumes that have snapshots in the "CreatedOnPrimary" state.

      Setup:
      Advanced zone with two ESXi 5.1 hosts.

      Steps to reproduce the problem:

      1. Deploy 5 VMs on each of the hosts, so we start with 11 VMs.
      2. Start concurrent snapshots for the ROOT volumes of all the VMs.
      3. Shut down the secondary storage server while the snapshots are in progress.
      4. Bring the secondary storage server back up after 12 hours.

      The following issues were seen in this run:

      1. Snapshots that are in progress report failures only after 12 hours, even though backup.snapshot.wait is set to 12 hours.

      2. New snapshot requests that were executed while the NFS server was down do not report failure immediately. In my case, such requests eventually succeeded once the NFS server was brought back up. Is this the expected behavior? Should these requests not fail right away, instead of being held as active sessions?

      3. Some of the snapshot failures left snapshots in the "CreatedOnPrimary" state. For such volumes, snapshots are not attempted at all, even after the NFS server was brought back up.

      Volumes in this state are 16, 17, 18 and 22.

      There are also instances where snapshots were scheduled and succeeded even though the previous snapshot was in the "CreatedOnPrimary" state. Why are we able to schedule snapshots in those cases, but not in others?

      mysql> select volume_id,status,created from snapshots where volume_id=18;
      +-----------+------------------+---------------------+
      | volume_id | status           | created             |
      +-----------+------------------+---------------------+
      |        18 | Destroyed        | 2013-12-12 23:24:14 |
      |        18 | CreatedOnPrimary | 2013-12-12 23:53:39 |
      |        18 | BackedUp         | 2013-12-13 01:53:38 |
      |        18 | CreatedOnPrimary | 2013-12-13 03:53:38 |
      +-----------+------------------+---------------------+
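
      The history of volume 18 can be read as a sequence of status transitions, which makes the inconsistency easier to see: one "CreatedOnPrimary" snapshot was followed by a successful backup, while the latest one was followed by nothing. The sketch below is only an illustration; the helper names `transitions` and `recovered_after` are hypothetical and not part of CloudStack, and the rows are copied from the query output above.

```python
# Snapshot history for volume 18, copied from the query output in this ticket.
# Each entry is (status, created).
HISTORY_VOLUME_18 = [
    ("Destroyed", "2013-12-12 23:24:14"),
    ("CreatedOnPrimary", "2013-12-12 23:53:39"),
    ("BackedUp", "2013-12-13 01:53:38"),
    ("CreatedOnPrimary", "2013-12-13 03:53:38"),
]


def transitions(history):
    """Return (previous_status, next_status) pairs in chronological order."""
    # ISO-style timestamps sort correctly as plain strings.
    ordered = sorted(history, key=lambda row: row[1])
    statuses = [status for status, _ in ordered]
    return list(zip(statuses, statuses[1:]))


def recovered_after(history, state="CreatedOnPrimary"):
    """True if a snapshot in `state` was immediately followed by a BackedUp one."""
    return (state, "BackedUp") in transitions(history)


for previous, following in transitions(HISTORY_VOLUME_18):
    print(f"{previous} -> {following}")
print("recovered once:", recovered_after(HISTORY_VOLUME_18))
```

      For volume 18 this shows that a snapshot was scheduled after the first "CreatedOnPrimary" entry but not after the second, which is the behavior questioned above.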

      mysql> select volume_id,status,created from snapshots;
      +-----------+------------------+---------------------+
      | volume_id | status           | created             |
      +-----------+------------------+---------------------+
      |        22 | Destroyed        | 2013-12-12 23:24:13 |
      |        21 | Destroyed        | 2013-12-12 23:24:13 |
      |        20 | Destroyed        | 2013-12-12 23:24:14 |
      |        19 | Destroyed        | 2013-12-12 23:24:14 |
      |        18 | Destroyed        | 2013-12-12 23:24:14 |
      |        17 | Destroyed        | 2013-12-12 23:24:14 |
      |        16 | Destroyed        | 2013-12-12 23:24:14 |
      |        14 | Destroyed        | 2013-12-12 23:24:15 |
      |        25 | Destroyed        | 2013-12-12 23:24:15 |
      |        24 | Destroyed        | 2013-12-12 23:24:15 |
      |        23 | Destroyed        | 2013-12-12 23:24:15 |
      |        22 | CreatedOnPrimary | 2013-12-12 23:53:38 |
      |        21 | Destroyed        | 2013-12-12 23:53:38 |
      |        20 | Destroyed        | 2013-12-12 23:53:38 |
      |        19 | Destroyed        | 2013-12-12 23:53:39 |
      |        18 | CreatedOnPrimary | 2013-12-12 23:53:39 |
      |        17 | CreatedOnPrimary | 2013-12-12 23:53:40 |
      |        16 | CreatedOnPrimary | 2013-12-12 23:53:40 |
      |        14 | Destroyed        | 2013-12-12 23:53:40 |
      |        25 | Destroyed        | 2013-12-12 23:53:41 |
      |        24 | Destroyed        | 2013-12-12 23:53:41 |
      |        23 | Destroyed        | 2013-12-12 23:53:42 |
      |        21 | Destroyed        | 2013-12-13 00:53:37 |
      |        19 | Destroyed        | 2013-12-13 00:53:38 |
      |        22 | BackedUp         | 2013-12-13 01:53:37 |
      |        21 | Destroyed        | 2013-12-13 01:53:38 |
      |        20 | Destroyed        | 2013-12-13 01:53:38 |
      |        19 | Destroyed        | 2013-12-13 01:53:38 |
      |        18 | BackedUp         | 2013-12-13 01:53:38 |
      |        17 | BackedUp         | 2013-12-13 01:53:38 |
      |        16 | BackedUp         | 2013-12-13 01:53:39 |
      |        14 | Destroyed        | 2013-12-13 01:53:39 |
      |        25 | Destroyed        | 2013-12-13 01:53:39 |
      |        24 | Destroyed        | 2013-12-13 01:53:39 |
      |        23 | Destroyed        | 2013-12-13 01:53:40 |
      |        22 | CreatedOnPrimary | 2013-12-13 03:53:37 |
      |        21 | Destroyed        | 2013-12-13 03:53:38 |
      |        20 | Destroyed        | 2013-12-13 03:53:38 |
      |        19 | Destroyed        | 2013-12-13 03:53:38 |
      |        18 | CreatedOnPrimary | 2013-12-13 03:53:38 |
      |        17 | CreatedOnPrimary | 2013-12-13 03:53:38 |
      |        16 | CreatedOnPrimary | 2013-12-13 03:53:39 |
      |        14 | Destroyed        | 2013-12-13 03:53:39 |
      |        24 | Destroyed        | 2013-12-13 08:53:37 |
      |        25 | Destroyed        | 2013-12-13 09:53:37 |
      |        23 | Destroyed        | 2013-12-13 10:53:37 |
      |        21 | Destroyed        | 2013-12-13 16:53:37 |
      |        20 | Destroyed        | 2013-12-13 16:53:38 |
      |        19 | Destroyed        | 2013-12-13 16:53:38 |
      |        14 | Destroyed        | 2013-12-13 16:53:38 |
      |        21 | BackedUp         | 2013-12-13 18:53:37 |
      |        20 | BackedUp         | 2013-12-13 18:53:38 |
      |        19 | BackedUp         | 2013-12-13 18:53:38 |
      |        14 | BackedUp         | 2013-12-13 18:53:38 |
      |        25 | BackedUp         | 2013-12-13 18:53:38 |
      |        24 | BackedUp         | 2013-12-13 18:53:38 |
      |        23 | BackedUp         | 2013-12-13 18:53:39 |
      |        21 | BackedUp         | 2013-12-13 19:53:37 |
      |        20 | BackedUp         | 2013-12-13 19:53:38 |
      |        19 | BackedUp         | 2013-12-13 19:53:38 |
      |        14 | BackedUp         | 2013-12-13 19:53:38 |
      |        25 | BackedUp         | 2013-12-13 19:53:38 |
      |        24 | BackedUp         | 2013-12-13 19:53:39 |
      |        23 | BackedUp         | 2013-12-13 19:53:39 |
      +-----------+------------------+---------------------+
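
      To pick the affected volumes out of a listing like this, one option is to query for the most recent snapshot per volume and keep those whose latest status is "CreatedOnPrimary". The sketch below is a rough illustration under stated assumptions: it uses an in-memory SQLite table as a stand-in for the management server database, models only the three columns shown in this ticket, and loads only a subset of the rows (the four affected volumes plus volume 21 for comparison). The names `load` and `stuck_volumes` are hypothetical.

```python
import sqlite3

# Subset of the rows listed in this ticket: (volume_id, status, created).
ROWS = [
    (22, "Destroyed", "2013-12-12 23:24:13"),
    (22, "CreatedOnPrimary", "2013-12-12 23:53:38"),
    (22, "BackedUp", "2013-12-13 01:53:37"),
    (22, "CreatedOnPrimary", "2013-12-13 03:53:37"),
    (18, "Destroyed", "2013-12-12 23:24:14"),
    (18, "CreatedOnPrimary", "2013-12-12 23:53:39"),
    (18, "BackedUp", "2013-12-13 01:53:38"),
    (18, "CreatedOnPrimary", "2013-12-13 03:53:38"),
    (17, "Destroyed", "2013-12-12 23:24:14"),
    (17, "CreatedOnPrimary", "2013-12-12 23:53:40"),
    (17, "BackedUp", "2013-12-13 01:53:38"),
    (17, "CreatedOnPrimary", "2013-12-13 03:53:38"),
    (16, "Destroyed", "2013-12-12 23:24:14"),
    (16, "CreatedOnPrimary", "2013-12-12 23:53:40"),
    (16, "BackedUp", "2013-12-13 01:53:39"),
    (16, "CreatedOnPrimary", "2013-12-13 03:53:39"),
    (21, "Destroyed", "2013-12-12 23:24:13"),
    (21, "Destroyed", "2013-12-12 23:53:38"),
    (21, "Destroyed", "2013-12-13 00:53:37"),
    (21, "Destroyed", "2013-12-13 01:53:38"),
    (21, "Destroyed", "2013-12-13 03:53:38"),
    (21, "Destroyed", "2013-12-13 16:53:37"),
    (21, "BackedUp", "2013-12-13 18:53:37"),
    (21, "BackedUp", "2013-12-13 19:53:37"),
]

# Latest snapshot row per volume: join each row against the per-volume
# maximum of `created`. Plain SQL, no window functions required.
LATEST_STATUS_SQL = """
    SELECT s.volume_id, s.status, s.created
    FROM snapshots s
    JOIN (SELECT volume_id, MAX(created) AS created
          FROM snapshots
          GROUP BY volume_id) latest
      ON latest.volume_id = s.volume_id AND latest.created = s.created
    ORDER BY s.volume_id
"""


def load(rows):
    """Create an in-memory table holding only the three columns shown above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE snapshots (volume_id INTEGER, status TEXT, created TEXT)"
    )
    conn.executemany("INSERT INTO snapshots VALUES (?, ?, ?)", rows)
    return conn


def stuck_volumes(conn):
    """Volume ids whose most recent snapshot is still CreatedOnPrimary."""
    return [
        volume_id
        for volume_id, status, _created in conn.execute(LATEST_STATUS_SQL)
        if status == "CreatedOnPrimary"
    ]


print(stuck_volumes(load(ROWS)))  # prints [16, 17, 18, 22]
```

      On this subset the result matches the volumes named in the description, while volume 21 is excluded because its latest snapshot is "BackedUp".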

      Attachments

        1. nfs12down.rar
          2.07 MB
          Sangeetha Hariharan

          People

            Assignee: Sateesh Chodapuneedi (sateeshc)
            Reporter: Sangeetha Hariharan (sangeethah)