  1. jclouds
  2. JCLOUDS-1329

Azure ARM extraneous resources are not cleaned up on node deletion

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.2
    • Fix Version/s: 2.1.0, 2.0.3
    • Component/s: None
    • Labels:
      None

      Description

      This is because doDestroyNode returns null once a node has been deleted, so there is no node metadata to use to clean up the extraneous resources.
      This is particularly problematic on Azure ARM, as the resource limits are quite low by default for some of these resources, so you can quickly get into a state where you can't deploy nodes.
      I've created a PR based on equivalent GCE compute code to fix this.
      https://github.com/jclouds/jclouds-labs/pull/409

        Activity

        andreaturli Andrea Turli added a comment - - edited

        Thanks Duncan Grant

        Reading https://github.com/jclouds/jclouds-labs/blob/master/azurecompute-arm/src/main/java/org/jclouds/azurecompute/arm/compute/AzureComputeServiceAdapter.java#L330 I'd have expected `cleanupResources.cleanupNode(id)` to be able to delete the VM and most of the associated resources.
        I'd then have expected `AdaptingComputeServiceStrategies.destroyNode` to collect the NodeMetadata before the deletion and return it for use in cleanUpIncidentalResourcesOfDeadNodes.

        When are you seeing this problem, during normal flow or in exceptional situations?

        duncanjg Duncan Grant added a comment -

        Andrea Turli I see this during normal flow.
        I think you are correct that `AdaptingComputeServiceStrategies.destroyNode` will return the node metadata, but the metadata gets lost during the retry step in `BaseComputeService.doDestroyNode`. This happens in the nodeTerminated.apply(node) call, which sets the metadata in the "node" AtomicReference to the metadata returned from the API, which happens to be null. I really struggle with the Guice injection stuff, so I can't seem to find the exact bit of code where this happens. Let me know if you want to screen share and we could step through with a debugger.
        Thanks
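
        To illustrate the behaviour described above, here is a minimal, self-contained sketch. It is not jclouds code: `Node`, `terminated`, `destroyLosingMetadata` and `destroyKeepingMetadata` are hypothetical stand-ins, and the predicate is only an approximation of a "refresh and check" predicate that overwrites the AtomicReference with whatever the API returns. It assumes the API returns null for a node that has already been deleted.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;
import java.util.function.Predicate;

public class DestroyNodeSketch {

    // Hypothetical stand-in for NodeMetadata; not the jclouds type.
    static final class Node {
        final String id;
        final String group;

        Node(String id, String group) {
            this.id = id;
            this.group = group;
        }
    }

    // Approximation of a refresh-and-check predicate: it re-reads the node
    // from the API, stores the result back in the reference, and reports
    // whether the node is gone.
    static Predicate<AtomicReference<Node>> terminated(Function<String, Node> api) {
        return ref -> {
            Node current = ref.get();
            if (current == null) {
                return true;
            }
            Node refreshed = api.apply(current.id);
            ref.set(refreshed); // the original metadata is overwritten here
            return refreshed == null;
        };
    }

    // Reads the metadata back from the reference after the check: it is null.
    static Node destroyLosingMetadata(Node node, Function<String, Node> api) {
        AtomicReference<Node> ref = new AtomicReference<>(node);
        terminated(api).test(ref);
        return ref.get();
    }

    // Keeps the metadata captured before the check, so cleanup can still use it.
    static Node destroyKeepingMetadata(Node node, Function<String, Node> api) {
        AtomicReference<Node> ref = new AtomicReference<>(node);
        terminated(api).test(ref);
        return node;
    }

    public static void main(String[] args) {
        Function<String, Node> api = id -> null; // assumed: deleted nodes come back as null
        Node node = new Node("vm-1", "group-a");
        System.out.println(destroyLosingMetadata(node, api) == null);  // metadata lost
        System.out.println(destroyKeepingMetadata(node, api) == node); // metadata kept
    }
}
```

        The second variant shows the general idea of holding on to the metadata that was collected before deletion instead of reading it back from the reference; the actual change in the PR may differ.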

        duncanjg Duncan Grant added a comment -

        I've debugged through and it happens here: https://github.com/duncangrant/jclouds/blob/9cdd53b0b7a87fa26a77b9ce370882f2a9cc7d71/compute/src/main/java/org/jclouds/compute/predicates/internal/TrueIfNullOrDeletedRefreshAndDoubleCheckOnFalse.java#L47

        andreaturli Andrea Turli added a comment -

        Interesting point about the retry step in `BaseComputeService.doDestroyNode`, I think this is a smell of a problem with the core if ARM and GCE needed to "fix" the default behavior, but something for another issue, cc Ignasi Barrera

        duncanjg Duncan Grant added a comment -

        Fixed by https://github.com/jclouds/jclouds-labs/pull/409

          People

          • Assignee:
            Unassigned
            Reporter:
            duncanjg Duncan Grant