SystemDS / SYSTEMDS-2915

Lost GPUObjects memory leak

Details

    Description

      Context:

      • When a GPUObject (GO) is removed or evicted, its memory is either freed or its pointer is moved to a list of available allocations from which future GOs can be served.
      • The GPUMemoryManager (GMM) adds GOs to its list only in createGPUObject().
      • When memory is low, the GMM can evict or remove GOs.
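
The three points above can be illustrated with a toy model. This is only a sketch under stated assumptions, not SystemDS code: the class ToyGpuMemoryManager, the nested ToyGpuObject, the method names, the fake integer pointers, and the "evict the oldest object first" policy are all hypothetical and chosen for brevity.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: all names are hypothetical and no real GPU memory is touched.
class ToyGpuMemoryManager {
    /** Stand-in for a GO; pointer 0 means "no device memory attached". */
    static final class ToyGpuObject {
        final long size;
        long pointer;
        ToyGpuObject(long size) { this.size = size; }
    }

    private final long capacity;
    private long allocated = 0;   // bytes held by live or parked allocations
    private long nextPointer = 1; // fake device addresses
    final List<ToyGpuObject> tracked = new ArrayList<>();     // GOs the manager knows about
    final Map<Long, Deque<Long>> reusable = new HashMap<>();  // size -> parked pointers

    ToyGpuMemoryManager(long capacity) { this.capacity = capacity; }

    /** The only place where a GO enters the manager's list. */
    ToyGpuObject createGpuObject(long size) {
        ToyGpuObject go = new ToyGpuObject(size);
        go.pointer = allocate(size);
        tracked.add(go);
        return go;
    }

    /** Reuse a parked pointer of the same size, else allocate, evicting if memory is low. */
    long allocate(long size) {
        Deque<Long> parked = reusable.get(size);
        if (parked != null && !parked.isEmpty()) return parked.pop();
        while (allocated + size > capacity) {
            // Simplification: only tracked GOs are reclaimed, parked pointers are not.
            if (tracked.isEmpty()) throw new IllegalStateException("out of device memory");
            remove(tracked.get(0), false); // evict the oldest tracked GO and free its memory
        }
        allocated += size;
        return nextPointer++;
    }

    /** Remove a GO: either free its memory or park the pointer for reuse. */
    void remove(ToyGpuObject go, boolean keepForReuse) {
        if (!tracked.remove(go)) return; // unknown to the manager, nothing to do
        if (keepForReuse) reusable.computeIfAbsent(go.size, k -> new ArrayDeque<>()).push(go.pointer);
        else allocated -= go.size;
        go.pointer = 0;
    }

    public static void main(String[] args) {
        ToyGpuMemoryManager gmm = new ToyGpuMemoryManager(100);
        ToyGpuObject a = gmm.createGpuObject(60);
        long first = a.pointer;
        gmm.remove(a, true);                       // pointer is parked for reuse
        ToyGpuObject b = gmm.createGpuObject(60);  // served from the parked pointer
        System.out.println("reused parked pointer: " + (b.pointer == first));
        gmm.createGpuObject(60);                   // low memory: b is evicted and freed
        System.out.println("b evicted: " + (b.pointer == 0));
    }
}
```

Note that the removed object itself stays alive on the Java side after remove(); only its pointer is cleared. Whoever still holds a reference to it can keep using the object, which is the starting point of the fault described next.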

      Fault: When a GO is removed from the GMM, its corresponding MatrixObject (MO) still holds a reference to it.

      Effect: If the MO is used again (e.g., in a loop), the existing GO is reused and createGPUObject() is not called, so the GMM is unaware that this GO is in use. If memory is low, no new allocation can be made, and the ghost GO may instead receive memory from the list of available allocations to be reused. In effect, the ghost GO steals memory from the manager: if another eviction becomes necessary, the ghost GO is not a candidate because the GMM does not know about it. If more GOs in the loop do the same, the GMM eventually runs out of memory to hand out.
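
The effect can be reproduced in a small simulation. This is a sketch under assumptions, not the SystemDS implementation: all names (LostGpuObjectDemo, Manager, Go, Mo, acquire, simulate) are hypothetical, device memory is modeled as a fixed number of equally sized blocks, and eviction always takes the oldest tracked GO. The trackReused flag exists only to contrast the faulty behavior with a manager that knows about every GO in use; it is not meant to describe the actual fix.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy simulation only: hypothetical names, no real GPU memory involved.
class LostGpuObjectDemo {
    static final class Go { boolean hasMemory; }
    static final class Mo { Go go; } // the MO keeps its GO reference even after eviction

    static final class Manager {
        final int capacity;        // number of equally sized device blocks
        final boolean trackReused; // contrast switch: re-register GOs that are reused
        int inUse = 0;             // blocks held by any GO, tracked or not
        int reusable = 0;          // blocks parked in the "allocations to be reused" list
        final Deque<Go> tracked = new ArrayDeque<>(); // GOs the manager can evict

        Manager(int capacity, boolean trackReused) {
            this.capacity = capacity;
            this.trackReused = trackReused;
        }

        /** Give one block to go, evicting the oldest tracked GO if memory is low. */
        boolean allocate(Go go) {
            if (reusable == 0 && inUse == capacity) {
                Go victim = tracked.pollFirst();
                if (victim == null) return false; // nothing evictable: out of memory
                victim.hasMemory = false;
                inUse--;
                reusable++;                       // victim's block is parked for reuse
            }
            if (reusable > 0) reusable--;         // serve from the reuse list if possible
            inUse++;
            go.hasMemory = true;
            return true;
        }

        /** The only place where a GO is added to the tracked list (as in the issue). */
        boolean createGpuObject(Mo mo) {
            Go go = new Go();
            if (!allocate(go)) return false;
            tracked.addLast(go);
            mo.go = go;
            return true;
        }

        /** Use an MO on the device: an existing GO is reused without createGpuObject(). */
        boolean acquire(Mo mo) {
            if (mo.go == null) return createGpuObject(mo);
            if (mo.go.hasMemory) return true;
            if (!allocate(mo.go)) return false;   // ghost GO receives memory ...
            if (trackReused) tracked.addLast(mo.go); // ... and stays untracked unless this is set
            return true;
        }
    }

    /** Returns how many acquisitions succeed before the manager runs out of memory. */
    static int simulate(int capacity, int numMatrices, int rounds, boolean trackReused) {
        Manager gmm = new Manager(capacity, trackReused);
        List<Mo> matrices = new ArrayList<>();
        for (int i = 0; i < numMatrices; i++) matrices.add(new Mo());
        int served = 0;
        for (int r = 0; r < rounds; r++) {
            for (Mo mo : matrices) {
                if (!gmm.acquire(mo)) return served;
                served++;
            }
        }
        return served;
    }

    public static void main(String[] args) {
        System.out.println("untracked reuse: " + simulate(2, 3, 10, false));
        System.out.println("tracked reuse:   " + simulate(2, 3, 10, true));
    }
}
```

With two blocks and three matrices used in a loop, every round needs an eviction. In the untracked variant the second round turns two GOs into ghosts, the tracked list becomes empty, and the next acquisition fails after 5 successful ones. With trackReused set, all 30 acquisitions succeed because each reused GO remains an eviction candidate.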

       

            People

              markd Mark Dokter