Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- SystemDS 2.1
Description
Context:
- When a GPUObject (GO) is removed or evicted, its memory is either freed or its pointer is moved to a list of available allocations from which future GOs can be served.
- The GPUMemoryManager (GMM) adds to its list of GOs only in createGPUObject().
- When memory is low, the GMM can evict or remove GOs.
Fault: When a GO is removed from the GMM, it is still referenced by its corresponding MatrixObject (MO).
Effect: If the MO is used again (e.g., in a loop), the existing GO is reused and createGPUObject() is not called, so the GMM is unaware that this GO is in use. If memory is low, a new allocation cannot be made, and the ghost GO may receive memory from the list of available allocations to be reused. In effect, the ghost GO steals memory from the manager: if another eviction becomes necessary, the ghost GO is not a candidate because the GMM does not know about it. If more GOs in the loop do the same, the GMM eventually runs out of memory to hand out.
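The following is a minimal, self-contained sketch of the effect described above, not the SystemDS code. All class and method names (ToyGpuObject, ToyMatrixObject, ToyMemoryManager, acquire, evictOne) are hypothetical stand-ins, device pointers are plain numbers, and evictions are triggered by hand to mimic memory pressure. It assumes a capacity of two allocations and shows how two ghost objects leave the manager with nothing to evict.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class GhostGpuObjectDemo {
    /** Hypothetical stand-in for a GPUObject: holds a fake device pointer. */
    static class ToyGpuObject {
        Long pointer; // null means no device memory is attached
    }

    /** Hypothetical stand-in for a MatrixObject that references its GPU object. */
    static class ToyMatrixObject {
        ToyGpuObject gpuObject;
    }

    /** Hypothetical stand-in for the GPUMemoryManager. */
    static class ToyMemoryManager {
        final List<ToyGpuObject> tracked = new ArrayList<>(); // eviction candidates
        final Deque<Long> reusable = new ArrayDeque<>();      // pointers to be reused
        private long nextPointer = 1;
        private int freeSlots;

        ToyMemoryManager(int capacity) { freeSlots = capacity; }

        /** The only place where an object is added to the tracked list. */
        ToyGpuObject createGpuObject(ToyMatrixObject mo) {
            ToyGpuObject go = new ToyGpuObject();
            allocate(go);
            tracked.add(go);
            mo.gpuObject = go;
            return go;
        }

        /** Models a matrix being used: an existing GPU object is reused as-is. */
        ToyGpuObject acquire(ToyMatrixObject mo) {
            if (mo.gpuObject == null) return createGpuObject(mo);
            // Ghost path: memory is handed out, but the object is not tracked again.
            if (mo.gpuObject.pointer == null) allocate(mo.gpuObject);
            return mo.gpuObject;
        }

        void allocate(ToyGpuObject go) {
            if (freeSlots == 0 && reusable.isEmpty() && !evictOne()) {
                throw new IllegalStateException("out of GPU memory");
            }
            if (!reusable.isEmpty()) {
                go.pointer = reusable.pop();
            } else {
                freeSlots--;
                go.pointer = nextPointer++;
            }
        }

        /** Evicts the oldest tracked object; its matrix still references it. */
        boolean evictOne() {
            if (tracked.isEmpty()) return false;
            ToyGpuObject victim = tracked.remove(0);
            reusable.push(victim.pointer);
            victim.pointer = null;
            return true;
        }
    }

    /** Replays the loop scenario; returns true if the manager runs out of memory. */
    static boolean runsOutOfMemory() {
        ToyMemoryManager gmm = new ToyMemoryManager(2);
        ToyMatrixObject a = new ToyMatrixObject();
        ToyMatrixObject b = new ToyMatrixObject();
        ToyMatrixObject c = new ToyMatrixObject();
        gmm.acquire(a);  // tracked, uses the first slot
        gmm.acquire(b);  // tracked, uses the second slot
        gmm.evictOne();  // a is evicted, but its matrix keeps the reference
        gmm.acquire(a);  // a becomes a ghost holding the reusable pointer
        gmm.evictOne();  // b is evicted
        gmm.acquire(b);  // b becomes a ghost as well
        try {
            gmm.acquire(c); // no free slot, nothing reusable, nothing to evict
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("out of memory: " + runsOutOfMemory());
    }
}
```

In this toy model both pointers end up held by objects missing from the tracked list, so the third matrix cannot be served even though nothing the manager knows about is using memory.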