Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
This ticket proposes updating one API in autograd.py:
def backward(y, dy=None)   # returns the gradient tensors one by one using yield
def gradients(y, dy=None)  # returns a dictionary: param tensor -> gradient tensor
With the backward() API, each param can be updated as soon as its gradient is available. The gradient tensor can then be deleted and its memory released.
The gradients() API keeps all gradient tensors in memory at the same time, which incurs memory overhead.
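The difference between the two styles can be illustrated with a small, library-free sketch. Everything here is hypothetical and is not the autograd.py implementation: `Param`, `toy_backward`, `toy_gradients`, and the loss `sum(v**2)` are stand-ins, the functions take a parameter list rather than `(y, dy=None)`, and the generator is assumed to yield `(param, grad)` pairs.

```python
# Toy illustration only; none of these names come from autograd.py.

class Param:
    """Hypothetical parameter: a name plus a list of float values."""
    def __init__(self, name, values):
        self.name = name
        self.values = list(values)


def _toy_grad(param):
    # Gradient of the toy loss sum(v**2) with respect to each value.
    return [2.0 * v for v in param.values]


def toy_backward(params):
    """Generator style: yield one (param, grad) pair at a time."""
    for p in reversed(params):
        yield p, _toy_grad(p)


def toy_gradients(params):
    """Dictionary style: build and return every gradient at once."""
    return {p: g for p, g in toy_backward(params)}


def sgd_streaming(params, lr):
    """Apply each update as soon as its gradient is produced."""
    for p, g in toy_backward(params):
        p.values = [v - lr * gv for v, gv in zip(p.values, g)]
        del g  # drop our reference so the gradient can be freed


def sgd_from_dict(params, lr):
    """Apply updates only after all gradients have been collected."""
    grads = toy_gradients(params)  # all gradients are alive here
    for p, g in grads.items():
        p.values = [v - lr * gv for v, gv in zip(p.values, g)]
    return len(grads)  # number of gradients held simultaneously
```

Both loops produce the same parameter values. The difference is only in how many gradients must exist at once: the dictionary version holds one per parameter, while the streaming version needs a reference to just the gradient it is currently applying.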