Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
Common implementations of adaptive gradient algorithms, such as Adam,
limit the potential benefit of weight decay regularization, because the
weights do not decay multiplicatively (as would be expected for standard
weight decay) but by an additive constant factor.
The following paper describes a way to fix weight decay regularization in the Adam optimizer by adding one step to the update rule (the decay term w*lambda is applied directly to the weights instead of being folded into the gradient):
https://arxiv.org/pdf/1711.05101.pdf
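
For illustration, below is a minimal NumPy sketch of the decoupled weight decay update described in the paper (AdamW-style). The function name adamw_step and the default hyperparameter values are illustrative assumptions, not part of this issue or any specific library API.

import numpy as np

def adamw_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-2):
    """One AdamW-style update: the weight decay term is applied directly to
    the weights rather than added to the gradient (as plain L2 would be)."""
    # Standard Adam moment estimates on the raw gradient (no L2 term mixed in).
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)   # bias correction, t >= 1
    v_hat = v / (1 - beta2 ** t)

    # Decoupled weight decay: subtract lr * weight_decay * w as a separate term,
    # so the decay is not rescaled by the adaptive denominator sqrt(v_hat).
    w = w - lr * (m_hat / (np.sqrt(v_hat) + eps)) - lr * weight_decay * w
    return w, m, v

For comparison, the coupled (L2) variant would instead feed grad + weight_decay * w into the moment estimates, which is exactly what the paper argues limits the benefit of weight decay for Adam.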