[MAHOUT-975] Bug in Gradient Machine - Computation of the gradient - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Won't Fix
Affects Version/s: 0.7
Fix Version/s: 0.10.0
Component/s: None
Labels: None

Description

The initialisation used to compute the gradient descent weight updates for the output units appears to be wrong.

The comment in the code says: "dy / dw is just w since y = x' * w + b."
This is wrong. Since y = x' * w + b, dy/dw is x (ignoring the indices). The code performs the same incorrect initialisation.

Check by using neural network terminology:

The gradient machine is a specialized version of a multilayer perceptron (MLP).
In an MLP the gradient used to compute the "weight change" for the output units is:

dE / dw_ij = dE / dz_i * dz_i / dw_ij with z_i = sum_j (w_ij * a_j)
here: i is the index of the output layer; j is the index of the hidden layer
(d stands for the partial derivative)

Here z_i = a_i (no squashing in the output layer),
and the special loss (cost function) is E = 1 - a_g + a_b = 1 - z_g + z_b
with
g: index of the output unit with target value +1 (positive class)
b: index of a randomly chosen output unit with target value 0

dE / dw_gj = dE/dz_g * dz_g/dw_gj = -1 * a_j (a_j: activity of the hidden unit j)
dE / dw_bj = dE/dz_b * dz_b/dw_bj = +1 * a_j (a_j: activity of the hidden unit j)

This matches the corrected version of the comment:
dy/dw = x (x is here the activation of the hidden unit), multiplied by -1 for
the weights to the output unit with target value +1.
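
The derivation above can be written out as a small Java sketch. It is only an illustration of the formulas in this report, not Mahout's GradientMachine code: the class name OutputGradientSketch, its method names, and the example values are hypothetical, the loss is taken exactly as E = 1 - z_g + z_b, and g and b are assumed to be different output units.

```java
import java.util.Arrays;

/** Hypothetical sketch of the output-layer gradient; not Mahout's API. */
final class OutputGradientSketch {

  /** Dot product of two vectors of equal length. */
  static double dot(double[] u, double[] v) {
    double sum = 0.0;
    for (int j = 0; j < u.length; j++) {
      sum += u[j] * v[j];
    }
    return sum;
  }

  /** Loss E = 1 - z_g + z_b with linear outputs z_i = sum_j w[i][j] * a[j]. */
  static double loss(double[][] w, double[] a, int g, int b) {
    return 1.0 - dot(w[g], a) + dot(w[b], a);
  }

  /**
   * Analytic gradient dE/dw_ij for the output weights (assumes g != b):
   * -a_j for the row of the good unit g, +a_j for the row of the bad unit b,
   * and 0 for every other output unit. Note that it depends on the hidden
   * activations a, not on the weights w.
   */
  static double[][] gradient(int numOutputs, double[] a, int g, int b) {
    double[][] grad = new double[numOutputs][a.length];
    for (int j = 0; j < a.length; j++) {
      grad[g][j] = -a[j];
      grad[b][j] = a[j];
    }
    return grad;
  }

  public static void main(String[] args) {
    double[] hidden = {0.5, -1.0, 2.0}; // assumed hidden activations a_j
    double[][] grad = gradient(3, hidden, 0, 2);
    System.out.println(Arrays.deepToString(grad));
  }
}
```

The gradient method never reads the weight matrix, which is the point of this report: for a linear output unit the derivative with respect to a weight is the hidden activation feeding that weight.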

------------

In neural network implementations it is common to test the implementation by
also computing the gradient numerically with a central difference:
dE/dw_ij = (E(w_ij + epsilon) - E(w_ij - epsilon)) / (2 * epsilon)
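
A minimal sketch of such a check in Java follows. It is an assumption-laden illustration rather than a test from the Mahout code base: the class GradientCheckSketch, its methods, the weight and activation values, and the choice epsilon = 1e-4 are all hypothetical, and the loss is the E = 1 - z_g + z_b from this report with linear output units.

```java
/** Hypothetical central-difference gradient check; not Mahout's API. */
final class GradientCheckSketch {

  /** Loss E = 1 - z_g + z_b with linear outputs z_i = sum_j w[i][j] * a[j]. */
  static double loss(double[][] w, double[] a, int g, int b) {
    double zg = 0.0;
    double zb = 0.0;
    for (int j = 0; j < a.length; j++) {
      zg += w[g][j] * a[j];
      zb += w[b][j] * a[j];
    }
    return 1.0 - zg + zb;
  }

  /** Central difference (E(w_ij + eps) - E(w_ij - eps)) / (2 * eps). */
  static double numericalGradient(double[][] w, double[] a, int g, int b,
                                  int i, int j, double eps) {
    double saved = w[i][j];
    w[i][j] = saved + eps;
    double plus = loss(w, a, g, b);
    w[i][j] = saved - eps;
    double minus = loss(w, a, g, b);
    w[i][j] = saved; // restore the weight so the caller's matrix is unchanged
    return (plus - minus) / (2 * eps);
  }

  /** Analytic gradient from the derivation: -a_j for i = g, +a_j for i = b. */
  static double analyticGradient(double[] a, int g, int b, int i, int j) {
    if (i == g) {
      return -a[j];
    }
    if (i == b) {
      return a[j];
    }
    return 0.0;
  }

  /** Largest absolute difference between numeric and analytic gradients. */
  static double maxAbsError(double[][] w, double[] a, int g, int b, double eps) {
    double max = 0.0;
    for (int i = 0; i < w.length; i++) {
      for (int j = 0; j < a.length; j++) {
        double diff = numericalGradient(w, a, g, b, i, j, eps)
            - analyticGradient(a, g, b, i, j);
        max = Math.max(max, Math.abs(diff));
      }
    }
    return max;
  }

  public static void main(String[] args) {
    double[][] w = {{0.1, -0.2, 0.3}, {0.4, 0.5, -0.6}, {-0.7, 0.8, 0.9}};
    double[] a = {0.5, -1.0, 2.0};
    System.out.println("max abs error: " + maxAbsError(w, a, 0, 2, 1e-4));
  }
}
```

If the analytic gradient were initialised with the weights instead of the hidden activations, as the quoted comment suggests, a check of this kind would be expected to report a large error for any weights that differ from the activations.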

Attachments


GradientMachine.patch | 07/Feb/12 12:01 | 3 kB | Christian Herta
GradientMachine2.java | 10/Jun/13 23:51 | 12 kB | Ted Dunning
MAHOUT-975.patch | 10/Jun/13 20:56 | 3 kB | Suneel Marthi

Issue Links

is superseded by: MAHOUT-1265 Add Multilayer Perceptron (Closed)

People

Assignee: Ted Dunning

Reporter: Christian Herta

Votes: 0

Watchers: 6

Dates

Created: 07/Feb/12 11:54

Updated: 31/Jan/24 22:12

Resolved: 02/Mar/14 21:13