Mahout / MAHOUT-975

Bug in Gradient Machine - Computation of the gradient

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.7
    • Fix Version/s: None
    • Component/s: Classification
    • Labels:
      None

      Description

      The initialisation used to compute the gradient descent weight updates for the output units appears to be wrong:

      The comment says: "dy / dw is just w since y = x' * w + b."
      This is wrong: for y = x' * w + b, dy/dw is x (ignoring the indices). The code makes the same mistake in its initialisation.

      Check by using neural network terminology:

      The gradient machine is a specialized version of a multilayer perceptron (MLP).
      In an MLP, the gradient used to compute the "weight change" for the output units is:

      dE / dw_ij = dE / dz_i * dz_i / dw_ij with z_i = sum_j (w_ij * a_j)
      here: i is the index of an output unit; j is the index of a hidden unit
      (d stands for the partial derivative)

      Here z_i = a_i (no squashing in the output layer), and the special loss (cost function) is
      E = 1 - a_g + a_b = 1 - z_g + z_b
      with
      g: index of the output unit with target value +1 (positive class)
      b: index of a random output unit with target value 0

      =>

      dE / dw_gj = dE/dz_g * dz_g/dw_gj = -1 * a_j (a_j: activity of the hidden unit j)
      dE / dw_bj = dE/dz_b * dz_b/dw_bj = +1 * a_j (a_j: activity of the hidden unit j)

      This is the same result that a corrected comment would give:
      dy/dw = x (x is here the activation of the hidden unit), multiplied by (-1) for the
      weights to the output unit with target value +1.
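
The derivation above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the Mahout code: the class and method names are hypothetical, the loss is taken as E = 1 - z_g + z_b exactly as written above (with g != b), and the output layer is linear.

```java
import java.util.Arrays;

// Hypothetical illustration; not taken from GradientMachine.java.
final class OutputGradientSketch {

    // dE/dw_ij for E = 1 - z_g + z_b with z_i = sum_j (w_ij * a_j).
    // Assumes g != b. Only rows g and b are non-zero.
    static double[][] outputGradient(int numOutputs, double[] hidden, int g, int b) {
        double[][] grad = new double[numOutputs][hidden.length];
        for (int j = 0; j < hidden.length; j++) {
            grad[g][j] = -hidden[j]; // dE/dz_g = -1, dz_g/dw_gj = a_j
            grad[b][j] = hidden[j];  // dE/dz_b = +1, dz_b/dw_bj = a_j
        }
        return grad;
    }

    public static void main(String[] args) {
        double[] hidden = {0.5, -2.0}; // example hidden activations a_j
        double[][] grad = outputGradient(3, hidden, 0, 2);
        System.out.println(Arrays.deepToString(grad));
    }
}
```

Note that the weights do not appear anywhere in this computation: the gradient depends only on the hidden activations, which is the point of the report.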

      ------------

      In neural network implementations it is common to test the implementation by
      computing the gradient numerically with a central difference:
      dE/dw_ij = (E(w_ij + epsilon) - E(w_ij - epsilon)) / (2 * epsilon)
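
A minimal sketch of such a check, assuming the linear output layer and the loss E = 1 - z_g + z_b described in this report (with g != b). All class and method names are hypothetical and do not correspond to anything in Mahout.

```java
// Hypothetical illustration; not taken from GradientMachine.java.
final class GradientCheckSketch {

    static double dot(double[] x, double[] y) {
        double s = 0.0;
        for (int k = 0; k < x.length; k++) {
            s += x[k] * y[k];
        }
        return s;
    }

    // E = 1 - z_g + z_b with z_i = sum_j (w[i][j] * a[j]).
    static double loss(double[][] w, double[] a, int g, int b) {
        return 1.0 - dot(w[g], a) + dot(w[b], a);
    }

    // Central difference: (E(w_ij + eps) - E(w_ij - eps)) / (2 * eps).
    static double numericalGradient(double[][] w, double[] a, int g, int b,
                                    int i, int j, double eps) {
        double saved = w[i][j];
        w[i][j] = saved + eps;
        double plus = loss(w, a, g, b);
        w[i][j] = saved - eps;
        double minus = loss(w, a, g, b);
        w[i][j] = saved; // restore the original weight
        return (plus - minus) / (2.0 * eps);
    }

    // Analytic gradient: -a_j for row g, +a_j for row b, 0 elsewhere.
    static double analyticGradient(double[] a, int g, int b, int i, int j) {
        if (i == g) {
            return -a[j];
        }
        if (i == b) {
            return a[j];
        }
        return 0.0;
    }

    // Largest absolute difference between numerical and analytic gradients.
    static double maxError(double[][] w, double[] a, int g, int b, double eps) {
        double max = 0.0;
        for (int i = 0; i < w.length; i++) {
            for (int j = 0; j < a.length; j++) {
                double diff = Math.abs(numericalGradient(w, a, g, b, i, j, eps)
                        - analyticGradient(a, g, b, i, j));
                max = Math.max(max, diff);
            }
        }
        return max;
    }

    public static void main(String[] args) {
        double[][] w = {{0.1, -0.3}, {0.7, 0.2}, {-0.4, 0.5}}; // example weights
        double[] a = {0.5, -2.0};                               // example hidden activations
        System.out.println("max |numerical - analytic| = " + maxError(w, a, 0, 2, 1e-4));
    }
}
```

With these example values the numerical gradient for w_00 is close to -a_0 = -0.5 and not to w_00 = 0.1, so a gradient initialised from the weights would fail this check.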

      1. GradientMachine2.java
        12 kB
        Ted Dunning
      2. MAHOUT-975.patch
        3 kB
        Suneel Marthi
      3. GradientMachine.patch
        3 kB
        Christian Herta

        Issue Links

          Activity

          Christian Herta created issue -
          Christian Herta made changes -
          Field Original Value New Value
          Status Open [ 1 ] Patch Available [ 10002 ]
          Christian Herta made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Christian Herta made changes -
          Attachment GradientMachine.patch [ 12513598 ]
          Christian Herta made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Sebastian Schelter made changes -
          Fix Version/s 0.8 [ 12320153 ]
          Robin Anil made changes -
          Assignee Ted Dunning [ tdunning ]
          Suneel Marthi made changes -
          Attachment MAHOUT-975.patch [ 12587129 ]
          Ted Dunning made changes -
          Attachment GradientMachine2.java [ 12587165 ]
          Ted Dunning made changes -
          Fix Version/s Backlog [ 12318886 ]
          Fix Version/s 0.8 [ 12320153 ]
          Suneel Marthi made changes -
          Link This issue is superseded by MAHOUT-1265 [ MAHOUT-1265 ]
          Suneel Marthi made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Resolution Won't Fix [ 2 ]
          Sebastian Schelter made changes -
          Fix Version/s Backlog [ 12318886 ]

            People

            • Assignee:
              Ted Dunning
              Reporter:
              Christian Herta
            • Votes:
              0
              Watchers:
              6
