Spark / SPARK-1585

Non-robust Lasso causes weights and losses to reach Infinity

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.1
    • Fix Version/s: 1.1.0
    • Component/s: MLlib
    • Labels:
      None

      Description

      Lasso uses LeastSquaresGradient and L1Updater. However, the line

      diff = brzWeights.dot(brzData) - label

      in LeastSquaresGradient can produce a very large diff. That diff feeds into L1Updater, where the weights then grow exponentially from one iteration to the next. The small shrinkage value cannot pull the weights back toward zero, so the weights and losses eventually reach Infinity.

      For example, with data = (0.5 repeated 10k times) and weights = (0.6 repeated 10k times), data.dot(weights) is 10,000 × 0.5 × 0.6 = 3000, so for a small label the diff is roughly 3000. L1Updater then sets the weights to values of roughly that magnitude. In the next iteration the weights grow by a further large factor, and so on.
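
      The blow-up described above can be reproduced with a small standalone simulation. This is a plain-Python sketch, not MLlib code. It assumes a gradient of diff * x (for a loss of 0.5 * diff^2), a per-iteration step of step_size / sqrt(iteration), and soft-thresholding with shrinkage reg_param * step. The names lasso_step and simulate and all default parameter values are hypothetical and chosen only for illustration.

```python
import math


def lasso_step(weights, data, label, step_size, reg_param, iteration):
    """One update on a single example: least-squares gradient step,
    then L1 soft-thresholding (assumed update rule, see lead-in)."""
    # Same quantity as `diff = brzWeights.dot(brzData) - label` in the report.
    diff = sum(w * x for w, x in zip(weights, data)) - label
    this_step = step_size / math.sqrt(iteration)  # assumed decay schedule
    shrinkage = reg_param * this_step
    new_weights = []
    for w, x in zip(weights, data):
        stepped = w - this_step * diff * x  # gradient of 0.5 * diff^2 is diff * x
        # Soft-threshold: shrink the magnitude by `shrinkage`, clamping at zero.
        shrunk = max(abs(stepped) - shrinkage, 0.0)
        new_weights.append(math.copysign(shrunk, stepped))
    return new_weights, diff


def simulate(n=10_000, iterations=4, step_size=1.0, reg_param=0.1, label=1.0):
    """Repeat the update on the data from the report and record each diff."""
    data = [0.5] * n
    weights = [0.6] * n
    diffs = []
    for i in range(1, iterations + 1):
        weights, diff = lasso_step(weights, data, label, step_size, reg_param, i)
        diffs.append(diff)
    return diffs, weights


if __name__ == "__main__":
    diffs, _ = simulate()
    for i, d in enumerate(diffs, start=1):
        print(f"iteration {i}: diff = {d:.4g}")
```

      Under these assumed settings the magnitude of diff grows by several orders of magnitude per iteration, while the shrinkage stays a small constant, so soft-thresholding never brings the weights back toward zero.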


              People

              • Assignee: Xusen Yin (yinxusen)
              • Reporter: Xusen Yin (yinxusen)