SPARK-18166

GeneralizedLinearRegression Wrong Value Range for Poisson Distribution


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.1.0
    • Component/s: ML
    • Labels: None

    Description

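      The current implementation of the Poisson family in GeneralizedLinearRegression accepts only strictly positive response values (see below). This is incorrect: the support of the Poisson distribution is the non-negative integers, so zero is a valid response value.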

      override def initialize(y: Double, weight: Double): Double = {
        require(y > 0.0, "The response variable of Poisson family " +
          s"should be positive, but got $y")
        y
      }

      The fix is simple: relax the check so that zero is accepted, changing the first line of the require call to
      require(y >= 0.0, "The response variable of Poisson family " +
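
      Once zero is admitted, returning y unchanged would make the initial mean 0 for such observations, and a log link cannot take the logarithm of 0. The following is a minimal standalone sketch of one way to handle that, not the patch that was merged. The object name PoissonInit, the floor value delta = 0.1 (borrowed from the y + 0.1 start that R's poisson family uses), and the "non-negative" wording of the message are all assumptions made for illustration.

      ```scala
      // Standalone sketch; PoissonInit is a hypothetical name, not a Spark class.
      object PoissonInit {
        // Assumed floor for the starting mean so that log(mean) stays finite.
        val delta: Double = 0.1

        // Same signature as the method quoted above; weight is unused here.
        def initialize(y: Double, weight: Double): Double = {
          // Accept zero: the Poisson support is {0, 1, 2, ...}.
          require(y >= 0.0, "The response variable of Poisson family " +
            s"should be non-negative, but got $y")
          // Keep the initial mean strictly positive.
          math.max(y, delta)
        }
      }
      ```

      With this sketch, a zero response passes the check and starts from the floor value, a positive response is returned as is, and a negative response still fails with an IllegalArgumentException.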

          People

            actuaryzhang Wayne Zhang
            actuaryzhang Wayne Zhang

              Time Tracking

                Original Estimate: 10m
                Remaining Estimate: 10m
                Time Spent: Not Specified