Spark / SPARK-18710

Add offset to GeneralizedLinearRegression models

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.2
    • Fix Version/s: 2.3.0
    • Component/s: ML
    • Patch

    Description

      The current GeneralizedLinearRegression model does not support an offset. An offset is useful for taking exposure into account, or for testing the incremental effect of new variables. Although weights can be used in the current implementation to achieve the same effect as an offset for certain models (e.g., Poisson and Binomial with a log offset), it is desirable to have an offset option that works in more general cases, e.g., a negative offset, or an offset that is hard to specify using weights (such as an offset to the probability rather than to the odds in logistic regression).
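
      For reference, a sketch of how an offset usually enters a GLM. The notation is an assumption and is not taken from the Spark code: o_i is the per-row offset, g the link function, and t_i an exposure. The offset is a term in the linear predictor whose coefficient is fixed at 1. In the Poisson case with a log link, setting o_i = log t_i turns the model into a model for the rate mu_i / t_i.

      ```latex
      \eta_i = g(\mu_i) = o_i + x_i^\top \beta
      \qquad\text{and, for Poisson with log link and } o_i = \log t_i:\qquad
      \log \mu_i = \log t_i + x_i^\top \beta
      \;\Longleftrightarrow\;
      \frac{\mu_i}{t_i} = \exp\!\left(x_i^\top \beta\right)
      ```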

      Effort would involve:

      • update the regression class to support offsetCol
      • update IWLS to take the offset into account
      • add test cases for the offset
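
      To illustrate the IWLS item above, here is a minimal pure-Python sketch of iteratively reweighted least squares for a Poisson model with a log link and an offset. It is not Spark's implementation; the names poisson_iwls and solve_linear are hypothetical helpers introduced only for this example. The only change the offset requires is that the mean is computed from offset + x·beta, while the working response is built from the linear part without the offset.

      ```python
      import math


      def solve_linear(a, b):
          """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
          n = len(b)
          m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
          for col in range(n):
              pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
              m[col], m[pivot] = m[pivot], m[col]
              for r in range(col + 1, n):
                  f = m[r][col] / m[col][col]
                  for c in range(col, n + 1):
                      m[r][c] -= f * m[col][c]
          x = [0.0] * n
          for r in range(n - 1, -1, -1):
              s = sum(m[r][c] * x[c] for c in range(r + 1, n))
              x[r] = (m[r][n] - s) / m[r][r]
          return x


      def poisson_iwls(X, y, offset, max_iter=25, tol=1e-10):
          """Fit a Poisson GLM with log link: log(mu_i) = offset_i + x_i . beta."""
          p = len(X[0])
          beta = [0.0] * p
          for _ in range(max_iter):
              xtwx = [[0.0] * p for _ in range(p)]
              xtwz = [0.0] * p
              for xi, yi, oi in zip(X, y, offset):
                  lin = sum(b * v for b, v in zip(beta, xi))  # x_i . beta, offset excluded
                  mu = math.exp(oi + lin)                     # mean uses the full predictor
                  w = mu                                      # IWLS weight for Poisson/log
                  z = lin + (yi - mu) / mu                    # working response minus offset
                  for j in range(p):
                      xtwz[j] += w * xi[j] * z
                      for k in range(p):
                          xtwx[j][k] += w * xi[j] * xi[k]
              new_beta = solve_linear(xtwx, xtwz)
              converged = max(abs(a - b) for a, b in zip(new_beta, beta)) < tol
              beta = new_beta
              if converged:
                  break
          return beta


      # Intercept-only example: counts observed under different exposures.
      exposure = [1.0, 2.0, 5.0, 10.0]
      counts = [2.0, 3.0, 10.0, 20.0]
      beta_hat = poisson_iwls([[1.0]] * 4, counts, [math.log(t) for t in exposure])
      ```

      For the intercept-only example, the fitted intercept should equal the log of total counts divided by total exposure, which gives a simple check that the offset is handled as intended.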

      I can start working on this if the community approves this feature.

          People

            Assignee: Wayne Zhang (actuaryzhang)
            Reporter: Wayne Zhang (actuaryzhang)
            Shepherd: Yanbo Liang
            Votes: 1
            Watchers: 4

              Time Tracking

                Estimated: 10h
                Remaining: 10h
                Logged: Not Specified