  1. Spark
  2. SPARK-18710

Add offset to GeneralizedLinearRegression models

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.2
    • Fix Version/s: 2.3.0
    • Component/s: ML
    • Flags:
      Patch

      Description

      The current GeneralizedLinearRegression model does not support an offset. An offset is useful for taking exposure into account, or for testing the incremental effect of new variables. Although weights can currently be used to achieve the same effect as an offset for certain models (e.g., Poisson and Binomial with a log offset), a dedicated offset option is desirable for more general cases, e.g., a negative offset, or an offset that is hard to express through weights (such as an offset to the probability rather than the odds in logistic regression).
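
To make the role of the offset concrete, the standard GLM formulation can be written as below. The notation (x_i for features, o_i for the offset, t_i for exposure, g for the link function) is introduced here for illustration and does not come from the issue itself.

```latex
\begin{align*}
\eta_i &= x_i^\top \beta + o_i, \qquad g(\mu_i) = \eta_i \\
% Poisson with log link and exposure t_i: take o_i = \log t_i
\log \mu_i &= x_i^\top \beta + \log t_i
\quad\Longleftrightarrow\quad
\mu_i = t_i \exp\!\left(x_i^\top \beta\right)
\end{align*}
```

The offset enters the linear predictor with a coefficient fixed at 1, so it shifts the prediction without being estimated.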

      Effort would involve:

      • update the regression class to support offsetCol
      • update IWLS to take the offset into account
      • add test cases for offset
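
The IWLS change listed above can be sketched as follows. This is a minimal pure-Python illustration for a Poisson model with log link, not Spark's implementation; the function names `solve_linear_system` and `fit_poisson_iwls` are hypothetical, and prior weights and regularization are left out for brevity. The only places the offset appears are the linear predictor and the working response.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_poisson_iwls(features, labels, offsets, max_iter=50, tol=1e-10):
    """Fit Poisson regression (log link) with an offset via IWLS.

    Hypothetical helper for illustration only; each row of `features`
    must already contain a 1.0 column if an intercept is wanted.
    """
    n_coef = len(features[0])
    beta = [0.0] * n_coef
    for _ in range(max_iter):
        xtwx = [[0.0] * n_coef for _ in range(n_coef)]
        xtwz = [0.0] * n_coef
        for x, y, o in zip(features, labels, offsets):
            eta = sum(b * v for b, v in zip(beta, x)) + o  # offset in predictor
            mu = math.exp(eta)
            # Working response with the offset removed, so that the
            # weighted least-squares step estimates beta only.
            z = (eta - o) + (y - mu) / mu
            w = mu  # IWLS weight for Poisson with log link
            for i in range(n_coef):
                xtwz[i] += w * x[i] * z
                for j in range(n_coef):
                    xtwx[i][j] += w * x[i] * x[j]
        new_beta = solve_linear_system(xtwx, xtwz)
        converged = max(abs(nb - b) for nb, b in zip(new_beta, beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta


# Usage: intercept-only model with exposures 1, 2 and 4 as a log offset.
exposures = [1.0, 2.0, 4.0]
counts = [2.0, 3.0, 9.0]
beta = fit_poisson_iwls([[1.0]] * 3, counts, [math.log(t) for t in exposures])
print(math.exp(beta[0]))  # close to 2.0 = sum(counts) / sum(exposures)
```

With an intercept-only model, the fitted rate equals total counts divided by total exposure, which gives a simple check that the offset is handled correctly.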

      I can start working on this if the community approves this feature.

            People

            • Assignee:
              actuaryzhang Wayne Zhang
              Reporter:
              actuaryzhang Wayne Zhang
              Shepherd:
              Yanbo Liang
            • Votes:
              1
              Watchers:
              4

                Time Tracking

                Estimated:
                10h
                Remaining:
                10h
                Logged:
                Not Specified