  1. Spark
  2. SPARK-18710

Add offset to GeneralizedLinearRegression models


Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.2
    • Fix Version/s: 2.3.0
    • Component/s: ML
    • Patch

    Description

      The current GeneralizedLinearRegression model does not support an offset. An offset is useful for taking exposure into account, or for testing the incremental effect of new variables. Although weights can currently be used to achieve the same effect as an offset for certain models (e.g., Poisson and Binomial with a log offset), an explicit offset option is desirable for more general cases, e.g., a negative offset, or an offset that is hard to express using weights (such as an offset to the probability rather than the odds in logistic regression).
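To illustrate the weights-versus-offset equivalence mentioned above, here is a minimal sketch in plain Python (not Spark code) for the simplest case: an intercept-only Poisson model with a log link. The data values and variable names are made up for illustration. With an offset of log(exposure), the maximum-likelihood rate is sum(counts) / sum(exposure); modelling the per-unit rate with exposure as the weight gives the same estimate.

```python
import math

# Hypothetical data: event counts and the exposure each count was observed over.
counts = [3.0, 12.0, 4.0, 21.0]
exposure = [10.0, 50.0, 20.0, 80.0]

# Offset formulation: log(mu_i) = log(exposure_i) + b.
# Setting the score sum(y_i - exposure_i * exp(b)) to zero gives
# exp(b) = sum(y) / sum(exposure).
rate_offset = sum(counts) / sum(exposure)
b_offset = math.log(rate_offset)

# Weights formulation: model the rate y_i / exposure_i with weight exposure_i.
# The weighted score sum(exposure_i * (rate_i - exp(b))) is zero at the
# exposure-weighted mean of the rates.
rates = [y / e for y, e in zip(counts, exposure)]
rate_weighted = sum(e * r for e, r in zip(exposure, rates)) / sum(exposure)

# Fitted means under the offset model scale with exposure.
fitted = [e * math.exp(b_offset) for e in exposure]
```

The two formulations agree here only because the log link turns the offset into a multiplicative exposure. That is the special case the description refers to; it does not extend to, for example, an offset on the probability scale.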

      Effort would involve:

      • update the regression class to support offsetCol
      • update IWLS to take the offset into account
      • add test cases for the offset
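As a rough sketch of what taking the offset into account in IWLS could look like, the plain-Python example below fits a Poisson model with a log link, an intercept, and one feature. It is not the Spark implementation: the function name `fit_poisson_iwls`, the data, and the closed-form two-parameter solve are assumptions made for illustration. The key step is that the offset enters the linear predictor with a fixed coefficient of 1 and is subtracted from the working response before each weighted least-squares solve.

```python
import math

def fit_poisson_iwls(x, y, offset, max_iter=100, tol=1e-10):
    """Fit log(mu_i) = offset_i + b0 + b1 * x_i by IWLS (hypothetical helper)."""
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        # The linear predictor includes the offset with a fixed coefficient of 1.
        eta = [o + b0 + b1 * xi for xi, o in zip(x, offset)]
        mu = [math.exp(e) for e in eta]
        # Working response with the offset subtracted, so the weighted
        # least-squares step estimates only b0 and b1.
        z = [e - o + (yi - m) / m for e, o, yi, m in zip(eta, offset, y, mu)]
        w = mu  # IWLS weights for the Poisson family with a log link
        # Closed-form weighted least squares for an intercept and one slope.
        sw = sum(w)
        xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
        zbar = sum(wi * zi for wi, zi in zip(w, z)) / sw
        sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
        sxz = sum(wi * (xi - xbar) * (zi - zbar) for wi, xi, zi in zip(w, x, z))
        new_b1 = sxz / sxx
        new_b0 = zbar - new_b1 * xbar
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1

# Made-up, noise-free data: y equals its mean exactly, so the fit should
# recover the coefficients used to generate it (0.1 and 0.3).
x = [0.0, 1.0, 2.0, 3.0, 4.0]
exposure = [10.0, 20.0, 5.0, 40.0, 15.0]
offset = [math.log(e) for e in exposure]
y = [e * math.exp(0.1 + 0.3 * xi) for e, xi in zip(exposure, x)]

b0, b1 = fit_poisson_iwls(x, y, offset)
```

A test case for the offset could follow the same pattern: generate data from known coefficients and a known offset, fit, and check that the coefficients are recovered.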

      I can start working on this if the community approves this feature.


          People

            Assignee: Wayne Zhang (actuaryzhang)
            Reporter: Wayne Zhang (actuaryzhang)
            Yanbo Liang
            Votes: 1
            Watchers: 4


              Time Tracking

              Original Estimate: 10h
              Remaining Estimate: 10h
              Time Spent: Not Specified
