Spark / SPARK-18686

Several cleanup and improvements for spark.logit

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.1.0
    • Component/s: ML, SparkR
    • Labels: None

    Description

      Several cleanup and improvements for spark.logit:

      • summary should return a coefficients matrix, and should output labels for each class if the model is a multinomial logistic regression model.
      • summary should not return areaUnderROC, roc, pr, ..., since most of them are DataFrames, which are less important for R users. These metrics also currently ignore instance weights (treating every weight as 1.0), which will change in a later Spark version. To avoid introducing breaking changes at that point, we do not expose them for now.
      • SparkR test improvement: compare the training result with native R glmnet.
      • Remove the aggregationDepth argument from spark.logit, since it is an expert Param (related to Spark architecture and job execution) that R users would rarely use.
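
      The behaviour described above might look roughly like the following from the R side. This is a minimal sketch, not code from the patch: it assumes a local Spark installation with SparkR 2.1.0 or later, and the iris dataset and the regParam value are illustrative choices that do not come from this issue.

```r
library(SparkR)

# Start (or reuse) a Spark session; assumes Spark is installed locally.
sparkR.session()

# iris has three species, so family = "auto" fits a multinomial model.
# createDataFrame replaces "." in column names with "_" (e.g. Sepal_Length).
df <- createDataFrame(iris)

# No aggregationDepth argument is passed; regParam = 0.5 is an arbitrary choice.
model <- spark.logit(df, Species ~ ., regParam = 0.5)

# summary() returns a list whose coefficients element is a plain R matrix.
# For a multinomial model the columns are named after the class labels.
s <- summary(model)
coefs <- s$coefficients
print(colnames(coefs))
print(rownames(coefs))

sparkR.session.stop()
```

      For a binary model the same call would be expected to yield a single coefficients column rather than one column per class.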

          People

            Assignee: Yanbo Liang (yanboliang)
            Reporter: Yanbo Liang (yanboliang)
            Votes: 0
            Watchers: 2
