SPARK-17692: Document ML/MLlib behavior changes in Spark 2.1


Details

    • Type: Documentation
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.1.0
    • Component/s: ML, MLlib

    Description

      This JIRA records behavior changes in ML/MLlib between Spark 2.0 and 2.1 so that we can note them (if any) in the Migration Guide section of the user guide. If you find one, please comment below and link the corresponding JIRA here.

      • SPARK-17389: Reduce the KMeans default number of k-means|| init steps from 5 to 2.
      • SPARK-17870: ChiSquareSelector now uses pValue rather than the raw statistic to pick the SelectKBest features.
      • SPARK-3261: KMeans returns potentially fewer than k cluster centers in cases where k distinct centroids aren't available or aren't selected.
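
      To see why the SPARK-17870 change can alter which features are selected, here is a minimal pure-Python sketch. It is not Spark's implementation. The feature names, statistics, and degrees of freedom are made-up illustrative values, and the helper chi2_sf_even_df is hypothetical and only handles even degrees of freedom, where the chi-square tail probability has a simple closed form. The point is that a feature with more category levels can have a larger raw statistic and still be less significant.

```python
import math


def chi2_sf_even_df(stat, df):
    """Upper-tail probability P(X > stat) for a chi-square variable.

    Hypothetical helper for this sketch: supports even df only, using
    exp(-x/2) * sum_{i < df/2} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even degrees of freedom")
    half = stat / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


# Made-up per-feature test results: (name, chi-square statistic, degrees of freedom).
features = [("many_levels", 9.0, 6), ("few_levels", 6.0, 2)]

# Ranking by raw statistic: larger is better, degrees of freedom ignored.
by_statistic = sorted(features, key=lambda f: f[1], reverse=True)

# Ranking by p-value: smaller is better, degrees of freedom taken into account.
by_p_value = sorted(features, key=lambda f: chi2_sf_even_df(f[1], f[2]))

top_by_statistic = by_statistic[0][0]
top_by_p_value = by_p_value[0][0]
print("top feature by raw statistic:", top_by_statistic)
print("top feature by p-value:", top_by_p_value)
```

      With these assumed numbers the two rankings disagree, so a pipeline that keeps a fixed number of top features may select a different set after upgrading to 2.1.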


          People

            Assignee: Yanbo Liang (yanboliang)
            Reporter: Yanbo Liang (yanboliang)
            Votes: 0
            Watchers: 5
