SPARK-17692

Document ML/MLlib behavior changes in Spark 2.1

    Details

    • Type: Documentation
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.1.0
    • Component/s: ML, MLlib

      Description

      This JIRA records ML/MLlib behavior changes between Spark 2.0 and 2.1 so that they can be noted in the Migration Guide section of the user guide. If you find one, please comment below and link the corresponding JIRA here.

      • SPARK-17389: KMeans reduces the default number of k-means|| init steps from 5 to 2.
      • SPARK-17870: ChiSqSelector now uses the pValue rather than the raw statistic when selecting the top K features.
      • SPARK-3261: KMeans may return fewer than k cluster centers when k distinct centroids aren't available or aren't selected.
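
      To illustrate why the SPARK-17870 change can alter which features are selected, here is a minimal plain-Python sketch (not Spark code). It assumes the difference arises when features have different degrees of freedom, so that raw chi-square statistics are not directly comparable. The feature names, statistics, and degrees of freedom are hypothetical, and chi2_sf is a helper written for this sketch that handles only df == 1 or even df.

```python
import math

def chi2_sf(x, df):
    """P(X >= x) for a chi-square variable X (sketch: df == 1 or even df only)."""
    if df == 1:
        # For one degree of freedom, X is the square of a standard normal.
        return math.erfc(math.sqrt(x / 2.0))
    if df % 2 == 0:
        # Closed form for even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        half = x / 2.0
        return math.exp(-half) * sum(half ** i / math.factorial(i)
                                     for i in range(df // 2))
    raise ValueError("this sketch only handles df == 1 or even df")

# Hypothetical per-feature results: (name, chi-square statistic, degrees of freedom)
features = [("a", 6.0, 1), ("b", 8.0, 4), ("c", 3.0, 2)]

# Ranking by raw statistic: larger statistic first.
by_statistic = [name for name, _, _ in
                sorted(features, key=lambda f: -f[1])]

# Ranking by p-value: smaller p-value first.
by_p_value = [name for name, _, _ in
              sorted(features, key=lambda f: chi2_sf(f[1], f[2]))]

print(by_statistic)
print(by_p_value)
```

      In this made-up data, feature "b" has the largest raw statistic but, with four degrees of freedom, a larger p-value than feature "a", so a top-1 selection picks a different feature under each ranking.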

              People

              • Assignee:
                yanboliang Yanbo Liang
                Reporter:
                yanboliang Yanbo Liang