Spark / SPARK-2547

The clustering documentation example in the Spark 0.9.1 docs has an error

    Details

      Description

      The MLlib clustering documentation includes a KMeans example:

      http://spark.apache.org/docs/0.9.1/mllib-guide.html#clustering-2

      The following line in that example is wrong and misleading:

      clusters = KMeans.train(parsedData, 2, maxIterations=10,runs=30, initialization_mode="random")

      The keyword argument "initialization_mode" used in the example is wrong: it does not match the KMeans implementation, which names the parameter "initializationMode".

      Correction:

      clusters = KMeans.train(parsedData, 2, maxIterations=10,runs=30, initializationMode="random")
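
      To illustrate why the documented call fails, here is a minimal sketch that does not use Spark at all. The train function below is a hypothetical stand-in whose keyword names are assumed from this report; it is not the real PySpark code and performs no clustering. It only shows how Python treats a misspelled keyword argument.

```python
# Hypothetical stand-in for KMeans.train. The signature is an assumption
# based on this report; the body just echoes the arguments it received.
def train(data, k, maxIterations=100, runs=1, initializationMode="k-means||"):
    return {
        "k": k,
        "maxIterations": maxIterations,
        "runs": runs,
        "initializationMode": initializationMode,
    }

# Placeholder input, standing in for the parsedData RDD from the guide.
parsed_data = [[0.0, 0.0], [1.0, 1.0], [9.0, 8.0], [8.0, 9.0]]

# Keyword as written in the 0.9.1 guide: Python rejects it as unexpected.
try:
    train(parsed_data, 2, maxIterations=10, runs=30, initialization_mode="random")
    documented_call_error = None
except TypeError as exc:
    documented_call_error = str(exc)

# Keyword as proposed in the correction: accepted.
corrected = train(parsed_data, 2, maxIterations=10, runs=30, initializationMode="random")

print(documented_call_error)
print(corrected["initializationMode"])
```

      If the real function has the keyword names assumed here, anyone copying the line from the guide would hit the same TypeError before any training starts.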

            People

            • Assignee:
              Unassigned
              Reporter:
              rahul1993 Rahul K Bhojwani
            • Votes:
              0
              Watchers:
              4

                Time Tracking

                Estimated:
                Original Estimate - 1h
                Remaining:
                Remaining Estimate - 1h
                Logged:
                Time Spent - Not Specified