Details
- Bug
- Status: Resolved
- Major
- Resolution: Duplicate
- 1.0.0
- None
- None
Description
When the number of clusters passed to org.apache.spark.mllib.clustering.KMeans in parallel initialization mode is greater than the number of data points, it throws an ArrayIndexOutOfBoundsException.
The KMeans class should check that the number of clusters is not greater than the number of data points.
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: -1
at org.apache.spark.mllib.clustering.LocalKMeans$$anonfun$kMeansPlusPlus$1.apply$mcVI$sp(LocalKMeans.scala:62)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
at org.apache.spark.mllib.clustering.LocalKMeans$.kMeansPlusPlus(LocalKMeans.scala:49)
at org.apache.spark.mllib.clustering.KMeans$$anonfun$20.apply(KMeans.scala:297)
at org.apache.spark.mllib.clustering.KMeans$$anonfun$20.apply(KMeans.scala:294)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.Range.foreach(Range.scala:141)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at org.apache.spark.mllib.clustering.KMeans.initKMeansParallel(KMeans.scala:294)
at org.apache.spark.mllib.clustering.KMeans.runBreeze(KMeans.scala:143)
at org.apache.spark.mllib.clustering.KMeans.run(KMeans.scala:126)
at org.apache.spark.examples.mllib.DenseKMeans$.run(DenseKMeans.scala:102)
at org.apache.spark.examples.mllib.DenseKMeans$$anonfun$main$1.apply(DenseKMeans.scala:72)
at org.apache.spark.examples.mllib.DenseKMeans$$anonfun$main$1.apply(DenseKMeans.scala:71)
at scala.Option.map(Option.scala:145)
at org.apache.spark.examples.mllib.DenseKMeans$.main(DenseKMeans.scala:71)
at org.apache.spark.examples.mllib.DenseKMeans.main(DenseKMeans.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:303)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
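The check requested in the description could look roughly like the sketch below. It is a minimal illustration, not Spark's actual fix: the object `KMeansPreconditions` and the method `checkClusterCount` are hypothetical names introduced here, and the sketch assumes the caller already knows the number of data points.

```scala
// Hypothetical helper, not part of Spark's API: rejects a cluster count
// that is non-positive or larger than the number of data points.
object KMeansPreconditions {
  def checkClusterCount(k: Int, numPoints: Long): Unit = {
    // require throws IllegalArgumentException when the condition is false.
    require(k > 0, s"Number of clusters must be positive, got $k")
    require(
      k <= numPoints,
      s"Number of clusters ($k) must not exceed the number of data points ($numPoints)")
  }
}
```

With an RDD named data, a caller could run KMeansPreconditions.checkClusterCount(k, data.count()) before KMeans.train(data, k, maxIterations), so that the job fails with a readable IllegalArgumentException instead of ArrayIndexOutOfBoundsException: -1. Note that data.count() triggers a Spark job of its own, so caching the input first may be worthwhile.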
Issue Links
- duplicates
  - SPARK-1215 Clustering: Index out of bounds error (Resolved)
- is related to
  - SPARK-3218 K-Means clusterer can fail on degenerate data (Resolved)