Details
- Bug
- Status: Resolved
- Major
- Resolution: Done
- 2.4.0
- None
- None
Description
Running the following code makes PowerIterationClustering in spark.ml throw an exception:
val data = spark.createDataFrame(Seq(
  (0, Array(1), Array(0.9)),
  (1, Array(2), Array(0.9)),
  (2, Array(3), Array(0.9)),
  (3, Array(4), Array(0.1)),
  (4, Array(5), Array(0.9))
)).toDF("id", "neighbors", "similarities")

val result = new PowerIterationClustering()
  .setK(2)
  .setMaxIter(10)
  .setInitMode("random")
  .transform(data)
  .select("id", "prediction")
org.apache.spark.sql.AnalysisException: cannot resolve '`prediction`' given input columns: [id, neighbors, similarities];;
'Project [id#215, 'prediction]
+- AnalysisBarrier
      +- Project [id#215, neighbors#216, similarities#217]
         +- Join Inner, (id#215 = id#234)
            :- Project [_1#209 AS id#215, _2#210 AS neighbors#216, _3#211 AS similarities#217]
            :  +- LocalRelation [_1#209, _2#210, _3#211]
            +- Project [cast(id#230L as int) AS id#234]
               +- LogicalRDD [id#230L, prediction#231], false

    at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:88)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:85)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:289)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:289)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:288)
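Reading the logical plan in the trace bottom-up suggests where the column is lost: the cluster assignments (LogicalRDD with id and prediction) are projected down to id alone before the join, and the join result is then projected back to the input columns. The sketch below is a toy model of that plan in plain Scala, not Spark's Catalyst classes. The types Relation, Project, Join and the helper canResolve are hypothetical names introduced only for illustration, and the model tracks column names only.

```scala
// Toy model of the logical plan in the trace above. These types are
// illustrative only and are NOT Spark's Catalyst classes.
sealed trait Plan { def output: Seq[String] }

// A leaf relation exposes its columns as-is.
final case class Relation(columns: Seq[String]) extends Plan {
  def output: Seq[String] = columns
}

// A projection exposes only the columns it lists.
final case class Project(columns: Seq[String], child: Plan) extends Plan {
  def output: Seq[String] = columns
}

// An inner join exposes the columns of both sides.
final case class Join(left: Plan, right: Plan) extends Plan {
  def output: Seq[String] = left.output ++ right.output
}

object PlanColumns {
  // Input DataFrame: _1, _2, _3 renamed to id, neighbors, similarities.
  val input: Plan =
    Project(Seq("id", "neighbors", "similarities"), Relation(Seq("_1", "_2", "_3")))

  // Cluster assignments, projected to id only, so prediction is dropped here.
  val assignments: Plan =
    Project(Seq("id"), Relation(Seq("id", "prediction")))

  // Result of transform as shown in the trace: only input columns survive.
  val transformed: Plan =
    Project(Seq("id", "neighbors", "similarities"), Join(input, assignments))

  // True when every requested column is available from the plan's output.
  def canResolve(plan: Plan, requested: Seq[String]): Boolean =
    requested.forall(plan.output.contains)

  def main(args: Array[String]): Unit = {
    println(transformed.output.mkString(", "))                 // id, neighbors, similarities
    println(canResolve(transformed, Seq("id", "prediction")))  // false
  }
}
```

In this model, select("id", "prediction") cannot succeed because neither projection above the LogicalRDD keeps prediction, which matches the "given input columns: [id, neighbors, similarities]" part of the error message.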
Issue Links
- relates to SPARK-15784 Add Power Iteration Clustering to spark.ml (Resolved)
- links to