[SPARK-17797] getNumClasses support non-double datatypes - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Duplicate
Affects Version/s: None
Fix Version/s: None
Component/s: ML
Labels:
None

Description

Without precomputed meta numValues, method Classifier.getNumClasses() do not support numeric types other than Double.

scala> val path = "/Users/zrf/.dev/spark-2.0.1-bin-hadoop2.7/data/mllib/sample_libsvm_data.txt"
path: String = /Users/zrf/.dev/spark-2.0.1-bin-hadoop2.7/data/mllib/sample_libsvm_data.txt

scala> val data = spark.read.format("libsvm").load(path).persist()
data: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [label: double, features: vector]

scala>

scala> val data2 = data.select(col("label").cast(LongType), col("features"))
data2: org.apache.spark.sql.DataFrame = [label: bigint, features: vector]

scala> val model = new NaiveBayes().fit(data)
model: org.apache.spark.ml.classification.NaiveBayesModel = NaiveBayesModel (uid=nb_1e27d7acf0b3) with 2 classes

scala> val model = new NaiveBayes().fit(data2)
java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Double
  at scala.runtime.BoxesRunTime.unboxToDouble(BoxesRunTime.java:114)
  at org.apache.spark.sql.Row$class.getDouble(Row.scala:242)
  at org.apache.spark.sql.catalyst.expressions.GenericRow.getDouble(rows.scala:192)
  at org.apache.spark.ml.classification.Classifier.getNumClasses(Classifier.scala:115)
  at org.apache.spark.ml.classification.NaiveBayes.train(NaiveBayes.scala:104)
  at org.apache.spark.ml.classification.NaiveBayes.train(NaiveBayes.scala:76)
  at org.apache.spark.ml.Predictor.fit(Predictor.scala:90)
  ... 54 elided

Attachments

Issue Links

duplicates

SPARK-17747 WeightCol support non-double datatypes

Resolved

links to

[Github] Pull Request #15372 (zhengruifeng)

Activity

People

Assignee:: Unassigned

Reporter:: Ruifeng Zheng

Votes:: 0 Vote for this issue

Watchers:: 2 Start watching this issue

Dates

Created:: 06/Oct/16 03:44

Updated:: 06/Oct/16 08:55

Resolved:: 06/Oct/16 08:55