SPARK-17797: getNumClasses support non-double datatypes

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: ML
    • Labels:
      None

      Description

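      When the label column's metadata does not carry a precomputed numValues, Classifier.getNumClasses() does not support numeric label types other than Double. It reads the maximum label with Row.getDouble, which throws a ClassCastException when the label is stored as another numeric type such as Long.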

      scala> val path = "/Users/zrf/.dev/spark-2.0.1-bin-hadoop2.7/data/mllib/sample_libsvm_data.txt"
      path: String = /Users/zrf/.dev/spark-2.0.1-bin-hadoop2.7/data/mllib/sample_libsvm_data.txt
      
      scala> val data = spark.read.format("libsvm").load(path).persist()
      data: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [label: double, features: vector]
      
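      scala> import org.apache.spark.sql.types.LongType
      import org.apache.spark.sql.types.LongType
      
      scala> import org.apache.spark.ml.classification.NaiveBayes
      import org.apache.spark.ml.classification.NaiveBayes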
      
      scala> val data2 = data.select(col("label").cast(LongType), col("features"))
      data2: org.apache.spark.sql.DataFrame = [label: bigint, features: vector]
      
      scala> val model = new NaiveBayes().fit(data)
      model: org.apache.spark.ml.classification.NaiveBayesModel = NaiveBayesModel (uid=nb_1e27d7acf0b3) with 2 classes
      
      scala> val model = new NaiveBayes().fit(data2)
      java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Double
        at scala.runtime.BoxesRunTime.unboxToDouble(BoxesRunTime.java:114)
        at org.apache.spark.sql.Row$class.getDouble(Row.scala:242)
        at org.apache.spark.sql.catalyst.expressions.GenericRow.getDouble(rows.scala:192)
        at org.apache.spark.ml.classification.Classifier.getNumClasses(Classifier.scala:115)
        at org.apache.spark.ml.classification.NaiveBayes.train(NaiveBayes.scala:104)
        at org.apache.spark.ml.classification.NaiveBayes.train(NaiveBayes.scala:76)
        at org.apache.spark.ml.Predictor.fit(Predictor.scala:90)
        ... 54 elided
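
      The stack trace points at Row.getDouble unboxing a java.lang.Long as a Double. The following is a minimal sketch of that failure mode in plain Scala, without Spark, together with one possible type-agnostic way to read the label. The object LabelUnboxing and its methods strictDouble, lenientDouble and numClasses are hypothetical names used only for illustration; they are not Spark APIs, and this is not necessarily how the issue was fixed upstream. The sketch also assumes labels are the class indices 0 to k - 1, so the class count is the maximum label plus one.

```scala
import scala.util.Try

object LabelUnboxing {
  // Mirrors the failing path in the stack trace: the boxed value is cast
  // straight to Double, which only succeeds for a java.lang.Double.
  def strictDouble(value: Any): Double = value.asInstanceOf[Double]

  // Hypothetical alternative: accept any boxed numeric and widen it to Double.
  def lenientDouble(value: Any): Double = value match {
    case n: java.lang.Number => n.doubleValue()
    case other =>
      throw new IllegalArgumentException(s"Label must be numeric, got: $other")
  }

  // Assumption: labels are indices 0, 1, ..., k - 1, so numClasses = max + 1.
  def numClasses(labels: Seq[Any]): Int = {
    require(labels.nonEmpty, "Cannot infer the number of classes from no labels")
    val maxLabel = labels.map(lenientDouble).reduce(_ max _)
    require(maxLabel >= 0 && maxLabel.isValidInt, s"Invalid max label: $maxLabel")
    maxLabel.toInt + 1
  }

  def main(args: Array[String]): Unit = {
    val longLabels: Seq[Any] = Seq(0L, 1L, 1L, 0L)
    // The strict cast fails with a ClassCastException for Long labels.
    println(Try(strictDouble(longLabels.head)).isFailure)
    // The lenient path handles Long labels and reports two classes.
    println(numClasses(longLabels))
  }
}
```

      In the session above, a likely workaround is to cast the label column back to double before fitting, for example data2.withColumn("label", col("label").cast(DoubleType)) after importing org.apache.spark.sql.types.DoubleType, since fit succeeds on the original double-typed data.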
      

              People

              • Assignee:
                Unassigned
              • Reporter:
                podongfeng zhengruifeng
              • Votes:
                0
              • Watchers:
                2
