Description
Found while answering "Why does LogisticRegression fail with “IllegalArgumentException: org.apache.spark.ml.linalg.VectorUDT@3bfc3ba7”?" on StackOverflow.
When VectorAssembler is configured with input columns of an unsupported type, the error message reports only the data type, not the name(s) of the offending column(s).
The column name(s) should be included too.
// label is of StringType type
val va = new VectorAssembler().setInputCols(Array("bc", "pmi", "label"))

scala> va.transform(training)
java.lang.IllegalArgumentException: Data type StringType is not supported.
  at org.apache.spark.ml.feature.VectorAssembler$$anonfun$transformSchema$1.apply(VectorAssembler.scala:121)
  at org.apache.spark.ml.feature.VectorAssembler$$anonfun$transformSchema$1.apply(VectorAssembler.scala:117)
  at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
  at org.apache.spark.ml.feature.VectorAssembler.transformSchema(VectorAssembler.scala:117)
  at org.apache.spark.ml.PipelineStage.transformSchema(Pipeline.scala:74)
  at org.apache.spark.ml.feature.VectorAssembler.transform(VectorAssembler.scala:54)
  ... 48 elided
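
For illustration only, here is a rough sketch of what a check that names the offending columns might look like. It is not the actual VectorAssembler code and not a proposed patch: AssemblerInputCheck, unsupportedColumns and requireSupported are hypothetical names, and the set of accepted types (numeric, boolean, vector) is an assumption taken from the VectorAssembler documentation.

```scala
import org.apache.spark.ml.linalg.SQLDataTypes
import org.apache.spark.sql.types.{BooleanType, NumericType, StructType}

// Hypothetical helper, not part of Spark.
object AssemblerInputCheck {

  // Returns one "name (type)" entry per input column whose type is not
  // numeric, boolean or vector (assumed to be the supported set).
  def unsupportedColumns(schema: StructType, inputCols: Seq[String]): Seq[String] =
    inputCols.flatMap { name =>
      schema(name).dataType match {
        case _: NumericType | BooleanType      => None
        case t if t == SQLDataTypes.VectorType => None
        case other                             => Some(s"$name ($other)")
      }
    }

  // Throws IllegalArgumentException listing every unsupported column by name.
  def requireSupported(schema: StructType, inputCols: Seq[String]): Unit = {
    val bad = unsupportedColumns(schema, inputCols)
    require(bad.isEmpty, s"Unsupported input column(s): ${bad.mkString(", ")}")
  }
}
```

Until the message is improved, a call such as AssemblerInputCheck.requireSupported(training.schema, Seq("bc", "pmi", "label")) before va.transform(training) should point at label directly, assuming training has the schema described above.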