Description
Params.validateParams() is currently not called automatically. For example, the following code snippet sets min (10) greater than max (0) on a MinMaxScaler, yet it does not throw an exception, which is not the expected behavior:
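import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.feature.MinMaxScaler
import org.apache.spark.mllib.linalg.Vectors

val df = sqlContext.createDataFrame(
  Seq(
    (1, Vectors.dense(0.0, 1.0, 4.0), 1.0),
    (2, Vectors.dense(1.0, 0.0, 4.0), 2.0),
    (3, Vectors.dense(1.0, 0.0, 5.0), 3.0),
    (4, Vectors.dense(0.0, 0.0, 5.0), 4.0))
).toDF("id", "features", "label")

// Invalid configuration: min (10) is greater than max (0).
val scaler = new MinMaxScaler()
  .setInputCol("features")
  .setOutputCol("features_scaled")
  .setMin(10)
  .setMax(0)

val pipeline = new Pipeline().setStages(Array(scaler))

// Expected to fail parameter validation, but currently succeeds.
pipeline.fit(df)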
validateParams() should be called automatically by every PipelineStage (Pipeline/Estimator/Transformer), so I propose calling it from transformSchema().
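
To illustrate the idea, here is a minimal sketch that does not depend on Spark. The names Stage, doTransformSchema and ToyMinMaxScaler are hypothetical stand-ins, not the real Spark ML API, and the schema is reduced to a list of column names. It only shows how routing every caller through transformSchema() would make validation run without an explicit call; the actual change in Spark may look different.

```scala
// Hypothetical, simplified stand-in for a pipeline stage; not the Spark ML API.
trait Stage {
  // Check parameter values and their interactions; throw if invalid.
  def validateParams(): Unit = ()

  // Subclasses put their actual schema logic here (assumed helper name).
  protected def doTransformSchema(schema: Seq[String]): Seq[String]

  // All callers go through this method, so validation always runs first.
  final def transformSchema(schema: Seq[String]): Seq[String] = {
    validateParams()
    doTransformSchema(schema)
  }
}

// Toy scaler that mirrors the min/max constraint from the snippet above.
class ToyMinMaxScaler(
    min: Double,
    max: Double,
    inputCol: String,
    outputCol: String) extends Stage {

  override def validateParams(): Unit =
    require(min < max, s"min ($min) must be smaller than max ($max)")

  protected def doTransformSchema(schema: Seq[String]): Seq[String] = {
    require(schema.contains(inputCol), s"Input column $inputCol does not exist")
    schema :+ outputCol
  }
}
```

With this layout, a stage configured with min = 10 and max = 0 fails with an IllegalArgumentException as soon as transformSchema() is called, which is the behavior the snippet above is expected to have.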