Description
I'm trying to make a custom Spark ML Transformer use an ExternalResource, since it needs a large lookup table to do its transformation. I'm not very familiar with Scala syntax, but from snippets found online I think I've put together a proper Java implementation. I'm running into the following error:
```
Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: Param HardMatchDetector_d95b8f699114__externalResource does not belong to HardMatchDetector_d95b8f699114.
    at scala.Predef$.require(Predef.scala:281)
    at org.apache.spark.ml.param.Params.shouldOwn(params.scala:851)
    at org.apache.spark.ml.param.Params.set(params.scala:727)
    at org.apache.spark.ml.param.Params.set$(params.scala:726)
    at org.apache.spark.ml.PipelineStage.set(Pipeline.scala:41)
    at org.apache.spark.ml.param.Params.set(params.scala:713)
    at org.apache.spark.ml.param.Params.set$(params.scala:712)
    at org.apache.spark.ml.PipelineStage.set(Pipeline.scala:41)
    at HardMatchDetector.setResource(HardMatchDetector.java:45)
```
The code is as follows:
```java
public class HardMatchDetector extends Transformer implements DefaultParamsWritable, DefaultParamsReadable, Serializable {

    public String inputColumn = "value";
    public String outputColumn = "hardMatches";

    private ExternalResourceParam resourceParam = new ExternalResourceParam(this, "externalResource",
            "external resource, parquet file with 2 columns, one names and one wordcount");

    private String uid;

    public HardMatchDetector setResource(final ExternalResource value) {
        return (HardMatchDetector) this.set(this.resourceParam, value);
    }

    public HardMatchDetector setResource(final String path) {
        return this.setResource(new ExternalResource(path, ReadAs.TEXT(), new HashMap()));
    }

    @Override
    public String uid() {
        return getUid();
    }

    private String getUid() {
        if (uid == null) {
            uid = Identifiable$.MODULE$.randomUID("HardMatchDetector");
        }
        return uid;
    }

    @Override
    public Dataset<Row> transform(final Dataset<?> dataset) {
        return dataset;
    }

    @Override
    public StructType transformSchema(StructType schema) {
        return schema.add(DataTypes.createStructField(outputColumn, DataTypes.StringType, true));
    }

    @Override
    public Transformer copy(ParamMap extra) {
        return new HardMatchDetector();
    }
}

public class HardMatcherTest extends AbstractSparkTest {

    @Test
    public void test() {
        var hardMatcher = new HardMatchDetector().setResource(pathName);
    }
}
```
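While digging around, I noticed that `Params.shouldOwn` also requires `hasParam(param.name)`, and Spark's `params` list appears to be built by reflection over public zero-argument methods whose return type is `Param`. My `resourceParam` is only a private field with no public getter, so maybe that's what fails here? A standalone sketch of how I understand that discovery works (no Spark dependency; `Param` below is a stand-in class, not Spark's):

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

public class ParamDiscoverySketch {

    // Mirrors (my reading of) Params.params: collect every public
    // zero-argument method whose return type is Param.
    static List<String> discoveredParams(Object owner) {
        List<String> names = new ArrayList<>();
        for (Method m : owner.getClass().getMethods()) {
            if (Modifier.isPublic(m.getModifiers())
                    && Param.class.isAssignableFrom(m.getReturnType())
                    && m.getParameterCount() == 0) {
                names.add(m.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(discoveredParams(new WithGetter()));    // [externalResource]
        System.out.println(discoveredParams(new WithoutGetter())); // []
    }
}

// Stand-in for org.apache.spark.ml.param.Param; not Spark's class.
class Param {
    final String name;
    Param(String name) { this.name = name; }
}

class WithGetter {
    private final Param resource = new Param("externalResource");

    // Public zero-arg accessor: visible to the reflection scan above.
    public Param externalResource() { return resource; }
}

class WithoutGetter {
    // Private field with no public accessor, like my resourceParam:
    // the reflection scan finds nothing.
    private final Param resource = new Param("externalResource");
}
```

If my reading is right, a public no-arg getter returning the param would make `hasParam` see it, but I'd appreciate confirmation that this is actually what `shouldOwn` trips on.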