  1. Spark
  2. SPARK-35193

Scala/Java compatibility issue Re: how to use externalResource in Java transformer from Scala Transformer?


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 3.1.1
    • Fix Version/s: None
    • Component/s: Java API, ML
    • Labels: None

    Description

      I am trying to make a custom transformer use an externalResource, because it needs a large table to do the transformation. I'm not very familiar with Scala syntax, but based on snippets found on the internet I think I've written a proper Java implementation. I am running into the following error:

      Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: Param HardMatchDetector_d95b8f699114__externalResource does not belong to HardMatchDetector_d95b8f699114.
      at scala.Predef$.require(Predef.scala:281)
      at org.apache.spark.ml.param.Params.shouldOwn(params.scala:851)
      at org.apache.spark.ml.param.Params.set(params.scala:727)
      at org.apache.spark.ml.param.Params.set$(params.scala:726)
      at org.apache.spark.ml.PipelineStage.set(Pipeline.scala:41)
      at org.apache.spark.ml.param.Params.set(params.scala:713)
      at org.apache.spark.ml.param.Params.set$(params.scala:712)
      at org.apache.spark.ml.PipelineStage.set(Pipeline.scala:41)
      at HardMatchDetector.setResource(HardMatchDetector.java:45)

       

      Code as follows:

      public class HardMatchDetector extends Transformer implements DefaultParamsWritable, DefaultParamsReadable, Serializable {
          public String inputColumn = "value";
          public String outputColumn = "hardMatches";
          private ExternalResourceParam resourceParam = new ExternalResourceParam(this, "externalResource", "external resource, parquet file with 2 columns, one names and one wordcount");
          private String uid;

          public HardMatchDetector setResource(final ExternalResource value) {
              return (HardMatchDetector) this.set(this.resourceParam, value);
          }

          public HardMatchDetector setResource(final String path) {
              return this.setResource(new ExternalResource(path, ReadAs.TEXT(), new HashMap()));
          }

          @Override
          public String uid() {
              return getUid();
          }

          private String getUid() {
              if (uid == null) {
                  uid = Identifiable$.MODULE$.randomUID("HardMatchDetector");
              }
              return uid;
          }

          @Override
          public Dataset<Row> transform(final Dataset<?> dataset) {
              return dataset;
          }

          @Override
          public StructType transformSchema(StructType schema) {
              return schema.add(DataTypes.createStructField(outputColumn, DataTypes.StringType, true));
          }

          @Override
          public Transformer copy(ParamMap extra) {
              return new HardMatchDetector();
          }
      }

      public class HardMatcherTest extends AbstractSparkTest {
          @Test
          public void test() {
              var hardMatcher = new HardMatchDetector().setResource(pathName);
          }
      }
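
One plausible reading of the stack trace, not verified against this exact setup: `Params.shouldOwn` checks two things, that the param's parent matches the stage's uid and that the stage reports a param with that name. Spark ML finds a stage's params by reflection over public, zero-argument methods whose return type is a `Param`. A Scala `val` gets such an accessor automatically, but a private Java field does not, so the param would never be discovered. The sketch below reproduces that mechanism with simplified stand-in classes. `SketchParam`, `SketchParams`, `PrivateFieldStage` and `PublicAccessorStage` are hypothetical names, not Spark classes.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for a Spark ML Param: only parent uid and name.
final class SketchParam {
    final String parent;
    final String name;

    SketchParam(String parent, String name) {
        this.parent = parent;
        this.name = name;
    }
}

// Simplified stand-in for the Params trait (assumption: mirrors its reflection logic).
abstract class SketchParams {
    abstract String uid();

    // Collect params from public, zero-argument methods that return a SketchParam.
    List<SketchParam> params() {
        List<SketchParam> found = new ArrayList<>();
        for (Method m : getClass().getMethods()) {
            if (Modifier.isPublic(m.getModifiers())
                    && SketchParam.class.isAssignableFrom(m.getReturnType())
                    && m.getParameterCount() == 0) {
                try {
                    found.add((SketchParam) m.invoke(this));
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        return found;
    }

    // Ownership check: parent uid must match AND the param must be discoverable by name.
    boolean owns(SketchParam p) {
        boolean discovered = params().stream().anyMatch(q -> q.name.equals(p.name));
        return p.parent.equals(uid()) && discovered;
    }
}

// Same shape as the reported class: the param exists only as a private field.
final class PrivateFieldStage extends SketchParams {
    private final SketchParam resource = new SketchParam(uid(), "externalResource");

    @Override
    String uid() { return "stage_1"; }

    boolean ownsResource() { return owns(resource); }
}

// Variant that exposes the param through a public zero-argument accessor.
final class PublicAccessorStage extends SketchParams {
    private final SketchParam resource = new SketchParam(uid(), "externalResource");

    @Override
    String uid() { return "stage_2"; }

    public SketchParam externalResource() { return resource; }

    boolean ownsResource() { return owns(resource); }
}

final class ParamOwnershipSketch {
    public static void main(String[] args) {
        System.out.println(new PrivateFieldStage().ownsResource());   // false
        System.out.println(new PublicAccessorStage().ownsResource()); // true
    }
}
```

If this reading is right, adding a public accessor to `HardMatchDetector` that returns `resourceParam` (for example a hypothetical `public ExternalResourceParam externalResource()`) should let `set` pass the ownership check. Note that `ExternalResourceParam`, `ExternalResource` and `ReadAs` are not part of Apache Spark itself.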
      

       

       


          People

            Assignee: Unassigned
            Reporter: Arthur (ArthurJochems)