Spark / SPARK-22198

Java incompatibility when extending UnaryTransformer or Transformer

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: Java API, ML

    Description

      It is not possible to create a proper custom Transformer in Java by extending UnaryTransformer or Transformer.

      The built-in Params (e.g. those defined in the HasInputCol trait) cannot be used, and custom Params cannot be added.

      It seems that the method 'uid()' is called during object creation, before the 'uid' constructor parameter can be stored in the object.

      This leads to the following error:

      java.lang.IllegalArgumentException: requirement failed: Param <prefix>1563950936fa_inputCol does not belong to <prefix>_d4105b75c4aa.

      If you extend UnaryTransformer and try to use it, e.g. through CrossValidator, you need to explicitly include a constructor that takes a String parameter. Judging by the source of the built-in transformers, this parameter is a 'uid' that should be stored in the object. However, it cannot be stored in time, because the uid() method is invoked (and its result may be used) before this constructor finishes.

      Sample class:

      public class TextCleaner extends UnaryTransformer<String, String, TextCleaner>
          implements Serializable, DefaultParamsWritable, DefaultParamsReadable<TextCleaner> {

        private static final long serialVersionUID = 2658543236303100458L;

        private static final String sparkUidPrefix = "TextCleaner";

        private final String sparkUid;

        public TextCleaner() {
          sparkUid = org.apache.spark.ml.util.Identifiable$.MODULE$.randomUID(sparkUidPrefix);
        }

        public TextCleaner(String uid) {
          sparkUid = uid;
        }

        @Override
        public String uid() {
          // This method is called by parent class, before object creation finishes
          return sparkUid;
        }

      ...
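
      The initialization-order problem described in this report can be reproduced without Spark. The following is a minimal sketch, not Spark code: Owner, Param, Cleaner and owns() are hypothetical stand-ins for the Scala base class, a Param, the custom transformer and the ownership check. It assumes only that the parent creates its params during construction and that each param reads the owner's uid() at that moment. Spark's actual internals may differ in detail; in particular, this model observes a null uid, whereas the error message in the report shows two different non-null uids.

```java
import java.util.Objects;

class UidInitOrderDemo {

    // Hypothetical stand-in for a Param: captures the owner's uid when created.
    static final class Param {
        final String parentUid;
        final String name;

        Param(Owner parent, String name) {
            // Virtual call: runs the subclass's uid() even if the
            // subclass constructor body has not executed yet.
            this.parentUid = parent.uid();
            this.name = name;
        }
    }

    // Hypothetical stand-in for the base class: builds its param during construction.
    abstract static class Owner {
        final Param inputCol = new Param(this, "inputCol");

        abstract String uid();

        // Models an ownership check of the "does not belong to" kind.
        boolean owns(Param p) {
            return Objects.equals(p.parentUid, uid());
        }
    }

    // Written like the sample class: the uid lives in a field set by the constructor.
    static final class Cleaner extends Owner {
        private final String sparkUid;

        Cleaner(String uid) {
            // The implicit super() call has already run, so inputCol
            // was created before this assignment.
            this.sparkUid = uid;
        }

        @Override
        String uid() {
            return sparkUid;
        }
    }

    public static void main(String[] args) {
        Cleaner c = new Cleaner("TextCleaner_1234");
        System.out.println("uid seen by param: " + c.inputCol.parentUid);   // null
        System.out.println("uid after construction: " + c.uid());           // TextCleaner_1234
        System.out.println("param belongs to owner: " + c.owns(c.inputCol)); // false
    }
}
```

      In this model the param records the uid before the subclass constructor has assigned the field, so the later ownership check fails. Java does not allow assigning an instance field before the superclass constructor runs, which is why the String constructor cannot set the uid "in time".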

            People

              Assignee: Unassigned
              Reporter: Akos Tomasits
              Votes: 1
              Watchers: 2