Spark / SPARK-12606

Scala/Java compatibility issue Re: how to extend java transformer from Scala UnaryTransformer ?


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 1.5.2
    • Fix Version/s: None
    • Component/s: ML
    • Environment: Java 8, Mac OS, Spark-1.5.2

    Description

      Hi Andy,

      I suspect you hit a Scala/Java compatibility issue. I can reproduce it too, so could you file a JIRA to track it?

      Yanbo

      2016-01-02 3:38 GMT+08:00 Andy Davidson <Andy@santacruzintegration.com>:
      I am trying to write a trivial transformer that I can use in my pipeline. I am using Java and Spark 1.5.2. It was suggested that I use the Tokenizer.scala class as an example. This should be very easy; however, I do not understand Scala, and I am having trouble debugging the following exception.

      Any help would be greatly appreciated.

      Happy New Year

      Andy

      java.lang.IllegalArgumentException: requirement failed: Param null__inputCol does not belong to Stemmer_2f3aa96d-7919-4eaa-ad54-f7c620b92d1c.
      at scala.Predef$.require(Predef.scala:233)
      at org.apache.spark.ml.param.Params$class.shouldOwn(params.scala:557)
      at org.apache.spark.ml.param.Params$class.set(params.scala:436)
      at org.apache.spark.ml.PipelineStage.set(Pipeline.scala:37)
      at org.apache.spark.ml.param.Params$class.set(params.scala:422)
      at org.apache.spark.ml.PipelineStage.set(Pipeline.scala:37)
      at org.apache.spark.ml.UnaryTransformer.setInputCol(Transformer.scala:83)
      at com.pws.xxx.ml.StemmerTest.test(StemmerTest.java:30)

      public class StemmerTest extends AbstractSparkTest {
          @Test
          public void test() {
              Stemmer stemmer = new Stemmer()
                  .setInputCol("raw") // line 30
                  .setOutputCol("filtered");
          }
      }

      /**

          @Override
          public String uid() {
              return uid;
          }

          /*
          override protected def validateInputType(inputType: DataType): Unit = {
              require(inputType == StringType, s"Input type must be string type but got $inputType.")
          }
          */
          @Override
          public void validateInputType(DataType inputTypeArg) {
              String msg = "inputType must be " + inputType.simpleString() + " but got " + inputTypeArg.simpleString();
              assert (inputType.equals(inputTypeArg)) : msg;
          }

          @Override
          public Function1<List<String>, List<String>> createTransformFunc() {
              // http://stackoverflow.com/questions/6545066/using-scala-from-java-passing-functions-as-parameters
              Function1<List<String>, List<String>> f = new AbstractFunction1<List<String>, List<String>>() {
                  public List<String> apply(List<String> words) {
                      for (String word : words) {
                          logger.error("AEDWIP input word: {}", word);
                      }
                      return words;
                  }
              };

              return f;
          }

          @Override
          public DataType outputDataType() {
              return DataTypes.createArrayType(DataTypes.StringType, true);
          }

      }
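
      The "null__inputCol" in the exception message hints that the param's parent uid was still null when inputCol was created. One plausible cause, assuming uid is an instance field initialized in the Java class (its declaration is not shown above), is that the superclass constructor calls uid() before the subclass field initializer has run. The sketch below reproduces that initialization order in plain Java without Spark. Base, EagerUid, and LazyUid are hypothetical stand-ins, not Spark classes, and the lazy variant is only one possible workaround.

```java
import java.util.UUID;

// Hypothetical stand-in for a base class that reads uid() during construction,
// the way a param captures its parent's uid when the param is created.
abstract class Base {
    final String paramParent;

    Base() {
        // Runs before any subclass field initializer has executed.
        this.paramParent = uid();
    }

    abstract String uid();

    // Mirrors an ownership check: the captured parent must equal the current uid.
    boolean owns() {
        return uid().equals(paramParent);
    }
}

// Field initializer runs after Base(), so Base() sees null.
class EagerUid extends Base {
    private final String uid = "Stemmer_" + UUID.randomUUID();

    @Override
    String uid() {
        return uid;
    }
}

// No field initializer: the value assigned during Base() is kept afterwards.
class LazyUid extends Base {
    private String uid;

    @Override
    String uid() {
        if (uid == null) {
            uid = "Stemmer_" + UUID.randomUUID();
        }
        return uid;
    }
}

class UidInitDemo {
    public static void main(String[] args) {
        System.out.println("eager parent: " + new EagerUid().paramParent);
        System.out.println("eager owns:   " + new EagerUid().owns());
        System.out.println("lazy owns:    " + new LazyUid().owns());
    }
}
```

      With the eager field, the captured parent is null and the ownership check fails. With the lazy getter, both values agree because uid() assigns the field the first time it is called, even from the superclass constructor.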

            People

              Assignee: Unassigned
              Reporter: Andrew Davidson (aedwip)
              Votes: 2
              Watchers: 8