Spark / SPARK-17048

ML model read for custom transformers in a pipeline does not work

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.0.0
    • Fix Version/s: None
    • Component/s: ML
    • Environment:

      Spark 2.0.0
      Java API

      Description

      Steps to reproduce:
      0. Use the Java API.
      1. Create any custom ML transformer.
      2. Make it MLReadable and MLWritable, for instance by using DefaultParamsReader and DefaultParamsWriter.
      3. Add it to a pipeline.
      4. Fit a model, e.g. CrossValidatorModel, and save the result to disk.
      5. Load the model from the saved directory.

      Result: all out-of-the-box objects (Pipeline, Evaluator, etc.) load successfully, but the custom transformer fails with a NullPointerException.

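      Reason:
      ReadWrite.scala:447
      cls.getMethod("read").invoke(null).asInstanceOf[MLReader[T]].load(path)

      Calling invoke(null) only works when read is a static method. A Java class that implements MLReadable or MLWritable defines read as an instance method, so invoking it with a null receiver throws the NullPointerException.
      For such classes this should be an instance method call.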
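
The reflection behaviour described above can be reproduced without Spark. The following is a minimal sketch, not Spark code: the class names ReflectiveReadDemo, WithStaticRead and WithInstanceRead are hypothetical stand-ins, and read() returns a plain String instead of an MLReader. It only illustrates how Method.invoke(null) treats a static method versus an instance method.

```java
import java.lang.reflect.Method;

public class ReflectiveReadDemo {

    // Hypothetical stand-in for a class that exposes a static read().
    public static class WithStaticRead {
        public static String read() {
            return "reader";
        }
    }

    // Hypothetical stand-in for a class whose read() is an instance method,
    // as happens when a Java class implements an interface declaring read().
    public static class WithInstanceRead {
        public String read() {
            return "reader";
        }
    }

    // Same shape as cls.getMethod("read").invoke(null): look up a public
    // no-arg method named "read" and call it with a null receiver.
    public static String invokeRead(Class<?> cls) {
        try {
            Method read = cls.getMethod("read");
            return "ok: " + read.invoke(null);
        } catch (NullPointerException e) {
            // Thrown by Method.invoke when the receiver is null
            // and the target is an instance method.
            return "NullPointerException";
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(invokeRead(WithStaticRead.class));   // ok: reader
        System.out.println(invokeRead(WithInstanceRead.class)); // NullPointerException
    }
}
```

Under these assumptions, giving the custom transformer a public static read() method would let the reflective lookup succeed; whether that is the intended contract for Java implementations is not confirmed by this report.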

              People

              • Assignee: Unassigned
              • Reporter: Taras Matyashovskyy (taras.matyashovsky@gmail.com)
              • Votes: 0
              • Watchers: 5

                  Time Tracking

                  Original Estimate: 2h
                  Remaining Estimate: 2h
                  Time Spent: Not Specified