  1. Spark
  2. SPARK-15526

Shade JPMML

Details

    • Type: Dependency upgrade
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.3.0
    • Component/s: ML, MLlib
    • Labels: None

    Description

      The Spark-MLlib module depends on the JPMML-Model library (org.jpmml:pmml-model:1.2.7) for its PMML export capabilities. The JPMML-Model library is included in the Apache Spark assembly, which makes it very difficult to build and deploy competing PMML exporters that may wish to depend on different versions (typically much newer) of the same library.

      JPMML-Model library classes are not part of the Apache Spark public API, so relocating them should not cause problems: the Maven Shade Plugin can prepend the prefix "org.spark_project" to their package names. The requested treatment is identical to how the Google Guava and Jetty dependencies are shaded in the final assembly.
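
      For illustration, a relocation of this kind might be configured roughly as follows. This is a minimal sketch, not the change that was actually committed to the Spark build: it assumes that the JPMML-Model classes live under the "org.dmg.pmml" and "org.jpmml" packages, and it leaves out the plugin version and any other settings the real assembly needs.

      ```xml
      <!-- Sketch only: relocates JPMML-Model classes under the org.spark_project prefix. -->
      <!-- The two package patterns are assumptions about the library's layout. -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
            <configuration>
              <relocations>
                <!-- PMML schema classes (assumed package) -->
                <relocation>
                  <pattern>org.dmg.pmml</pattern>
                  <shadedPattern>org.spark_project.dmg.pmml</shadedPattern>
                </relocation>
                <!-- JPMML utility classes (assumed package) -->
                <relocation>
                  <pattern>org.jpmml</pattern>
                  <shadedPattern>org.spark_project.jpmml</shadedPattern>
                </relocation>
              </relocations>
            </configuration>
          </execution>
        </executions>
      </plugin>
      ```

      With relocations like these, the assembly would carry the library under the prefixed package names, so an application could bundle a newer JPMML-Model under the original names without a class conflict.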

      This issue is raised in relation to the JPMML-SparkML project (https://github.com/jpmml/jpmml-sparkml), which provides PMML export capabilities for Spark ML Pipelines. Currently, application developers who wish to use it must tweak their application classpath, which requires familiarity with build internals.

          People

            Assignee: Sean R. Owen (srowen)
            Reporter: Villu Ruusmann (vfed)
            Votes: 4
            Watchers: 9

              Time Tracking

                Estimated: 2h
                Remaining: 2h
                Logged: Not Specified