Spark / SPARK-19498

Discussion: Making MLlib APIs extensible for 3rd party libraries

Details

    • Type: Brainstorming
    • Status: Resolved
    • Priority: Critical
    • Resolution: Incomplete
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: ML

    Description

      Per the recent discussion on the dev list, this JIRA is for discussing how we can make MLlib DataFrame-based APIs more extensible, especially for the purpose of writing 3rd-party libraries with APIs extended from the MLlib APIs (for custom Transformers, Estimators, etc.).

      • For people who have written such libraries, what issues have you run into?
      • What APIs are not public or extensible enough? Do they require changes before being made more public?
      • Are APIs for non-Scala languages such as Java and Python friendly or extensive enough?

      The easy answer is to make everything public, but that would of course be terrible in the long term. Let's discuss what is needed and how we can present stable, sufficient, and easy-to-use APIs for 3rd-party developers.

            People

              Assignee: Unassigned
              Reporter: Joseph K. Bradley (josephkb)
              Votes: 2
              Watchers: 13