  1. Spark
  2. SPARK-15069

GSoC 2016: Exposing more R and Python APIs for MLlib

    Details

    • Type: Umbrella
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: ML, PySpark, SparkR

      Description

      This issue tracks the Google Summer of Code 2016 project by Kai Jiang: "Apache Spark: Exposing more R and Python APIs for MLlib".

      See the attached proposal for details. Note that the tasks listed in the proposal are tentative and may change as the community works on these parts of MLlib.

      This umbrella will contain links for tasks included in this project, to be added as each task begins.

          Activity

          josephkb Joseph K. Bradley added a comment -

          5/23/2016 - Week 1

          Initial items

          • QA (SPARK-15490)
          • Ping about helping with (SPARK-14809)
          • Ping on (SPARK-15439)
          • If time permits, begin work on Decision Tree API for SparkR
            • Create JIRA
            • Propose API
            • Prototype locally
          josephkb Joseph K. Bradley added a comment -

          5/31/2016 - Week 2

          To-do items

          • Send minor PR to link R/DOCUMENTATION.md from R/README.md
          • Help with R programming guide update: SPARK-15672
          • Decision Tree API for SparkR
            • Create JIRA
            • Propose API, referencing MLlib API + R libraries. Could do this on the JIRA, or in a linked doc
            • Create MVP based on existing MLlib APIs
            • Later, we can add more functionality, such as viewing the structure of the tree from R
          josephkb Joseph K. Bradley added a comment -

          6/6/2016 - Week 3

          To-do items

          • Continuation of items from previous week
          • If there is time, start consideration of random forests + boosting.
          • JIRA for tree API: SPARK-15767
          josephkb Joseph K. Bradley added a comment -

          6/16/2016 - Week 4

          To-do items

          • Continuation of doc items: SPARK-15672
          • Decision tree API SPARK-15767 -> I'll add notes to this JIRA
          • If there is time, begin work on forests or boosting.
          vectorijk Kai Jiang added a comment -

          6/22/2016 - Week 5
          To-do items

          • Keep investigating differences between the MLlib API and the R APIs for Decision Tree
          • Start the same investigation for Random Forest
          • Continue updating the PR for the Decision Tree wrapper based on the investigation.

            People

            • Assignee: vectorijk Kai Jiang
            • Reporter: josephkb Joseph K. Bradley
            • Shepherd: Joseph K. Bradley