Spark / SPARK-2478

Add Python APIs for decision tree

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.1.0
    • Component/s: MLlib, PySpark
    • Labels: None

    Description

      In v1.0, we support decision trees only in Scala/Java. It would be nice to add Python support. This may require some refactoring of the current decision tree API to make it easier to construct a decision tree algorithm from Python.

      1. Simplify the decision tree constructors so that only simple types are used.
         a. Hide the implementation of Impurity from users.
         b. Replace enums with strings.
      2. Make separate public decision tree classes for regression and classification (with shared internals). Eliminate the algo parameter.
      3. Implement Python wrappers for DecisionTree.
      4. Implement Python wrappers for DecisionTreeModel.
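
      The API shape proposed in items 1 and 2 can be illustrated with a small, Spark-free sketch. Everything below is hypothetical and is not Spark's actual implementation: the function names (classifier_config, regressor_config), the default values, and the dictionary layout are assumptions chosen only to show string-named impurities, simple-typed arguments, and separate entry points that replace an algo parameter.

```python
import math

# Hypothetical impurity functions; users never touch these directly.
def gini(class_counts):
    total = sum(class_counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in class_counts)

def entropy(class_counts):
    total = sum(class_counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in class_counts if c > 0)

def variance(values):
    n = len(values)
    if n == 0:
        return 0.0
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n

# Strings replace enums: each task accepts only the names that make sense for it.
_CLASSIFICATION_IMPURITIES = {"gini": gini, "entropy": entropy}
_REGRESSION_IMPURITIES = {"variance": variance}

def _lookup(name, table):
    """Resolve an impurity name to its hidden implementation."""
    try:
        return table[name]
    except KeyError:
        raise ValueError(
            "unknown impurity %r, expected one of %s" % (name, sorted(table))
        )

# Separate entry points stand in for the algo parameter; both build the
# same internal configuration (hypothetical default values).
def classifier_config(num_classes, impurity="gini", max_depth=5, max_bins=32):
    return {
        "task": "classification",
        "num_classes": num_classes,
        "impurity": _lookup(impurity, _CLASSIFICATION_IMPURITIES),
        "max_depth": max_depth,
        "max_bins": max_bins,
    }

def regressor_config(impurity="variance", max_depth=5, max_bins=32):
    return {
        "task": "regression",
        "impurity": _lookup(impurity, _REGRESSION_IMPURITIES),
        "max_depth": max_depth,
        "max_bins": max_bins,
    }
```

      Because every argument is an int or a string, a Python wrapper would only need to forward plain values to the JVM side, and an invalid combination such as classifier_config(2, impurity="variance") fails early with a ValueError.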

            People

              Assignee: josephkb Joseph K. Bradley
              Reporter: mengxr Xiangrui Meng
              Votes: 0
              Watchers: 2

              Dates

                Created:
                Updated:
                Resolved: