SPARK-12381 (sub-task of SPARK-12326: Move GBT implementation from spark.mllib to spark.ml)

Copy public decision tree helper classes from spark.mllib to spark.ml and make private


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: ML, MLlib
    • Labels: None

    Description

      The helper classes for decision trees and decision tree ensembles (e.g. Impurity, InformationGainStats, ImpurityStats, DTStatsAggregator) currently reside in spark.mllib. As the algorithm implementations move to spark.ml, these helper classes should move with them.

      We should take this opportunity to make these helper classes private where possible (especially those needed only during training) and possibly to change their APIs (especially where that eliminates duplicate data stored in the final model).
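
      A minimal sketch of what "make private" could look like in Scala, assuming a qualified access modifier is used. The names `ml`, `GiniSketch`, and `TreeSketch` are hypothetical stand-ins for illustration, not the actual Spark classes or packages; the real code would use packages rather than nested objects.

```scala
// Hypothetical stand-in for the spark.ml namespace; all names are illustrative.
object ml {
  // Training-only helper: private[ml] hides it from code outside `ml`.
  private[ml] object GiniSketch {
    // Gini impurity: 1 - sum over classes of (count / total)^2.
    def calculate(counts: Array[Double]): Double = {
      val total = counts.sum
      if (total == 0.0) 0.0
      else 1.0 - counts.map(c => (c / total) * (c / total)).sum
    }
  }

  // Public entry point: callers get the result without seeing the helper.
  object TreeSketch {
    def rootImpurity(labelCounts: Array[Double]): Double =
      GiniSketch.calculate(labelCounts)
  }
}
```

      With this layout, code outside `ml` can call `ml.TreeSketch.rootImpurity` but a direct reference to `ml.GiniSketch` does not compile, so the helper can later change without breaking a public API.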

            People

              Assignee: Unassigned
              Reporter: Seth Hendrickson (sethah)
              Votes: 1
              Watchers: 3