  1. Spark
  2. SPARK-3727

Trees and ensembles: More prediction functionality


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: MLlib

    Description

      DecisionTree and RandomForest currently predict only the most likely label for classification and the mean for regression. Additional information about each prediction would be useful.

      For classification: estimated probability of each possible label
      For regression: variance of estimate
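
      As a rough illustration of the two quantities requested above, the sketch below computes them from the training values that reached a single leaf. It is plain Python, not MLlib code; the function names and the idea of passing the leaf's labels or targets as a list are assumptions made for the example.

```python
from collections import Counter
from statistics import fmean, pvariance


def leaf_class_probabilities(leaf_labels, num_classes):
    """Estimate the probability of each label as the fraction of
    training labels at the leaf that equal it (hypothetical helper)."""
    if not leaf_labels:
        raise ValueError("leaf has no training labels")
    counts = Counter(leaf_labels)
    total = len(leaf_labels)
    # Labels are assumed to be integers in 0..num_classes-1.
    return [counts.get(k, 0) / total for k in range(num_classes)]


def leaf_mean_and_variance(leaf_targets):
    """Return the leaf's mean prediction and the population variance
    of the training targets at the leaf (hypothetical helper)."""
    if not leaf_targets:
        raise ValueError("leaf has no training targets")
    return fmean(leaf_targets), pvariance(leaf_targets)
```

      For example, a leaf that saw the labels 0, 1, 1, 1 would report probabilities 0.25 and 0.75 instead of only the majority label 1.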

      RandomForest could also create aggregate predictions in multiple ways:

      • Predict mean or median value for regression.
      • Compute variance of estimates (across all trees) for both classification and regression.
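
      The aggregation options listed above could look roughly like the following. This is a minimal plain-Python sketch, not the MLlib API: it assumes the per-tree predictions have already been collected into lists, and all function names are hypothetical.

```python
from statistics import fmean, median, pvariance


def aggregate_regression(tree_predictions, how="mean"):
    """Combine per-tree regression predictions by mean or median."""
    if how == "mean":
        return fmean(tree_predictions)
    if how == "median":
        return median(tree_predictions)
    raise ValueError(f"unknown aggregation: {how!r}")


def variance_across_trees(tree_predictions):
    """Population variance of the per-tree regression predictions."""
    return pvariance(tree_predictions)


def class_probability_variance(per_tree_probabilities):
    """For classification: variance, across trees, of the probability
    each tree assigns to each label. Input is one probability vector
    per tree; output is one variance per label."""
    return [pvariance(column) for column in zip(*per_tree_probabilities)]
```

      The median is less sensitive than the mean to a few trees with extreme predictions, which is one reason to offer both.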


          People

            Assignee: Unassigned
            Reporter: Joseph K. Bradley (josephkb)
            Votes: 9
            Watchers: 25

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated: 72h
                Remaining: 72h
                Logged: Not Specified
