Spark / SPARK-3380

DecisionTree: overflow and precision in aggregation

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Incomplete
    • Affects Version/s: 1.1.0
    • Fix Version/s: None
    • Component/s: MLlib

    Description

      DecisionTree does not check for overflow or loss of precision while aggregating sufficient statistics (binAggregates). The statistics are accumulated as Double values, which may be a problem for DecisionTree regression because the variance calculation could overflow or lose precision. At the least, it could check for overflow and renormalize as needed.
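
      The concern above can be illustrated with a minimal Python sketch. It is not Spark's code: it assumes, for illustration only, that the regression statistics for a bin are a count, a sum, and a sum of squares, and every function name here is hypothetical. The sketch contrasts that layout with a mergeable (count, mean, M2) representation, which is one possible way to "renormalize", and adds a simple finiteness check for overflow.

```python
import math


def naive_variance(values):
    """Population variance from a running count, sum, and sum of squares.

    Assumed layout of the aggregates; raises OverflowError if either
    running total stops being finite.
    """
    n = 0
    total = 0.0
    total_sq = 0.0
    for v in values:
        n += 1
        total += v
        total_sq += v * v
    if not (math.isfinite(total) and math.isfinite(total_sq)):
        raise OverflowError("sufficient statistics overflowed")
    mean = total / n
    # Subtracting two nearly equal large numbers: prone to cancellation.
    return total_sq / n - mean * mean


def welford_stats(values):
    """Return (count, mean, M2), where M2 is the sum of squared deviations."""
    n = 0
    mean = 0.0
    m2 = 0.0
    for v in values:
        n += 1
        delta = v - mean
        mean += delta / n
        m2 += delta * (v - mean)
    return n, mean, m2


def merge_stats(a, b):
    """Combine two (count, mean, M2) triples, e.g. from two partitions."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    if n == 0:
        return 0, 0.0, 0.0
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta * delta * n_a * n_b / n
    return n, mean, m2


def variance(stats):
    """Population variance from a (count, mean, M2) triple."""
    n, _, m2 = stats
    return m2 / n if n > 0 else 0.0


# Labels with a large offset; the true population variance is 22.5.
labels = [1e9 + x for x in (4.0, 7.0, 13.0, 16.0)]
merged = merge_stats(welford_stats(labels[:2]), welford_stats(labels[2:]))
print(naive_variance(labels))  # far from 22.5 because of cancellation
print(variance(merged))        # 22.5
```

      The (count, mean, M2) triple keeps the accumulated quantities on the scale of the deviations rather than the squared labels, which is why it tolerates the large offset here. Whether such a representation fits binAggregates is a design question this sketch does not answer.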

            People

              Assignee: Unassigned
              Reporter: Joseph K. Bradley (josephkb)
              Votes: 0
              Watchers: 3

              Dates

                Created:
                Updated:
                Resolved: