Spark / SPARK-3022

FindBinsForLevel in decision tree should call findBin only once for each feature

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.2
    • Fix Version/s: 1.1.0
    • Component/s: MLlib
    • Labels: None

    Description

      `findBinsForLevel` is applied to every `LabeledPoint` to find bins for all nodes at a given level. For a given `LabeledPoint` and a given feature, the bin the point falls into is always the same, regardless of the node. In the current implementation, however, `findBin` is called on each (`LabeledPoint`, feature) pair once for every node at the level, which wastes computation. I propose calling `findBin` only once per (`LabeledPoint`, feature) pair and reusing the result for every node on which the `LabeledPoint` is valid.
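
      The proposal can be illustrated with a minimal sketch. This is not the MLlib code: the names `find_bin`, `find_bins_for_level`, `thresholds_per_feature`, and `node_filters` are hypothetical, only continuous features are handled, and the bin boundary rule (lower threshold exclusive, upper threshold inclusive) is an assumption made for the example.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class LabeledPoint:
    """Stand-in for MLlib's LabeledPoint (hypothetical, simplified)."""
    label: float
    features: Sequence[float]


def find_bin(value: float, thresholds: Sequence[float]) -> int:
    """Return the bin index for `value` given sorted split thresholds.

    Assumption: bin i covers thresholds[i-1] < value <= thresholds[i].
    """
    return bisect_left(thresholds, value)


def find_bins_for_level(
    point: LabeledPoint,
    thresholds_per_feature: Sequence[Sequence[float]],
    node_filters: Sequence[Callable[[LabeledPoint], bool]],
) -> List[Optional[List[int]]]:
    """Compute bins once per feature, then reuse them for each node.

    The cost of binning is proportional to the number of features,
    not to (number of features) x (number of nodes at the level).
    """
    # One find_bin call per feature, independent of the node count.
    bins = [
        find_bin(value, thresholds)
        for value, thresholds in zip(point.features, thresholds_per_feature)
    ]
    # Reuse the same result for every node where the point is valid;
    # None marks nodes that the point does not reach.
    return [bins if is_valid(point) else None for is_valid in node_filters]
```

      In this sketch the list of bins is shared between nodes rather than copied, which is safe only as long as callers treat it as read-only.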

          People

            Assignee: Qiping Li (chouqin)
            Reporter: Qiping Li (chouqin)
            Votes: 0
            Watchers: 3

              Time Tracking

                Estimated: 4h
                Remaining: 4h
                Logged: Not Specified