  1. Spark
  2. SPARK-3157

Avoid duplicated stats in DecisionTree extractLeftRightNodeAggregates

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.2.0
    • Component/s: MLlib
    • Labels: None

    Description

      Improvement: computation, memory usage

      For ordered features, extractLeftRightNodeAggregates() computes a pair of cumulative sums for each split: one accumulating from the left end and one from the right end. The two are redundant, since either one can be recovered from the other together with the overall total. Compute only one of them.
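The redundancy can be illustrated with a minimal sketch in plain Python. This is not the MLlib code: the function name `left_right_from_single_cumsum` and the use of one scalar statistic per bin are assumptions made for illustration. A single left-to-right cumulative sum is computed, and each right-side statistic is derived as the total minus the matching left-side statistic.

```python
from itertools import accumulate


def left_right_from_single_cumsum(bin_stats):
    """Return (left, right) statistics for every split between adjacent bins.

    Hypothetical helper: bin_stats holds one number per bin of an ordered
    feature. Split i separates bins 0..i from bins i+1..end.
    """
    if not bin_stats:
        return [], []
    # One pass from the left: prefix[i] is the sum of bins 0..i.
    prefix = list(accumulate(bin_stats))
    total = prefix[-1]
    # The last prefix entry covers all bins, so it is not a valid split.
    left = prefix[:-1]
    # No second accumulation from the right end is needed.
    right = [total - p for p in left]
    return left, right


left, right = left_right_from_single_cumsum([3, 1, 4, 2])
print(left, right)
```

In an implementation that only needs the right-side value on demand, the `right` list could be dropped as well and `total - left[i]` evaluated at the point of use, which would save the memory for the second array.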
      For unordered features, the left and right aggregates are essentially the same data as the original aggregates, copied over but shifted by one index. Avoid the copy and read the original aggregates directly.
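One way to avoid the copy is to wrap the original aggregates in an index-translating view. The sketch below is plain Python and rests on an assumed layout that the issue does not specify: a flat list in which the left and right statistics of split i sit at positions 2*i and 2*i + 1. The class name `UnorderedSplitView` and its methods are hypothetical and are not part of MLlib.

```python
class UnorderedSplitView:
    """Read left/right statistics straight from the original aggregates.

    Assumed layout (illustrative only): [left_0, right_0, left_1, right_1, ...].
    """

    def __init__(self, aggregates):
        # Keep a reference to the original list; nothing is copied.
        self._agg = aggregates

    def num_splits(self):
        return len(self._agg) // 2

    def left(self, split):
        # The left statistic for a split sits at the even offset.
        return self._agg[2 * split]

    def right(self, split):
        # The right statistic is the same data, shifted by one index.
        return self._agg[2 * split + 1]


aggregates = [5.0, 2.0, 1.0, 6.0]
view = UnorderedSplitView(aggregates)
print(view.left(1), view.right(0))
```

Because the view only translates indices, its memory cost stays constant regardless of the number of splits, and later updates to the underlying list remain visible through it.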

            People

              Assignee: josephkb Joseph K. Bradley
              Reporter: josephkb Joseph K. Bradley
              Votes: 0
              Watchers: 3

              Dates

                Created:
                Updated:
                Resolved: