Details
New Feature
Status: Resolved
Major
Resolution: Incomplete
1.5.0
None
Description
MultivariateOnlineSummarizer for weighted instances was implemented as a private API in SPARK-7685.
SPARK-7685 implements an online, numerically stable version of the unbiased variance estimator defined by reliability weights ([https://en.wikipedia.org/wiki/Weighted_arithmetic_mean#Reliability_weights]). We would like to make it a public API, since it has several different use cases.
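For reference, a minimal single-variable sketch of such an estimator in Python. This is not the Spark implementation: the class name `WeightedOnlineStats` and its fields are hypothetical, and the update rule is the standard incremental weighted mean/variance recurrence, with the reliability-weights correction `V1 - V2 / V1` (where `V1` is the sum of weights and `V2` the sum of squared weights) applied at the end.

```python
class WeightedOnlineStats:
    """Hypothetical one-pass weighted mean/variance accumulator (illustration only)."""

    def __init__(self):
        self.weight_sum = 0.0     # V1: sum of weights
        self.weight_sq_sum = 0.0  # V2: sum of squared weights
        self.mean = 0.0           # running weighted mean
        self.m2 = 0.0             # running weighted sum of squared deviations

    def add(self, x, w=1.0):
        if w < 0.0:
            raise ValueError("weight must be non-negative")
        if w == 0.0:
            return  # a zero-weight instance contributes nothing
        self.weight_sum += w
        self.weight_sq_sum += w * w
        delta = x - self.mean
        self.mean += (w / self.weight_sum) * delta
        # Uses the old mean (via delta) and the updated mean, which avoids
        # the cancellation of the naive sum-of-squares formula.
        self.m2 += w * delta * (x - self.mean)

    def variance(self):
        """Unbiased variance estimate under reliability weights."""
        if self.weight_sum == 0.0:
            return 0.0
        denom = self.weight_sum - self.weight_sq_sum / self.weight_sum
        if denom <= 0.0:
            return 0.0  # e.g. a single instance: variance is undefined
        return self.m2 / denom
```

With all weights equal to 1, the denominator reduces to `n - 1`, so the result matches the usual unbiased sample variance.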
Currently, `count` returns the actual number of instances and ignores instance weights, whereas `numNonzeros` returns the weighted number of nonzeros.
We need to decide on the behavior of both before making the API public.
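To make the inconsistency concrete, here is a small Python sketch that computes both the unweighted and the weighted variant of each statistic. The function `summarize` and the dictionary keys are hypothetical names used for illustration only; they are not part of the Spark API.

```python
def summarize(rows, weights):
    """Compute unweighted and weighted counts and per-column nonzero counts.

    Hypothetical helper for illustration; rows is a list of equal-length
    lists of floats and weights holds one non-negative weight per row.
    """
    num_cols = len(rows[0]) if rows else 0
    count = 0                        # number of instances, weights ignored
    weight_sum = 0.0                 # weighted "count"
    nnz_unweighted = [0] * num_cols  # plain number of nonzero entries per column
    nnz_weighted = [0.0] * num_cols  # sum of weights of nonzero entries per column

    for row, w in zip(rows, weights):
        count += 1
        weight_sum += w
        for j, value in enumerate(row):
            if value != 0.0:
                nnz_unweighted[j] += 1
                nnz_weighted[j] += w

    return {
        "count": count,
        "weight_sum": weight_sum,
        "nnz_unweighted": nnz_unweighted,
        "nnz_weighted": nnz_weighted,
    }
```

In these terms, the current behavior described above pairs the unweighted `count` with the weighted nonzero count. A public API would presumably pick one convention for both, or expose both variants under distinct names.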