Details

Type: Wish

Status: Resolved

Priority: Minor

Resolution: Duplicate

Affects Version/s: None

Fix Version/s: None

Component/s: Query Processor

Labels: None
Description
The last step in the unholy triumvirate of statistical builtins is the variance. We already have n (count) and the mean (avg). I currently have a job or two that funnels all of the data into a single reducer, which just computes mean/n/variance and writes the result to a table, so my guess is that a builtin variance would be a pretty big speed increase. Not a huge deal though, as computing the variance myself is trivial.
(Average, variance, and n can be computed together in one pass, so if you're doing var() you basically get avg() and count() for free.)
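The one-pass claim above can be sketched in Python as follows. This is only an illustration, not the code behind any builtin: it uses Welford's online update, plus the standard pairwise merge so partial results from several workers could be combined instead of sending every row to one reducer. The class name `RunningStats` and its methods are hypothetical, and the choice of population variance (dividing by n) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Hypothetical accumulator: count, mean and variance in one pass."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def add(self, x: float) -> None:
        # Welford's update: one value at a time, no second pass needed.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other: "RunningStats") -> "RunningStats":
        # Combine two partial results (e.g. from two workers).
        if other.n == 0:
            return RunningStats(self.n, self.mean, self.m2)
        if self.n == 0:
            return RunningStats(other.n, other.mean, other.m2)
        n = self.n + other.n
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / n
        return RunningStats(n, mean, m2)

    def variance(self) -> float:
        # Population variance (assumption); use m2 / (n - 1) for the
        # sample variance instead.
        return self.m2 / self.n if self.n else float("nan")
```

Feeding the values 2, 4, 4, 4, 5, 5, 7, 9 through `add` should leave `n` at 8, `mean` at about 5 and `variance()` at about 4, which shows count and average falling out of the same pass as the variance.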
Activity
VAR and SD were added months ago.