Details

Type: New Feature

Status: Open

Priority: Major

Resolution: Unresolved

Affects Version/s: 0.5.0

Fix Version/s: None

Component/s: None

Labels: None

Environment:
UDF, written in Pig 0.5 contrib/

Tags: contrib udf variance standard deviation
Description
I've implemented a UDF for Pig 0.5 that implements the Algebraic interface and calculates variance in a distributed manner, modeled on the AVG() builtin. It works by calculating the count, sum, and sum of squares, as described here: http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Parallel_algorithm
Is this a worthwhile contribution? Taking the square root of this value using the contrib SQRT() function gives Standard Deviation, which is missing from Pig.
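To illustrate the approach described above, here is a minimal sketch in plain Python. It is not the attached patch and does not use the Pig API. The function names initial, combine and final are hypothetical and only loosely mirror the stages of an algebraic aggregate. The sketch assumes population variance (dividing by n), since the description does not say whether population or sample variance is intended.

```python
import math
from functools import reduce

def initial(values):
    """Map-side step: reduce one chunk of values to (count, sum, sum of squares)."""
    count, total, total_sq = 0, 0.0, 0.0
    for v in values:
        count += 1
        total += v
        total_sq += v * v
    return (count, total, total_sq)

def combine(a, b):
    """Combiner step: merge two partial results by adding them component-wise."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])

def final(partial):
    """Reduce-side step: population variance as E[x^2] - (E[x])^2 (an assumption)."""
    count, total, total_sq = partial
    if count == 0:
        return None  # variance is undefined for an empty input
    mean = total / count
    return total_sq / count - mean * mean

# Three chunks stand in for data split across mappers.
chunks = [[2.0, 4.0, 4.0], [4.0, 5.0], [5.0, 7.0, 9.0]]
partials = [initial(c) for c in chunks]
variance = final(reduce(combine, partials))  # 4.0
std_dev = math.sqrt(variance)                # 2.0
```

Because the partial results are merged by plain addition, they can be combined in any order, which is what makes the computation distributable. One caveat: the sum-of-squares formula can lose precision through cancellation when the mean is large relative to the spread of the values.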
Activity
rjurney
created issue 
Olga Natkovich
made changes 
Field  Original Value  New Value 

Fix Version/s  0.7.0 [ 12314397 ]  
Fix Version/s  0.5.0 [ 12314213 ] 
Dmitriy V. Ryaboy
made changes 
Fix Version/s  0.8.0 [ 12314562 ]  
Fix Version/s  0.7.0 [ 12314397 ] 
Dmitriy V. Ryaboy
made changes 
Assignee  Dmitriy V. Ryaboy [ dvryaboy ] 
Olga Natkovich
made changes 
Fix Version/s  0.9.0 [ 12315191 ]  
Fix Version/s  0.8.0 [ 12314562 ] 
Olga Natkovich
made changes 
Fix Version/s  0.9.0 [ 12315191 ] 
Gavin
made changes 
Reporter  Russell Jurney [ rjurney ]  Russell Jurney [ russell.jurney ] 
Jenny Thompson
made changes 
Attachment  PIG1150.patch [ 12639895 ] 
Yes, it is definitely worthwhile to contribute!