I noticed your implementation does not take ties in the data into account. Ties
can affect the correlation coefficient to a great degree. A nice approach to address ties is to rank the data adjusting the ranks of tied
See http://web.uccs.edu/lbecker/SPSS/ctabs2.htm#5E.%20Spearman%20Correlation,%
If we get into adding more non-parametric routines, I for one would like to see
Copied from the mailing list:
----- Original Message -----
From: "John Gant" <john.gant@gmail.com>
> > Specifically testTwo() in
> http://issues.apache.org/bugzilla/attachment.cgi?id=16172
> of data with equal value (ie equal rank), is this the type of
> situation to which you are referring? Yes I agree, we should implement
> routines to sort in more diverse ways, but for right now I depend upon
> Arrays.sort() to perform the sorting.

Yes, and in this case the implementation is incorrectly computing the spearman

> x <- c(2.0, 1.0, 3.0, 3.0, 5.0)

Thus, I hold the implementation needs to change to correctly rank data with

Created an attachment (id=16185)
[math] ranking

Implementation of a generic ranking algorithm where ties get equivalent rank.

Need to implement the following ranking algorithms from R:
http://www.maths.lth.se/help/R/.R/library/base/html/rank.html
a. first

Implementation committed in r778085. User guide updated in 786940.
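For readers following the thread, here is a minimal sketch of the tie-aware ranking discussed above, using the sample vector x from the earlier comment. It is not the code committed in r778085. The class and method names (AverageRankSketch, rank, spearman, pearson) are hypothetical. It assumes the "average" tie rule, where tied values share the mean of the positions they occupy, and it assumes the input contains no NaN values.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration only; not the Commons Math implementation.
class AverageRankSketch {

    /** Returns 1-based ranks; tied values share the mean of their sorted positions. */
    static double[] rank(double[] data) {
        int n = data.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        // Sort indices by value so the original positions are preserved.
        Arrays.sort(order, Comparator.comparingDouble(i -> data[i]));

        double[] ranks = new double[n];
        int start = 0;
        while (start < n) {
            // Extend [start, end] over a run of equal values (assumes no NaN).
            int end = start;
            while (end + 1 < n && data[order[end + 1]] == data[order[start]]) {
                end++;
            }
            // Mean of the 1-based positions start+1 .. end+1.
            double averageRank = (start + end) / 2.0 + 1.0;
            for (int k = start; k <= end; k++) {
                ranks[order[k]] = averageRank;
            }
            start = end + 1;
        }
        return ranks;
    }

    /** Spearman's rho computed as the Pearson correlation of the tie-adjusted ranks. */
    static double spearman(double[] x, double[] y) {
        return pearson(rank(x), rank(y));
    }

    /** Plain Pearson correlation; assumes equal lengths and non-constant inputs. */
    static double pearson(double[] a, double[] b) {
        int n = a.length;
        double meanA = 0.0;
        double meanB = 0.0;
        for (int i = 0; i < n; i++) {
            meanA += a[i];
            meanB += b[i];
        }
        meanA /= n;
        meanB /= n;
        double sab = 0.0;
        double saa = 0.0;
        double sbb = 0.0;
        for (int i = 0; i < n; i++) {
            double da = a[i] - meanA;
            double db = b[i] - meanB;
            sab += da * db;
            saa += da * da;
            sbb += db * db;
        }
        return sab / Math.sqrt(saa * sbb);
    }

    public static void main(String[] args) {
        double[] x = {2.0, 1.0, 3.0, 3.0, 5.0};
        // Prints [2.0, 1.0, 3.5, 3.5, 5.0]: the two 3.0 values share rank 3.5.
        System.out.println(Arrays.toString(rank(x)));
    }
}
```

Sorting with Arrays.sort() alone gives the two 3.0 values distinct positions (3 and 4). The extra pass over runs of equal values is what turns them into a shared rank of 3.5.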
closing resolved issue for 2.0 release
spearman rank cross correlation