
Key: MATH-136
Type: Improvement
Status: Closed
Resolution: Fixed
Priority: Minor
Assignee: Phil Steitz
Reporter: john gant
Votes: 0
Watchers: 0
Commons Math

[math] spearman rank cross correlation

Created: 24/Aug/05 12:24 PM   Updated: 07/Aug/09 09:10 AM
Component/s: None
Affects Version/s: None
Fix Version/s: 2.0

Time Tracking:
Not Specified

File Attachments:
  Size
Java Source File RankingAlgorithm.java 2005-08-25 09:03 AM john gant 1.0 kB
Java Source File SpearmanRankCrossCorrelation.java 2005-08-25 09:09 AM john gant 4 kB
Java Source File SpearmanRankCrossCorrelation.java 2005-08-24 12:25 PM john gant 4 kB
Java Source File SpearmanRankCrossCorrelationTest.java 2005-08-25 09:11 AM john gant 5 kB
Java Source File SpearmanRankCrossCorrelationTest.java 2005-08-24 12:26 PM john gant 5 kB
Java Source File TiesEquivalentRank.java 2005-08-25 09:07 AM john gant 2 kB
Java Source File TiesEquivalentRankTest.java 2005-08-25 09:11 AM john gant 2 kB
Environment:
Operating System: other
Platform: Other

Bugzilla Id: 36331
Resolution Date: 21/Jun/09 02:25 AM


Description
Added Spearman rank correlation, along with a unit test.

Comments
john gant added a comment - 24/Aug/05 12:25 PM
Created an attachment (id=16171)
spearman rank cross correlation

john gant added a comment - 24/Aug/05 12:26 PM
Created an attachment (id=16172)
spearman rank cross correlation unit test

Brent Worden added a comment - 24/Aug/05 10:43 PM
I noticed your implementation does not take into account ties in the data. Ties
can affect the correlation coefficient to a great degree.

A nice approach to address ties is to rank the data adjusting the ranks of tied
elements and then compute the Pearson's correlation coefficient on the rankings.

See http://web.uccs.edu/lbecker/SPSS/ctabs2.htm#5E.%20Spearman%20Correlation,%20rs
for a brief explanation.

If we get into adding more non-parametric routines, I for one would like to see
some general ranking utilities, such as taking an array of data and returning
the ranking array. The ranking could be driven by a tie ranking policy for
dealing with ties in the data. The default policy would be to use the mean
rank for ties. Other policies could be to omit the data, use the highest rank,
or use the lowest rank.
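
To make the tie-policy idea concrete, here is a minimal sketch of such a ranking utility. It is not the code attached to this issue or the code later committed; the class name TieRanking, the enum TiePolicy, and its constants are hypothetical names chosen for illustration. "Lowest" and "highest" are read here as the smallest and largest rank position within a tied group, which is an assumption. The "omit the data" policy is left out, and NaN values are not handled.

```java
import java.util.Arrays;

public class TieRanking {
    /** Hypothetical tie policies; the names are illustrative only. */
    public enum TiePolicy { AVERAGE, MINIMUM, MAXIMUM }

    /** Returns 1-based ranks of data, resolving ties according to policy. */
    public static double[] rank(double[] data, TiePolicy policy) {
        int n = data.length;
        // Sort indices by value so the original order of data is preserved.
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Double.compare(data[a], data[b]));

        double[] ranks = new double[n];
        int start = 0;
        while (start < n) {
            // Extend [start, end] over the run of equal values.
            int end = start;
            while (end + 1 < n && data[order[end + 1]] == data[order[start]]) {
                end++;
            }
            double lowest = start + 1;   // smallest rank position in the run
            double highest = end + 1;    // largest rank position in the run
            double r;
            switch (policy) {
                case MINIMUM: r = lowest; break;
                case MAXIMUM: r = highest; break;
                default:      r = (lowest + highest) / 2.0; // mean rank
            }
            for (int k = start; k <= end; k++) {
                ranks[order[k]] = r;
            }
            start = end + 1;
        }
        return ranks;
    }

    public static void main(String[] args) {
        double[] data = {10.0, 20.0, 20.0, 30.0}; // illustrative data
        System.out.println(Arrays.toString(rank(data, TiePolicy.AVERAGE)));
        // prints [1.0, 2.5, 2.5, 4.0]
    }
}
```

With the same data, MINIMUM gives ranks 1, 2, 2, 4 and MAXIMUM gives 1, 3, 3, 4.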


Brent Worden added a comment - 24/Aug/05 11:50 PM
Copied from the mailing list:
----- Original Message -----
From: "John Gant" <john.gant@gmail.com>
>
> Specifically testTwo() in
> http://issues.apache.org/bugzilla/attachment.cgi?id=16172 takes care
> of data with equal value (ie equal rank), is this the type of
> situation to which you are referring? Yes I agree, we should implement
> routines to sort in more diverse ways, but for right now I depend upon
> Arrays.sort() to perform the sorting.

Yes, and in this case the implementation is incorrectly computing the Spearman
correlation as -0.1. But, according to R, the correlation is drastically
different:

> x <- c(2.0, 1.0, 3.0, 3.0, 5.0)
> y <- c(4.0, 4.0, 1.0, 2.0, 3.0)
> cor(x, y, method="spearman")
[1] -0.631579

Thus, I hold the implementation needs to change to correctly rank data with
ties.
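
As a check on the R result, the following sketch computes Spearman's coefficient as Pearson's correlation on mean-rank-adjusted rankings, the approach suggested earlier in this thread. It is an illustration only, not the attached or committed implementation; the class name SpearmanSketch and its method names are hypothetical, and inputs are assumed to be equal-length arrays without NaN values.

```java
import java.util.Arrays;

public class SpearmanSketch {
    /** 1-based ranks; tied values all receive the mean of their rank positions. */
    static double[] averageRanks(double[] data) {
        int n = data.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Double.compare(data[a], data[b]));
        double[] ranks = new double[n];
        int start = 0;
        while (start < n) {
            int end = start;
            while (end + 1 < n && data[order[end + 1]] == data[order[start]]) {
                end++;
            }
            double mean = (start + 1 + end + 1) / 2.0;
            for (int k = start; k <= end; k++) {
                ranks[order[k]] = mean;
            }
            start = end + 1;
        }
        return ranks;
    }

    /** Pearson product-moment correlation of two equal-length arrays. */
    static double pearson(double[] a, double[] b) {
        int n = a.length;
        double meanA = 0.0, meanB = 0.0;
        for (int i = 0; i < n; i++) {
            meanA += a[i];
            meanB += b[i];
        }
        meanA /= n;
        meanB /= n;
        double sab = 0.0, saa = 0.0, sbb = 0.0;
        for (int i = 0; i < n; i++) {
            double da = a[i] - meanA;
            double db = b[i] - meanB;
            sab += da * db;
            saa += da * da;
            sbb += db * db;
        }
        return sab / Math.sqrt(saa * sbb);
    }

    /** Spearman's rho as Pearson's r on the tie-adjusted ranks. */
    public static double spearman(double[] x, double[] y) {
        return pearson(averageRanks(x), averageRanks(y));
    }

    public static void main(String[] args) {
        double[] x = {2.0, 1.0, 3.0, 3.0, 5.0};
        double[] y = {4.0, 4.0, 1.0, 2.0, 3.0};
        System.out.printf("%.6f%n", spearman(x, y));
    }
}
```

For the data above, the ranks are (2, 1, 3.5, 3.5, 5) and (4.5, 4.5, 1, 2, 3), which give -6 / 9.5, about -0.631579, in agreement with R.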


john gant added a comment - 25/Aug/05 09:03 AM
Created an attachment (id=16184)
[math] ranking

interface for concrete implementations of ranking algorithms


john gant added a comment - 25/Aug/05 09:07 AM
Created an attachment (id=16185)
[math] ranking

Implementation of a generic ranking algorithm where ties get equivalent rank.

Need to implement the following ranking algorithms from R:

http://www.maths.lth.se/help/R/.R/library/base/html/rank.html

a. first
b. random
c. average


john gant added a comment - 25/Aug/05 09:09 AM
Created an attachment (id=16186)
[math] spearman rank cross correlation

Adapted previous code to use Ranking Algorithm implementations.


john gant added a comment - 25/Aug/05 09:11 AM
Created an attachment (id=16187)
[math] spearman rank cross correlation

Replaced the faulty unit test. The replacement does not take into account all
of R's ranking implementations.


john gant added a comment - 25/Aug/05 09:11 AM
Created an attachment (id=16188)
[math] ranking

corresponding unit test


Phil Steitz added a comment - 21/Jun/09 02:25 AM
Implementation committed in r778085. User guide updated in r786940.

Luc Maisonobe added a comment - 07/Aug/09 09:10 AM
closing resolved issue for 2.0 release