Hi Phil. No worries - it's the holiday season and it's an all-volunteer project. I find you and Luc to be very responsive, all things considered.
I actually find the process of contributing to the project very straightforward - it's intimidating until you try, but you both have been very easy to work with.
Short Answer - we'll be talking about a UnivariateStatistic really soon.
Long Answer - Addressing your questions (in no particular order)....
This link will help somewhat - http://thecuriousinvestor.com/2007/10/03/sortino-ratio/
1. Some of the tests will occur twice in some of the functions. I copied the four methods for variance to create the semi-variance methods - I'm having second thoughts about that since I really would only call the one that takes the array or possibly the array and the mean. I've never needed to compute semi-variance for a part of an array.
I'm very flexible on this - I went for consistency rather than the best performance possible - feel free to make changes as you see fit.
2. The Sortino ratio uses the downward standard deviation, which is the square root of the semivariance. You definitely want to keep the values above the mean and collapse them to the mean (so they contribute zero deviation) rather than exclude them altogether.
A simple explanation of the Sortino Ratio is probably in order to explain why.
The Sortino Ratio comes from the Sharpe Ratio. The Sharpe Ratio is used to rate how much reward you're getting for the risk you're taking. Standard Deviation is the divisor. Higher is better.
One criticism of the Sharpe Ratio is that returns in excess of the mean increase your standard deviation, but you don't mind getting big rewards from time to time - those periods shouldn't count.
An example helps here.
Here are your monthly returns.
That 20% return is nice (I'd like a 20% monthly return too!) but that 20% makes your standard deviation higher and your Sharpe Ratio lower.
But nobody minds an occasional blowout return - that doesn't increase the risk of the fund in the view of people who prefer the Sortino Ratio.
Mean = 4, Variance = 52.333, Std Dev = 7.234
But for the semivariance calculation, that 20 becomes a 4 and stays in the set, so semivariance becomes 9.666 and the downward standard deviation becomes 3.109. Dropping out the 20% isn't appropriate because it's part of your return.
That's a big difference in how your performance looks - 7.234 / 3.109 = 2.327. Your fund now looks 2.3 times better than it did before!
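To make the "collapse, don't drop" rule concrete, here is a minimal Java sketch of semivariance about the mean. The class and method names are made up for illustration and are not the code attached to this issue, and the n - 1 divisor is my assumption, chosen to match the usual sample variance.

```java
/** Sketch only: class and method names are hypothetical, not part of the library. */
final class DownsideSketch {

    /** Arithmetic mean of the values. */
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /**
     * Semivariance about the mean: a value above the mean is treated as if it
     * equalled the mean (zero deviation) but still counts toward n.
     * Assumption: bias-corrected divisor (n - 1), as for sample variance.
     */
    static double semiVariance(double[] values) {
        if (values.length < 2) {
            throw new IllegalArgumentException("need at least two values");
        }
        double m = mean(values);
        double sumSq = 0.0;
        for (double v : values) {
            double dev = Math.min(v, m) - m; // collapse anything above the mean
            sumSq += dev * dev;
        }
        return sumSq / (values.length - 1);
    }

    /** Downward standard deviation: square root of the semivariance. */
    static double downsideDeviation(double[] values) {
        return Math.sqrt(semiVariance(values));
    }
}
```

Because the collapsed values stay in the set, the divisor is the same as for the ordinary variance, so the two numbers are directly comparable.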
That "other" parameter is often called the "minimum acceptable return" (MAR). Some people prefer to count only the periods when you lost money or failed to meet some regular performance target. Pension managers get estimates of what they need to make that year and will look for people who have the best chance of making that return.
Instead of looking at the mean, they have a minimum return that they want to measure off of.
So, from the example above, let's say that the pension manager has a minimum acceptable return of 2%.
In that case, the 2% and 20% returns don't add into the semivariance, so you get a semivariance of 8.333.
This is where that zero initially came from.
Okay, now here's the little curveball that initially sounds bad but really turns into something easy:
Some years, you lose money. The mean is below zero. In those cases, that minimum acceptable return is "sometimes" the minimum of the mean and zero. Rather than putting in code that automatically checks the mean and replaces the minimum acceptable return, if we have code that takes in the double array and the MAR, I think we're fine.
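As a rough sketch of what "takes in the double array and the MAR" could look like, here is a two-method version in Java. The names are hypothetical, the n - 1 divisor is an assumption on my part, and this is not the patch itself - just enough to show that the mean-based case and the "minimum of the mean and zero" case both fall out of the same method.

```java
/** Sketch only: names are hypothetical and the n - 1 divisor is an assumption. */
final class MarSemiVarianceSketch {

    /** Arithmetic mean of the values. */
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /**
     * Semivariance measured from a minimum acceptable return (MAR).
     * Only values below the MAR add to the sum; the rest contribute zero
     * but still count toward n.
     */
    static double semiVariance(double[] values, double mar) {
        if (values.length < 2) {
            throw new IllegalArgumentException("need at least two values");
        }
        double sumSq = 0.0;
        for (double v : values) {
            if (v < mar) {
                double dev = v - mar;
                sumSq += dev * dev;
            }
        }
        return sumSq / (values.length - 1);
    }

    /** Convenience form: use the mean of the values as the cutoff. */
    static double semiVariance(double[] values) {
        return semiVariance(values, mean(values));
    }
}
```

A caller who wants the "minimum of the mean and zero" behaviour can pass `Math.min(MarSemiVarianceSketch.mean(values), 0.0)` as the MAR, so no special case is needed inside the method.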
Okay, now that I've made a really long JIRA comment, what's the next step?
Should I rewrite as a UnivariateStatistic? Should I just make the semivariance code two methods - one that takes the array and one that takes the array and a MAR? Or has this entry been so long I've made you want to hit the eggnog?