Sharpe Ratio

Mar 30, 2018

A Sharper Sharpe: Just Shrink it!

Note: This blog post previously analyzed Damien Challet's 'Sharper estimator' of the Signal-Noise Ratio. Following up on some suspicions, we compared the drawdown estimator to a shrunk version of the moment-based estimator, and found them to have similar performance. However, the analysis used version 1.1 of the sharpeRratio package, written by Challet. That version of the package contained a bug which severely biased the estimator, causing illusory improvements in the achieved standard error. Challet has fixed the package, which is now at version 1.2 (or later), and thus the analysis that was here can no longer be reproduced. We will perform an investigation of the fixed drawdown estimator, and link to it here.


Mar 04, 2018

Improved estimation of Signal Noise Ratio via moments

In a series of blog posts I have looked at Damien Challet's drawdown estimator of the Signal to Noise ratio. My simulations indicate that this estimator achieves its apparent efficiency at the cost of some bias. Here I make a brief attempt at 'improving' the usual moment-based estimator, the Sharpe ratio, by adding some extra terms. If you want to play along at home, the rest of this blog post is available as a Jupyter notebook in a gist.


Let \(\mu\) and \(\sigma\) be the mean and standard deviation of the returns of an asset. Then \(\zeta = \mu / \sigma\) is the "Signal to Noise Ratio" (SNR). Typically the SNR is estimated with the Sharpe Ratio, defined as \(\hat{\zeta} = \hat{\mu} / \hat{\sigma}\), where \(\hat{\mu}\) and \(\hat{\sigma}\) are the vanilla sample estimates. Can we gain efficiency in the case where the returns have significant skew and kurtosis?
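As a concrete illustration, here is a minimal numpy sketch of the Sharpe ratio as defined above. The function name `sharpe_ratio`, the simulated Gaussian returns, and the use of the \(n-1\) denominator for \(\hat{\sigma}\) are assumptions made for this example; none of them come from the notebook.

```python
import numpy as np

def sharpe_ratio(returns):
    """Sample mean divided by sample standard deviation (no annualization)."""
    x = np.asarray(returns, dtype=float)
    # ddof=1 gives the usual n-1 denominator; this choice is an assumption
    return x.mean() / x.std(ddof=1)

# simulate returns with a known signal-noise ratio (illustrative values only)
rng = np.random.default_rng(1234)
zeta_true = 0.1
sigma_true = 0.02
x = rng.normal(loc=zeta_true * sigma_true, scale=sigma_true, size=1000)

zeta_hat = sharpe_ratio(x)
print(zeta_hat)
```

With 1000 observations the standard error of the Sharpe ratio is around \(1/\sqrt{1000} \approx 0.03\), so the estimate should land near, but not on, the true value.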

Here we consider an estimator of the form

$$ v = a_0 + \frac{a_1 + \left(1+a_2\right)\hat{\mu} + a_3 \hat{\mu}^2}{\hat{\sigma}} + a_4 \left(\frac{\hat{\mu}}{\hat{\sigma}}\right)^2. $$

The Sharpe Ratio corresponds to \(a_0 = a_1 = a_2 = a_3 = a_4 = 0\). Note that we were inspired by Norman Johnson's work on t-tests under skewed distributions. Johnson considered a similar setup, but with only \(a_1, a_2,\) and \(a_3\) free, and was concerned with the problem of hypothesis testing on \(\mu\).
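To make the form of \(v\) concrete, here is a small numerical sketch that evaluates it on a sample. The function name `v_stat` and the tuple layout of the coefficients are hypothetical, chosen only for this illustration; with all \(a_i = 0\) it reduces to the plain Sharpe ratio.

```python
import numpy as np

def v_stat(returns, a=(0.0, 0.0, 0.0, 0.0, 0.0)):
    """Evaluate v = a0 + (a1 + (1+a2)*muhat + a3*muhat^2)/sighat + a4*(muhat/sighat)^2."""
    a0, a1, a2, a3, a4 = a
    x = np.asarray(returns, dtype=float)
    mu_hat = x.mean()
    sigma_hat = x.std(ddof=1)  # n-1 denominator, an assumption of this sketch
    return (a0
            + (a1 + (1.0 + a2) * mu_hat + a3 * mu_hat**2) / sigma_hat
            + a4 * (mu_hat / sigma_hat)**2)

sample = [1.0, 2.0, 3.0]                        # muhat = 2, sighat = 1
plain = v_stat(sample)                          # all a_i = 0: the Sharpe ratio
tweaked = v_stat(sample, a=(0.0, 1.0, 1.0, 1.0, 1.0))
print(plain, tweaked)
```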

Below, following Johnson, I will use the Cornish Fisher expansions of \(\hat{\mu}\) and \(\hat{\sigma}\) to approximate \(v\) as a function of the first few cumulants of the distribution, and some normal variates. I will then compute the mean square error, \(E\left[\left(v - \zeta\right)^2\right],\) and take its derivative with respect to each \(a_i\). Unfortunately, we will find that the first order conditions are solved by \(a_i=0\), which is to say that the vanilla Sharpe has the lowest MSE of estimators of this kind. Our adventure will take us far, but we will return home empty-handed.
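Before doing this symbolically, it may help to see the target quantity numerically. The following is a rough Monte Carlo sketch of the mean square error of the plain Sharpe ratio, assuming Gaussian returns with unit volatility; the function name `simulated_mse`, the seed, and the parameter values are all illustrative assumptions.

```python
import numpy as np

def simulated_mse(zeta, n, reps=20000, seed=0):
    """Monte Carlo estimate of E[(Sharpe - zeta)^2] for Gaussian returns."""
    rng = np.random.default_rng(seed)
    # each row is one sample of n returns with mean zeta and unit volatility
    x = rng.normal(loc=zeta, scale=1.0, size=(reps, n))
    sr = x.mean(axis=1) / x.std(axis=1, ddof=1)
    return np.mean((sr - zeta) ** 2)

mse = simulated_mse(zeta=0.1, n=100)
print(mse)
```

For Gaussian returns the usual large-sample approximation to this quantity is roughly \(\left(1 + \zeta^2/2\right)/n\), so the simulated value should be close to 0.01 here.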

We proceed.

# load what we need from sympy
from __future__ import division
from sympy import *
from sympy import Order
from sympy.assumptions.assume import global_assumptions
from sympy.stats import P, E, variance, Normal
init_printing()
nfactor = 4

# define some symbols.
a0, a1, a2, a3, a4 = symbols('a_0 a_1 a_2 a_3 a_4',real=True)
# raw strings keep the LaTeX backslashes literal
n, sigma = symbols(r'n \sigma',real=True,positive=True)
zeta, mu3, mu4 = symbols(r'\zeta \mu_3 \mu_4',real=True)
mu = zeta * sigma

We now express \(\hat{\mu}\) and \(\hat{\sigma}^2\) by the Cornish Fisher expansion. This is an expression of the distribution of a random variable in terms of its cumulants and a normal variate. The expansion is ordered in such a way that when applied to the mean of independent draws of a distribution, the terms are clustered by the order of \(n\). The Cornish Fisher expansion also involves the Hermite polynomials. The expansions of \(\hat{\mu}\) and \(\hat{\sigma}^2\) are not independent. We follow Johnson in expressing the correlation of the normals and truncating:
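As a quick sanity check on the Hermite polynomials that enter the expansion, this standalone sketch builds the probabilists' polynomials \(He_n\) from sympy's physicists' `hermite` via \(He_n(x) = 2^{-n/2} H_n\left(x/\sqrt{2}\right)\) and expands the first three. The helper name `He` and the symbol `x` are introduced only for this illustration.

```python
from sympy import symbols, hermite, sqrt, expand, Rational

x = symbols('x', real=True)

def He(n, x):
    """Probabilists' Hermite polynomial from sympy's physicists' version."""
    return expand(2**Rational(-n, 2) * hermite(n, x / sqrt(2)))

He1, He2, He3 = He(1, x), He(2, x), He(3, x)
print(He1, He2, He3)
```

These should come out as \(x\), \(x^2 - 1\) and \(x^3 - 3x\), which are the polynomials used in the Cornish Fisher terms.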

# probabilist's hermite polynomials
def Hen(x,n):
    return (2**(-n/2) * hermite(n,x/sqrt(2)))

# this comes out of the wikipedia page:
h1 = lambda x : Hen(x,2) / 6
h2 = lambda x : Hen(x,3) / 24
h11 = lambda x : - (2 * Hen(x,3) + Hen(x,1)) / 36

# mu3 is the 3rd centered moment of x
gamma1 = (mu3 / (sigma**(3/2))) / sqrt(n)
gamma2 = (mu4 / (sigma**4)) / n

# grab two normal variates with correlation rho
# which happens to take value:
# rho = mu3 / sqrt(sigma**2 * (mu4 - sigma**4))
z1 = Normal('z_1',0,1)
z3 = Normal('z_3',0,1)
rho = symbols('\\rho',real=True)
z2 = rho * z1 + sqrt(1-rho**2)*z3

# this is out of Johnson, but we call it mu hat instead of x bar:
muhat = mu + (sigma/sqrt(n)) * (z1 + gamma1 * h1(z1) + gamma2 * h2(z1) + gamma1**2 * h11(z1))
muhat
$$\sigma \zeta + \frac{\sigma}{\sqrt{n}} \left(\frac{\mu_3^{2}}{\sigma^{3.0} n} \left(- 0.0392837100659193 \sqrt{2} z_{1}^{3} + 0.0982092751647983 \sqrt{2} z_{1}\right)\\ + \frac{\mu_3}{\sigma^{1.5} \sqrt{n}} \left(0.166666666666667 z_{1}^{2} - 0.166666666666667\right) \\ + \frac{\mu_4}{\sigma^{4} n} \left(0.0294627825494395 \sqrt{2} z_{1}^{3} - 0.0883883476483184 \sqrt{2} z_{1}\right) + z_{1}\right)$$
addo = sqrt((mu4 - sigma**4) / (n * sigma**4)) * z2
# this is s^2 in Johnson:
sighat2 = (sigma**2) * (1 + addo)
# use Taylor's theorem to express sighat^-1:
invs = (sigma**(-1)) * (1 - (1/(2*sigma)) * addo)
invs
$$\frac{1}{\sigma} \left(1 - \frac{\sqrt{\mu_4 - \sigma^{4}}}{2 \sigma^{3} \sqrt{n}} \left(\rho z_{1} + \sqrt{- \rho^{2} + 1} z_{3}\right)\right)$$
# the new statistic; it is v = part1 + part2 + part3
part1 = a0
part2 = (a1 + (1+a2)*muhat + a3 * muhat**2) * invs
part3 = a4 * (muhat*invs)**2

v = part1 + part2 + part3
v
$$a_{0} + \frac{1}{\sigma} \left(1 - \frac{\sqrt{\mu_4 - \sigma^{4}}}{2 \sigma^{3} \sqrt{n}} \left(\rho z_{1} + \sqrt{- \rho^{2} + 1} z_{3}\right)\right) \left(a_{1} \\ + a_{3} \left(\sigma \zeta + \frac{\sigma}{\sqrt{n}} \left(\frac{\mu_3^{2}}{\sigma^{3.0} n} \left(- 0.0392837100659193 \sqrt{2} z_{1}^{3} + 0.0982092751647983 \sqrt{2} z_{1}\right) \\ + \frac{\mu_3}{\sigma^{1.5} \sqrt{n}} \left(0.166666666666667 z_{1}^{2} - 0.166666666666667\right) \\ + \frac{\mu_4}{\sigma^{4} n} \left(0.0294627825494395 \sqrt{2} z_{1}^{3} - 0.0883883476483184 \sqrt{2} z_{1}\right) + z_{1}\right)\right)^{2} \\ + \left(a_{2} + 1\right) \left(\sigma \zeta + \frac{\sigma}{\sqrt{n}} \left(\frac{\mu_3^{2}}{\sigma^{3.0} n} \left(- 0.0392837100659193 \sqrt{2} z_{1}^{3} + 0.0982092751647983 \sqrt{2} z_{1}\right) \\ + \frac{\mu_3}{\sigma^{1.5} \sqrt{n}} \left(0.166666666666667 z_{1}^{2} - 0.166666666666667\right) \\ + \frac{\mu_4}{\sigma^{4} n} \left(0.0294627825494395 \sqrt{2} z_{1}^{3} - 0.0883883476483184 \sqrt{2} z_{1}\right) + z_{1}\right)\right)\right) \\ + \frac{a_{4}}{\sigma^{2}} \left(1 - \frac{\sqrt{\mu_4 - \sigma^{4}}}{2 \sigma^{3} \sqrt{n}} \left(\rho z_{1} + \sqrt{- \rho^{2} + 1} z_{3}\right)\right)^{2} \left(\sigma \zeta \\ + \frac{\sigma}{\sqrt{n}} \left(\frac{\mu_3^{2}}{\sigma^{3.0} n} \left(- 0.0392837100659193 \sqrt{2} z_{1}^{3} + 0.0982092751647983 \sqrt{2} z_{1}\right) \\ + \frac{\mu_3}{\sigma^{1.5} \sqrt{n}} \left(0.166666666666667 z_{1}^{2} - 0.166666666666667\right) \\ + \frac{\mu_4}{\sigma^{4} n} \left(0.0294627825494395 \sqrt{2} z_{1}^{3} - 0.0883883476483184 \sqrt{2} z_{1}\right) + z_{1}\right)\right)^{2}$$

That's a bit hairy. Here I truncate that statistic in \(n\). This was hard for me to figure out in sympy, so I took a limit. (I like how 'oo' is infinity in sympy.)
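The limit trick is easier to see on a toy expression. The sketch below keeps the terms of order \(n^0\) and \(n^{-1/2}\) of \(1/\left(1 + c/\sqrt{n}\right)\) using two limits, and cross-checks the result with a series expansion in \(\epsilon = n^{-1/2}\). The function `f` and the symbols `c` and `eps` are made up for this example and do not appear in the notebook.

```python
from sympy import symbols, sqrt, limit, oo, series, simplify

n, c, eps = symbols('n c epsilon', positive=True)

# toy expression standing in for the statistic v
f = 1 / (1 + c / sqrt(n))

# order n^0 term, then add the order n^(-1/2) term
f_0 = limit(f, n, oo)
f_05 = f_0 + limit(sqrt(n) * (f - f_0), n, oo) / sqrt(n)

# cross-check: expand in eps = n^(-1/2) and drop terms beyond first order
g = f.subs(n, 1 / eps**2)
g_05 = series(g, eps, 0, 2).removeO().subs(eps, 1 / sqrt(n))

print(f_05, simplify(f_05 - g_05))
```

Both routes should give \(1 - c/\sqrt{n}\). The series route may be faster on large expressions, though I have not tried it on the statistic above.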

# keep the terms of order n^0 and n^(-1/2)
v_0 = limit(v,n,oo)
v_05 = v_0 + (limit(sqrt(n) * (v - v_0),n,oo) / sqrt(n))
v_05
$$\frac{1}{\sigma^{17.0} \sqrt{n}} \left(- 0.5 \rho \sigma^{13.0} a_{1} \sqrt{\mu_4 - \sigma^{4}} z_{1} - 1.0 \rho \sigma^{14.0} \zeta^{2} a_{4} \sqrt{\mu_4 - \sigma^{4}} z_{1} - 0.5 \rho \sigma^{14.0} \zeta a_{2} \sqrt{\mu_4 - \sigma^{4}} z_{1} \\ - 0.5 \rho \sigma^{14.0} \zeta \sqrt{\mu_4 - \sigma^{4}} z_{1} - 0.5 \rho \sigma^{15.0} \zeta^{2} a_{3} \sqrt{\mu_4 - \sigma^{4}} z_{1} - 0.5 \sigma^{13.0} a_{1} \sqrt{\mu_4 - \sigma^{4}} \sqrt{- \rho^{2} + 1} z_{3} \\ - 1.0 \sigma^{14.0} \zeta^{2} a_{4} \sqrt{\mu_4 - \sigma^{4}} \sqrt{- \rho^{2} + 1} z_{3} - 0.5 \sigma^{14.0} \zeta a_{2} \sqrt{\mu_4 - \sigma^{4}} \sqrt{- \rho^{2} + 1} z_{3} \\ - 0.5 \sigma^{14.0} \zeta \sqrt{\mu_4 - \sigma^{4}} \sqrt{- \rho^{2} + 1} z_{3} - 0.5 \sigma^{15.0} \zeta^{2} a_{3} \sqrt{\mu_4 - \sigma^{4}} \sqrt{- \rho^{2} + 1} z_{3} + 2.0 \sigma^{17.0} \zeta a_{4} z_{1} \\ + 1.0 \sigma^{17.0} a_{2} z_{1} + 1.0 \sigma^{17.0} z_{1} + 2.0 \sigma^{18.0} \zeta a_{3} z_{1}\right) + \frac{1}{\sigma} \left(\sigma^{2} \zeta^{2} a_{3} \\ + \sigma \zeta^{2} a_{4} + \sigma \zeta a_{2} + \sigma \zeta + \sigma a_{0} + a_{1}\right)$$

Now we define the error as \(v - \zeta\) and compute the approximate bias and variance of the error. We sum the variance and squared bias to get mean square error.
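Here is the same bias-plus-variance bookkeeping on a toy error term, as a sketch of how `sympy.stats` handles it. The error `b + (c*z1 + d*z2)/sqrt(n)` and the symbols `b`, `c`, `d` are hypothetical stand-ins; only the construction of the correlated normal `z2` mirrors the code above.

```python
from sympy import symbols, sqrt, simplify
from sympy.stats import E, variance, Normal

# toy symbols, not the ones used in the post
b, c, d, rho = symbols('b c d rho', real=True)
n = symbols('n', positive=True)

# two standard normals with correlation rho, built as in the post
z1 = Normal('z_1', 0, 1)
z3 = Normal('z_3', 0, 1)
z2 = rho * z1 + sqrt(1 - rho**2) * z3

# a toy error: constant bias plus noise of order n^(-1/2)
err = b + (c * z1 + d * z2) / sqrt(n)

toy_bias = simplify(E(err))
toy_var = simplify(variance(err))
toy_mse = simplify(toy_bias**2 + toy_var)
print(toy_bias, toy_var)
```

The bias should be \(b\) and the variance \(\left(c^2 + 2 c d \rho + d^2\right)/n\), so the MSE splits into a part constant in \(n\) and a part of order \(n^{-1}\), just as for the real statistic.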

staterr = v_05 - zeta
# the mean squared error of the statistic v is
# MSE = E((v - zeta)**2)
# computing that directly is too slow, so evaluate the bias and variance separately:
bias = E(staterr)
simplify(bias)
$$\sigma \zeta^{2} a_{3} + \zeta^{2} a_{4} + \zeta a_{2} + a_{0} + \frac{a_{1}}{\sigma}$$
# variance of the error:
varerr = variance(staterr)
MSE = (bias**2) + varerr 
collect(MSE,n)
$$\left(- \zeta + \frac{1}{\sigma} \left(\sigma^{2} \zeta^{2} a_{3} + \sigma \zeta^{2} a_{4} + \sigma \zeta a_{2} + \sigma \zeta + \sigma a_{0} + a_{1}\right)\right)^{2} \\ + \frac{1}{n} \left(\frac{0.25 \mu_4}{\sigma^{8.0}} a_{1}^{2} + \frac{1.0 \mu_4}{\sigma^{7.0}} \zeta^{2} a_{1} a_{4} + \frac{0.5 \mu_4}{\sigma^{7.0}} \zeta a_{1} a_{2} + \frac{0.5 \mu_4}{\sigma^{7.0}} \zeta a_{1} + \frac{1.0 \mu_4}{\sigma^{6.0}} \zeta^{4} a_{4}^{2} + \frac{1.0 \mu_4}{\sigma^{6.0}} \zeta^{3} a_{2} a_{4} \\ + \frac{1.0 \mu_4}{\sigma^{6.0}} \zeta^{3} a_{4} + \frac{0.5 \mu_4}{\sigma^{6.0}} \zeta^{2} a_{1} a_{3} + \frac{0.25 \mu_4}{\sigma^{6.0}} \zeta^{2} a_{2}^{2} + \frac{0.5 \mu_4}{\sigma^{6.0}} \zeta^{2} a_{2} + \frac{0.25 \mu_4}{\sigma^{6.0}} \zeta^{2} + \frac{1.0 \mu_4}{\sigma^{5.0}} \zeta^{4} a_{3} a_{4} \\ + \frac{0.5 \mu_4}{\sigma^{5.0}} \zeta^{3} a_{2} a_{3} + \frac{0.5 \mu_4}{\sigma^{5.0}} \zeta^{3} a_{3} + \frac{0.25 \mu_4}{\sigma^{4.0}} \zeta^{4} a_{3}^{2} - \frac{2.0 \rho}{\sigma^{4.0}} \zeta a_{1} a_{4} \sqrt{\mu_4 - \sigma^{4}} - \frac{1.0 \rho}{\sigma^{4.0}} a_{1} a_{2} \sqrt{\mu_4 - \sigma^{4}} \\ - \frac{1.0 \rho}{\sigma^{4.0}} a_{1} \sqrt{\mu_4 - \sigma^{4}} - \frac{4.0 \rho}{\sigma^{3.0}} \zeta^{3} a_{4}^{2} \sqrt{\mu_4 - \sigma^{4}} - \frac{4.0 \rho}{\sigma^{3.0}} \zeta^{2} a_{2} a_{4} \sqrt{\mu_4 - \sigma^{4}} - \frac{4.0 \rho}{\sigma^{3.0}} \zeta^{2} a_{4} \sqrt{\mu_4 - \sigma^{4}} \\ - \frac{2.0 \rho}{\sigma^{3.0}} \zeta a_{1} a_{3} \sqrt{\mu_4 - \sigma^{4}} - \frac{1.0 \rho}{\sigma^{3.0}} \zeta a_{2}^{2} \sqrt{\mu_4 - \sigma^{4}} - \frac{2.0 \rho}{\sigma^{3.0}} \zeta a_{2} \sqrt{\mu_4 - \sigma^{4}} - \frac{1.0 \rho}{\sigma^{3.0}} \zeta \sqrt{\mu_4 - \sigma^{4}} \\ - \frac{6.0 \rho}{\sigma^{2.0}} \zeta^{3} a_{3} a_{4} \sqrt{\mu_4 - \sigma^{4}} - \frac{3.0 \rho}{\sigma^{2.0}} \zeta^{2} a_{2} a_{3} \sqrt{\mu_4 - \sigma^{4}} - \frac{3.0 \rho}{\sigma^{2.0}} \zeta^{2} a_{3} \sqrt{\mu_4 - \sigma^{4}} - \frac{2.0 \rho}{\sigma^{1.0}} \zeta^{3} a_{3}^{2} \sqrt{\mu_4 - 
\sigma^{4}} \\ - \frac{0.25 a_{1}^{2}}{\sigma^{4.0}} - \frac{1.0 a_{1}}{\sigma^{3.0}} \zeta^{2} a_{4} - \frac{0.5 \zeta}{\sigma^{3.0}} a_{1} a_{2} - \frac{0.5 \zeta}{\sigma^{3.0}} a_{1} - \frac{1.0 \zeta^{4}}{\sigma^{2.0}} a_{4}^{2} - \frac{1.0 a_{2}}{\sigma^{2.0}} \zeta^{3} a_{4} \\ - \frac{1.0 a_{4}}{\sigma^{2.0}} \zeta^{3} - \frac{0.5 a_{1}}{\sigma^{2.0}} \zeta^{2} a_{3} - \frac{0.25 \zeta^{2}}{\sigma^{2.0}} a_{2}^{2} - \frac{0.5 a_{2}}{\sigma^{2.0}} \zeta^{2} - \frac{0.25 \zeta^{2}}{\sigma^{2.0}} - \frac{1.0 a_{3}}{\sigma^{1.0}} \zeta^{4} a_{4} \\ - \frac{0.5 a_{2}}{\sigma^{1.0}} \zeta^{3} a_{3} - \frac{0.5 a_{3}}{\sigma^{1.0}} \zeta^{3} + 8.0 \sigma^{1.0} \zeta^{2} a_{3} a_{4} + 4.0 \sigma^{1.0} \zeta a_{2} a_{3} + 4.0 \sigma^{1.0} \zeta a_{3} + 4.0 \sigma^{2.0} \zeta^{2} a_{3}^{2} \\ - 0.25 \zeta^{4} a_{3}^{2} + 4.0 \zeta^{2} a_{4}^{2} + 4.0 \zeta a_{2} a_{4} + 4.0 \zeta a_{4} + 1.0 a_{2}^{2} + 2.0 a_{2} + 1.0\right)$$

That's really involved, and finding the derivative will be ugly. Instead we drop the terms of order \(n^{-1}\), which leaves only the terms constant in \(n\). Looking above, you will see that removing the terms in \(n^{-1}\) leaves a single quantity squared, namely the squared bias. That is what we will minimize. The way forward is fairly clear from here.

# truncate!
MSE_0 = limit(collect(MSE,n),n,oo)
MSE_1 = MSE_0 + (limit(n * (MSE - MSE_0),n,oo)/n)
MSE_0
$$\frac{1}{\sigma^{2}} \left(\sigma^{4} \zeta^{4} a_{3}^{2} + 2 \sigma^{3} \zeta^{4} a_{3} a_{4} + 2 \sigma^{3} \zeta^{3} a_{2} a_{3} + 2 \sigma^{3} \zeta^{2} a_{0} a_{3} + \sigma^{2} \zeta^{4} a_{4}^{2} + 2 \sigma^{2} \zeta^{3} a_{2} a_{4} + 2 \sigma^{2} \zeta^{2} a_{0} a_{4} \\ + 2 \sigma^{2} \zeta^{2} a_{1} a_{3} + \sigma^{2} \zeta^{2} a_{2}^{2} + 2 \sigma^{2} \zeta a_{0} a_{2} + \sigma^{2} a_{0}^{2} + 2 \sigma \zeta^{2} a_{1} a_{4} + 2 \sigma \zeta a_{1} a_{2} + 2 \sigma a_{0} a_{1} + a_{1}^{2}\right)$$

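Now we take the derivative of the Mean Square Error with respect to each of the \(a_i\). In each case we get an expression linear in all the \(a_i\). In fact every derivative is a multiple of the same linear form, the leading-order bias, so the first order conditions hold exactly when that bias vanishes. The only choice of constant \(a_i\) that achieves this for every \(\zeta\) and \(\sigma\) is \(a_i=0\).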

# a_0
simplify(diff(MSE_0,a0))
$$2 \sigma \zeta^{2} a_{3} + 2 \zeta^{2} a_{4} + 2 \zeta a_{2} + 2 a_{0} + \frac{2 a_{1}}{\sigma}$$
# a_1
simplify(diff(MSE_0,a1))
$$2 \zeta^{2} a_{3} + \frac{2 a_{4}}{\sigma} \zeta^{2} + \frac{2 \zeta}{\sigma} a_{2} + \frac{2 a_{0}}{\sigma} + \frac{2 a_{1}}{\sigma^{2}}$$
# a_2
simplify(diff(MSE_0,a2))
$$\frac{2 \zeta}{\sigma} \left(\sigma^{2} \zeta^{2} a_{3} + \sigma \zeta^{2} a_{4} + \sigma \zeta a_{2} + \sigma a_{0} + a_{1}\right)$$
# a_3
simplify(diff(MSE_0,a3))
$$2 \zeta^{2} \left(\sigma^{2} \zeta^{2} a_{3} + \sigma \zeta^{2} a_{4} + \sigma \zeta a_{2} + \sigma a_{0} + a_{1}\right)$$
# a_4
simplify(diff(MSE_0,a4))
$$\frac{2 \zeta^{2}}{\sigma} \left(\sigma^{2} \zeta^{2} a_{3} + \sigma \zeta^{2} a_{4} + \sigma \zeta a_{2} + \sigma a_{0} + a_{1}\right)$$

To recap, the minimal MSE occurs for \(a_0 = a_1 = a_2 = a_3 = a_4 = 0\). We must try another approach.
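The structure of the derivatives above can be checked in a few lines. This standalone sketch rebuilds the leading-order bias from the expression printed earlier, squares it, and confirms that each partial derivative is a fixed multiple of the bias and vanishes at \(a_i = 0\). The names `lead_bias`, `grads`, `ratios` and `at_zero` are chosen only for this illustration.

```python
from sympy import symbols, diff, simplify

a0, a1, a2, a3, a4 = symbols('a_0 a_1 a_2 a_3 a_4', real=True)
sigma = symbols('sigma', positive=True)
zeta = symbols('zeta', real=True)
coeffs = (a0, a1, a2, a3, a4)

# leading-order bias of v, copied from the simplified bias shown above
lead_bias = a0 + a1 / sigma + a2 * zeta + a3 * sigma * zeta**2 + a4 * zeta**2
mse0 = lead_bias**2

grads = [diff(mse0, a) for a in coeffs]

# each gradient is 2 * lead_bias times a factor free of the a_i
ratios = [simplify(g / (2 * lead_bias)) for g in grads]

# all first order conditions hold at a_i = 0
at_zero = [g.subs({a: 0 for a in coeffs}) for g in grads]
print(ratios, at_zero)
```

Since the factors in `ratios` do not involve the \(a_i\), the gradient vanishes wherever the leading-order bias does; it takes the next order in \(n\), or a different family of estimators, to say anything more.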


Mar 02, 2018

A Sharper Sharpe III : MLEs

Note: This blog post previously analyzed Damien Challet's 'Sharper estimator' of the Signal-Noise Ratio. We found that this drawdown estimator seemed empirically to have lower MSE than the Cramér Rao Lower Bound when used to estimate the mean of some distributions, which was a warning that something was wrong. The analysis used version 1.1 of the sharpeRratio package, written by Challet. That version of the package contained a bug which severely biased the estimator, causing illusory improvements in the achieved standard error. Challet has fixed the package, which is now at version 1.2 (or later), and thus the analysis that was here can no longer be reproduced. We will perform an investigation of the fixed drawdown estimator, and link to it here.


Feb 24, 2018

A Sharper Sharpe II : Skewed Distributions

Note: This blog post previously analyzed Damien Challet's 'Sharper estimator' of the Signal-Noise Ratio. The analysis used version 1.1 of the sharpeRratio package, written by Challet. That version of the package contained a bug which severely biased the estimator, causing illusory improvements in the achieved standard error. Challet has fixed the package, which is now at version 1.2 (or later), and thus the analysis that was here can no longer be reproduced. We will perform an investigation of the fixed drawdown estimator, and link to it here.


Copyright © 2018-2023, Steven E. Pav.  
The above references an opinion and is for information purposes only. It is not offered as investment advice.