For the previous part of this topic, click here.
In part 2 of this topic, we are going to cover the following items:
- Calculations with quantiles
- Measures of dispersion (range, mean absolute deviation, variance, standard deviation) and related ratios
- Skewness and kurtosis of distributions
1. Calculations with quantiles
Quantile is the general term for a value at or below which a stated proportion of the data in a distribution lies.
- Quartile - the distribution is divided into quarters
- Quintile - the distribution is divided into fifths
- Decile - the distribution is divided into tenths
- Percentile - the distribution is divided into hundredths (percents)
The equation for the position of the observation at a given percentile y, with n data points sorted in ascending order, is:
Ly = (n + 1)(y/100)
The following example is taken from the CFA Level I curriculum (2011) as an illustration of the concepts above.
| No. | Company | Div Yield (%) | No. | Company | Div Yield (%) |
|---|---|---|---|---|---|
| 1 | AstraZeneca | 0.00 | 26 | UBS | 2.65 |
| 2 | BP | 0.00 | 27 | Tesco | 2.95 |
| 3 | Deutsche Telekom | 0.00 | 28 | Total | 3.11 |
| 4 | HSBC Holdings | 0.00 | 29 | GlaxoSmithKline | 3.31 |
| 5 | Credit Suisse Group | 0.26 | 30 | BT Group | 3.34 |
| 6 | L’Oreal | 1.09 | 31 | Unilever | 3.53 |
| 7 | SwissRe | 1.27 | 32 | BASF | 3.59 |
| 8 | Roche Holding | 1.33 | 33 | Santander Central Hispano | 3.66 |
| 9 | Munich Re Group | 1.36 | 34 | Banco Bilbao Vizcaya Argentina | 3.67 |
| 10 | General Assicurazioni | 1.39 | 35 | Diageo | 3.68 |
| 11 | Vodafone Group | 1.41 | 36 | HBOS | 3.78 |
| 12 | Carrefour | 1.51 | 37 | E.ON | 3.87 |
| 13 | Nokia | 1.75 | 38 | Shell Transport and Co. | 3.88 |
| 14 | Novartis | 1.81 | 39 | Barclays | 4.06 |
| 15 | Allianz | 1.92 | 40 | Royal Dutch Petroleum Co. | 4.27 |
| 16 | Koninklije Philips Electronics | 2.01 | 41 | Fortis | 4.28 |
| 17 | Siemens | 2.16 | 42 | Bayer | 4.45 |
| 18 | Deutsche Bank | 2.27 | 43 | DaimlerChrysler | 4.68 |
| 19 | Telecom Italia | 2.27 | 44 | Suez | 5.13 |
| 20 | AXA | 2.39 | 45 | Aviva | 5.15 |
| 21 | Telefonica | 2.49 | 46 | Eni | 5.66 |
| 22 | Nestle | 2.55 | 47 | ING Group | 6.16 |
| 23 | Royal Bank of Scotland Group | 2.60 | 48 | Prudential | 6.43 |
| 24 | ABN-AMRO Holding | 2.65 | 49 | Lloyds TSB | 7.68 |
| 25 | BNP Paribas | 2.65 | 50 | AEGON | 8.14 |
a. Calculate the 10th and 90th percentiles
b. Calculate the first, second, and third quartiles
c. Find the median
Answers
a. In this example n = 50. The position of the yth percentile (Py) is given by Ly = (n + 1)(y/100).
For the 10th percentile: L10 = (50 + 1)(10/100) = 5.1
L10 lies between the 5th and 6th observations, with values X5 = 0.26 (Credit Suisse Group) and X6 = 1.09 (L’Oreal). The estimate of the 10th percentile (first decile) for the dividend yield is
P10 ≈ X5 + (L10 – 5)(X6 – X5) = 0.26 + (5.1 – 5)(1.09 – 0.26) = 0.34%
For the 90th percentile: L90 = (50 + 1)(90/100) = 45.9
L90 lies between the 45th and 46th observations, with values X45 = 5.15 and X46 = 5.66. The estimate of the 90th percentile is
P90 ≈ X45 + (L90 – 45)(X46 – X45) = 5.15 + (45.9 – 45)(5.66 – 5.15) = 5.61%
Note: In the calculation above, L10 = 5.1 means that the 10th percentile lies (5.1 – 5) = 10% of the way from the 5th to the 6th observation. The distance between those two observations is 1.09 – 0.26 = 0.83, and 10% of that distance is 0.083. We obtain P10 by adding 0.083 to the observation just before L10 (i.e. X5). The calculation for P90 follows the same steps.
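The position-and-interpolate procedure above can be sketched in Python. This is a minimal illustration, not code from the curriculum: the function name `percentile` and the clamping of positions that fall outside the data are assumptions added for the example.

```python
import math

# Dividend yields (%) from the table above, already sorted in ascending order.
YIELDS = [
    0.00, 0.00, 0.00, 0.00, 0.26, 1.09, 1.27, 1.33, 1.36, 1.39,
    1.41, 1.51, 1.75, 1.81, 1.92, 2.01, 2.16, 2.27, 2.27, 2.39,
    2.49, 2.55, 2.60, 2.65, 2.65, 2.65, 2.95, 3.11, 3.31, 3.34,
    3.53, 3.59, 3.66, 3.67, 3.68, 3.78, 3.87, 3.88, 4.06, 4.27,
    4.28, 4.45, 4.68, 5.13, 5.15, 5.66, 6.16, 6.43, 7.68, 8.14,
]


def percentile(sorted_data, y):
    """Estimate the y-th percentile from the position Ly = (n + 1) * y / 100."""
    n = len(sorted_data)
    position = (n + 1) * y / 100      # 1-based position, may be fractional
    lower = math.floor(position)      # observation just before the position
    # Assumption: positions outside the data are clamped to the nearest end point.
    if lower < 1:
        return sorted_data[0]
    if lower >= n:
        return sorted_data[-1]
    fraction = position - lower       # share of the gap to the next observation
    x_lower = sorted_data[lower - 1]  # convert 1-based position to 0-based index
    x_upper = sorted_data[lower]
    return x_lower + fraction * (x_upper - x_lower)


p10 = percentile(YIELDS, 10)
p90 = percentile(YIELDS, 90)
print(f"P10 = {p10:.2f}%, P90 = {p90:.2f}%")
```

The two values agree with the hand calculation: about 0.34% for the 10th percentile and 5.61% for the 90th.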
b. The first, second, and third quartiles correspond to P25, P50, and P75 respectively.
L25 = (50 + 1)(25/100) = 12.75
L50 = (50 + 1)(50/100) = 25.50
L75 = (50 + 1)(75/100) = 38.25
Interpolating between the neighbouring observations in the same way as for the 10th and 90th percentiles, we obtain the following results:
P25 = Q1 = 1.69%
P50 = Q2 = 2.65%
P75 = Q3 = 3.93%
c. The median is the 50th percentile, 2.65%.
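As a cross-check, the quartiles and the median can be reproduced with Python's standard `statistics` module. Its `quantiles` function with `method="exclusive"` places the cut points at position (n + 1)·i/4 and interpolates linearly, which matches the equation used in this post. The variable names are illustrative only.

```python
import statistics

# Dividend yields (%) from the table above.
YIELDS = [
    0.00, 0.00, 0.00, 0.00, 0.26, 1.09, 1.27, 1.33, 1.36, 1.39,
    1.41, 1.51, 1.75, 1.81, 1.92, 2.01, 2.16, 2.27, 2.27, 2.39,
    2.49, 2.55, 2.60, 2.65, 2.65, 2.65, 2.95, 3.11, 3.31, 3.34,
    3.53, 3.59, 3.66, 3.67, 3.68, 3.78, 3.87, 3.88, 4.06, 4.27,
    4.28, 4.45, 4.68, 5.13, 5.15, 5.66, 6.16, 6.43, 7.68, 8.14,
]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = statistics.quantiles(YIELDS, n=4, method="exclusive")

# With an even number of observations the median is the average of the
# two middle values (the 25th and 26th here).
median = statistics.median(YIELDS)

print(f"Q1 = {q1:.3f}%, Q2 = {q2:.3f}%, Q3 = {q3:.3f}%, median = {median:.3f}%")
```

The unrounded third quartile is about 3.925%, which the worked answer reports as 3.93%.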
2. Range, Mean Absolute Deviation, Variance, Standard Deviation, and Chebyshev's Inequality
Range is the distance between the largest and the smallest value in a data set
range = max value – min value
The Mean Absolute Deviation (MAD) is the average of the absolute values of the deviations of individual observations from the arithmetic mean
Population Variance (σ²) is the average of squared deviations from the mean.
Population Standard Deviation (σ) is a measure of the dispersion of a set of data from its mean. The more spread apart the data, the higher the deviation. Standard deviation is calculated as the square root of variance.
Example: Find MAD, variance, and standard deviation of the following set of investment returns [5%, 15%, 22%, 12%, 7%]
Mean = (5 + 15 + 22 + 12 + 7)/5 = 12.2%
MAD = (|5 – 12.2| + |15 – 12.2| + |22 – 12.2| + |12 – 12.2| + |7 – 12.2|)/5 = 5.04%
This result can be interpreted to mean that, on average, an individual return deviates by 5.04 percentage points (in either direction) from the mean return of 12.2%.
Variance = σ² = [(5 – 12.2)² + (7 – 12.2)² + (12 – 12.2)² + (15 – 12.2)² + (22 – 12.2)²]/5 = 36.56 (%²)
Standard Deviation = σ = 6.05%
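The example above can be checked with a few lines of Python. This is a sketch that treats the five returns as a full population, as the example does; the variable names are illustrative.

```python
import statistics

returns = [5.0, 15.0, 22.0, 12.0, 7.0]  # investment returns in percent

value_range = max(returns) - min(returns)        # largest minus smallest value
mean = statistics.fmean(returns)                 # arithmetic mean
# Mean absolute deviation: average absolute distance from the mean.
mad = sum(abs(r - mean) for r in returns) / len(returns)
variance = statistics.pvariance(returns)         # population variance (divide by n)
std_dev = statistics.pstdev(returns)             # square root of population variance

print(f"range = {value_range:.1f}, mean = {mean:.2f}, MAD = {mad:.2f}")
print(f"variance = {variance:.2f}, standard deviation = {std_dev:.2f}")
```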
Sample variance (s²) is the measure of dispersion that applies when we evaluate a sample of n observations from a population. The sum of squared deviations from the sample mean is divided by n – 1 rather than n.
Sample Standard Deviation (s) is the square root of the sample variance.
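To see the effect of dividing by n – 1, the sketch below reuses the five returns from the earlier example and assumes, purely for illustration, that they are a sample rather than the whole population.

```python
import statistics

returns = [5.0, 15.0, 22.0, 12.0, 7.0]  # same returns, now treated as a sample

population_variance = statistics.pvariance(returns)  # divides by n
sample_variance = statistics.variance(returns)       # divides by n - 1
sample_std_dev = statistics.stdev(returns)           # square root of sample variance

print(f"population variance = {population_variance:.2f}")
print(f"sample variance     = {sample_variance:.2f}")
print(f"sample std dev      = {sample_std_dev:.2f}")
```

The sample variance is always the larger of the two, and the gap shrinks as n grows.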
Chebyshev's Inequality states that for any set of observations, whether sample or population data and regardless of the shape of the distribution, the proportion of observations that lie within k standard deviations of the mean is at least 1 – 1/k² for all k > 1.
According to Chebyshev's Inequality, the following relationships hold for any distribution. At least:
- 36% of observations lie within ± 1.25 standard deviations of the mean
- 56% of observations lie within ± 1.50 standard deviations of the mean
- 75% of observations lie within ± 2 standard deviations of the mean
- 89% of observations lie within ± 3 standard deviations of the mean
- 94% of observations lie within ± 4 standard deviations of the mean
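The percentages in the list come straight from 1 – 1/k². A short sketch (the function name `chebyshev_lower_bound` is an illustrative choice):

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's inequality is only informative for k > 1")
    return 1 - 1 / k**2


k_values = (1.25, 1.5, 2, 3, 4)
bounds = [chebyshev_lower_bound(k) for k in k_values]

for k, bound in zip(k_values, bounds):
    print(f"k = {k}: at least {bound:.0%} of observations")
```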
3. Coefficient of Variation and Sharpe Ratio
Coefficient of Variation (CV) is a statistical measure of the dispersion of data points in a data series around the mean, calculated as the standard deviation divided by the mean. In the investing world, the coefficient of variation allows you to determine how much volatility (risk) you are assuming in comparison to the amount of return you can expect from your investment.
Example: The mean monthly return on T-bills (usually taken to represent the risk-free rate) is 0.25% with a standard deviation of 0.36%, and the mean monthly return on the S&P 500 is 1.09% with a standard deviation of 7.3%. Calculate and interpret the CVs of these two investments.
CV(T-bills) = 0.36/0.25 = 1.44
CV(S&P 500) = 7.3/1.09 = 6.70
The results indicate that there is less dispersion (risk) per unit of monthly return for T-bills than for the S&P 500.
The Sharpe Ratio (reward-to-variability ratio) measures excess return per unit of risk: the portfolio's mean return minus the risk-free rate, divided by the standard deviation of the portfolio's returns. Portfolios with large positive Sharpe ratios are preferred to portfolios with smaller ratios.
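The CV comparison from the T-bill and S&P 500 example can be reproduced as follows. The helper name `coefficient_of_variation` is illustrative.

```python
def coefficient_of_variation(std_dev, mean):
    """Risk per unit of mean return: standard deviation divided by the mean."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return std_dev / mean


# Monthly figures (in percent) from the example above.
cv_tbills = coefficient_of_variation(std_dev=0.36, mean=0.25)
cv_sp500 = coefficient_of_variation(std_dev=7.3, mean=1.09)

print(f"CV T-bills = {cv_tbills:.2f}, CV S&P 500 = {cv_sp500:.2f}")
```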
Note: Limitations of the Sharpe Ratio
- If two portfolios have negative Sharpe ratios, it is not necessarily true that the higher Sharpe ratio means better risk-adjusted performance.
- The Sharpe ratio is useful when standard deviation is an appropriate measure of risk. However, investment strategies with option characteristics have asymmetric return distributions (i.e. large probability of small gains and small probability of large losses). In such cases, standard deviation may underestimate risk and produce high Sharpe ratios.
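A minimal sketch of the Sharpe ratio, assuming the usual definition (mean return minus risk-free rate, divided by the standard deviation of returns). The S&P 500 figure uses the monthly numbers from the CV example, with the T-bill return as the risk-free rate. Portfolios A and B are hypothetical and exist only to illustrate the first limitation.

```python
def sharpe_ratio(mean_return, risk_free_rate, std_dev):
    """Excess return over the risk-free rate per unit of standard deviation."""
    return (mean_return - risk_free_rate) / std_dev


# S&P 500 monthly figures (percent) from the earlier example.
sharpe_sp500 = sharpe_ratio(mean_return=1.09, risk_free_rate=0.25, std_dev=7.3)

# Hypothetical portfolios with the same negative excess return (-1.00%)
# but different risk: B is twice as volatile as A.
sharpe_a = sharpe_ratio(mean_return=-0.75, risk_free_rate=0.25, std_dev=5.0)
sharpe_b = sharpe_ratio(mean_return=-0.75, risk_free_rate=0.25, std_dev=10.0)

print(f"S&P 500: {sharpe_sp500:.3f}")
print(f"A: {sharpe_a:.2f}, B: {sharpe_b:.2f}")
```

Portfolio B ends up with the higher (less negative) Sharpe ratio even though it took on twice the risk to earn the same loss, which is why negative Sharpe ratios should not be ranked mechanically.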
4. Skewness and Kurtosis
A distribution is symmetrical if it is shaped identically on both sides of its mean. In finance, it means that intervals of losses and gains will exhibit the same frequency.
Skewness refers to the extent to which a distribution is not symmetrical. This depends on the occurrence of outliers in the data set. Outliers are observations with extraordinarily large values, either positive or negative.
- A positively skewed distribution is characterized by many outliers in the upper region (right tail).
- A negatively skewed distribution has many outliers in the lower region (left tail).
Kurtosis is a measure of the degree to which a distribution is more or less "peaked" than a normal distribution.
- Leptokurtic - more peaked than a normal distribution
- Platykurtic - flatter than a normal distribution
- Mesokurtic - same kurtosis as a normal distribution
The kurtosis for normal distribution is 3. If a distribution has more or less kurtosis than the normal distribution, it is said to exhibit excess kurtosis.
- Normal distribution has excess kurtosis = 0
- Leptokurtic distribution has excess kurtosis > 0
- Platykurtic distribution has excess kurtosis < 0
To find out the skewness of a sample, apply the following formula
Note: if |SK| > 0.5, the distribution has a significant level of skewness
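A minimal sketch of one commonly used sample skewness formula, SK = [n / ((n – 1)(n – 2))] × Σ(Xi – X̄)³ / s³, where s is the sample standard deviation. This particular small-sample adjustment and the function name `sample_skewness` are assumptions for illustration; other texts use slightly different scaling.

```python
import statistics


def sample_skewness(data):
    """Sample skewness with the n / ((n - 1)(n - 2)) adjustment (assumed form)."""
    n = len(data)
    if n < 3:
        raise ValueError("at least three observations are required")
    mean = statistics.fmean(data)
    s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    cubed_deviations = sum((x - mean) ** 3 for x in data)
    return (n / ((n - 1) * (n - 2))) * cubed_deviations / s**3


returns = [5.0, 15.0, 22.0, 12.0, 7.0]    # returns (%) from the earlier example
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]     # evenly spaced, so no skew

sk_returns = sample_skewness(returns)
sk_symmetric = sample_skewness(symmetric)

print(f"skewness of returns = {sk_returns:.3f}")
print(f"skewness of symmetric data = {sk_symmetric:.3f}")
```

The returns come out positively skewed because the 22% observation sits further above the mean than any observation sits below it. With only five data points the estimate is rough.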
Sample Kurtosis is measured using the following formula
The sample kurtosis is measured relative to the kurtosis of a normal distribution, which is 3.
Excess Kurtosis = Sample Kurtosis – 3
If excess kurtosis > 0, the distribution is leptokurtic (more peaked, fat tails)
If excess kurtosis < 0, the distribution is platykurtic (less peaked, thin tails)
Excess kurtosis > 1 in absolute value is considered large.
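A minimal sketch of excess kurtosis using the plain moment-based definition: the fourth central moment divided by the squared variance, minus 3. This is an assumed formula for illustration; sample formulas with small-sample corrections give somewhat different values. Both data sets below are hypothetical.

```python
def excess_kurtosis(data):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (no small-sample correction)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("kurtosis is undefined when all observations are equal")
    return m4 / m2**2 - 3


def classify(excess):
    """Label a distribution using the sign of its excess kurtosis."""
    if excess > 0:
        return "leptokurtic"
    if excess < 0:
        return "platykurtic"
    return "mesokurtic"


evenly_spread = [1, 2, 3, 4, 5]                    # hypothetical, no extreme values
with_outliers = [-10, 0, 0, 0, 0, 0, 0, 0, 0, 10]  # hypothetical, two extremes

for name, data in (("evenly spread", evenly_spread), ("with outliers", with_outliers)):
    ek = excess_kurtosis(data)
    print(f"{name}: excess kurtosis = {ek:.2f} ({classify(ek)})")
```

In the second data set the two extreme observations drive the fourth moment, so the excess kurtosis is positive.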
The graphs showing different kurtosis really just show different variances. Also, kurtosis measures tails (outliers) only, not "peakedness" or "flatness."