
Quantitative Methods (with notes) Final

1
2
3
4
Even if we could observe all the members of a population, it is often costly to do
so in terms of time, money, and/or computing power. The additional benefit
gained is not typically worth the additional cost. The field of statistics is
specifically designed to alleviate the need to observe an entire population.
5
Although it is possible to know parameters, it is most often the case that we don’t know
them with certainty. The field of inferential statistics is devoted to inferring population
parameters from sample characteristics.
6
In order to create valid statistical inferences, we must choose methods that are
valid for the data we use. The underlying nature of the data affects the choice of
statistical modeling techniques.
Nominal Scales
Nominal scales are purely categorical, and you cannot perform mathematical
transformations on them nor make inferences based on their values without
further statistical analysis. They are not rank‐ordered. This is the weakest form of
measurement scale. An assigned nominal value of 1 is not necessarily “better”
than an assigned nominal value of 2, nor is the value of 2 twice that of 1.
Ordinal Scales
Ordinal scales are ranking scales wherein we can determine which ranks are
“better,” but not necessarily by how much. For example, with an ordinal scale, a 2
may be better than a 1, but we cannot necessarily say that 2 is twice as good as
1. We cannot perform mathematical transformation on data of this type either.
Interval Scales
Interval scales provide both ranking and equal differences between scale values;
hence, continuing our prior example, 2 is better than 1 and 2 is a single unit better
than 1 but still not necessarily twice as good. With interval scales, the zero point
of the scale does not indicate the absence of the attribute. An easy example of
such a scale is temperature. Without a true zero point, we cannot form ratios.
Ratio Scales
Ratio scales are interval scales with a true zero point. We can perform a variety of
mathematical and statistical operations on such scales.
The “strength” of a scale reflects how meaningfully we can analyze the data using
mathematics and statistics. It increases with the “informativeness” of the data in a
mathematical sense: the more operations that can validly be applied, the more
informative (and stronger) the scale.
7
We will use these returns for the next several slides.
Equation (on slide):
Pt = Price per share at the end of time period t
Pt − 1 = Price per share at the end of time period t − 1, the time period
immediately preceding time period t
Dt = Cash distributions received during time period t
Two important features of this specific return calculation: 1) an element of time
is associated with it, and 2) the rate of return has no unit (such as a currency)
associated with it.
8
9
10
Most statistical software and mathematical analysis packages, including Excel, will
automatically form frequency distributions for the user. The user can typically specify
the values in Steps 3 and 4 (functionally related to 3).
Guidelines for setting k vary, but generally speaking, the balance is between
summarizing the data enough to be useful, but also not losing relevant characteristics.
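The construction can be sketched as follows, assuming a simple equal-width interval scheme; the data and the choice k = 4 are hypothetical:

```python
def frequency_distribution(data, k):
    """Build a frequency distribution with k equal-width intervals.
    Each interval is [lo, hi); the last interval also includes the maximum."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / k
    counts = [0] * k
    for x in data:
        i = min(int((x - lo) / width), k - 1)  # clamp the maximum into the last bin
        counts[i] += 1
    bounds = [(lo + j * width, lo + (j + 1) * width) for j in range(k)]
    return bounds, counts

# Hypothetical returns (percent):
returns = [-4.6, -0.5, 1.2, 2.8, 3.1, 3.9, 4.4, 5.0, 6.6, 7.2, 9.8, 11.4]
bounds, counts = frequency_distribution(returns, k=4)
print(counts)  # every observation falls in exactly one interval
```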
11
12
This example uses holding period returns to demonstrate forming a frequency
table.
For interval width, we show that [11.43 − (−4.57)]/4 = 16/4 = 4.
Ideally, we’d like for the intervals in our distribution to be easy to identify and
understand. That likely means using integers as interval endpoints, ensuring that
we have few empty intervals (preferably, none). If possible, it also means
ensuring that our distribution “breaks” at natural points. A good example of this
with returns is to ensure the distribution breaks at “zero” so we can classify
returns as positive or negative.
Ultimately, we want the frequency distribution to give us an idea of the
centrality, shape, and dispersion of the data. We should have an idea of where
most of the observations lie and the relative distribution of observations across
the possible range of values.
13
The prior example has been continued for ease of exposition.
Relative frequency is the absolute frequency value for a given interval divided by the
total number of observations (in this case 12).
Cumulative relative frequency is the sum of the relative frequency value for that cell
and the prior cells in the relative frequency column; hence,
0.583 = (0.250 + 0.333), and so on.
This is a good moment to analyze the table in the slide for the properties mentioned in
the prior slide. The frequency distribution fails on several counts, most notably not
having a “natural” break at zero and not using integer interval cutoffs. Given the
underlying data, this could easily have been fixed without changing the counts at all
(there are no observations between −0.57 and 0). Cutoffs giving the same distribution
are:
−5 ≤ observation < 0
0 ≤ observation < 4
4 ≤ observation < 7
7 ≤ observation ≤ 12
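A short sketch of the relative and cumulative relative frequency calculations; the absolute counts are hypothetical but chosen to reproduce the 0.583 figure discussed above:

```python
counts = [3, 4, 3, 2]            # hypothetical absolute frequencies per interval
n = sum(counts)                  # 12 observations in total

# Relative frequency: each interval's count divided by the total.
relative = [c / n for c in counts]

# Cumulative relative frequency: running sum of the relative column.
cumulative = []
running = 0.0
for rel in relative:
    running += rel
    cumulative.append(running)

print([round(x, 3) for x in relative])    # [0.25, 0.333, 0.25, 0.167]
print([round(x, 3) for x in cumulative])  # second entry is 0.583 = 0.250 + 0.333
```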
14
A histogram presents the data from a frequency distribution as a graphical
representation of ordered data and magnitude by interval classification. It is designed to
provide a more visual and intuitive sense of the centrality and dispersion of the data. It
should be noted that the user should ensure that the intervals are ordered in such a
manner that the histogram preserves the ordering of the original data.
Most mathematical and statistical packages, including Excel, will produce histograms.
15
We construct a frequency polygon by plotting each interval’s midpoint at the
height of its absolute frequency and connecting the points with straight lines. In
essence, we are just replacing the tops of the bars with line segments. This tool
is quite useful when we have a large number of categories.
Frequency polygons have a higher visual continuity than histograms. Steep slopes
indicate higher rates of change as you move from one interval to the next, and
shallow slopes indicate lower rates of change. In the cumulative absolute
frequency polygon, the slope over an interval is proportional to the number of
observations in that interval. The user should note that a frequency polygon, like
a histogram, needs ordered data along its x‐axis.
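As a small illustration of the construction (the interval bounds and frequencies are hypothetical), the polygon’s vertices are the interval midpoints plotted at the frequency heights:

```python
# Hypothetical interval bounds and absolute frequencies.
bounds = [(-5, 0), (0, 4), (4, 7), (7, 12)]
freqs = [3, 4, 3, 2]

# Vertices of the frequency polygon: (interval midpoint, frequency).
# These points would then be connected with straight line segments.
vertices = [((lo + hi) / 2, f) for (lo, hi), f in zip(bounds, freqs)]
print(vertices)  # [(-2.5, 3), (2.0, 4), (5.5, 3), (9.5, 2)]
```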
16
For the arithmetic mean, please take note of the difference in notation: the Greek
letter for the population parameter, with capital letters for the variable and count;
X‐bar for the sample statistic, with lower‐case letters for the variable and count. You
may find the center‐of‐gravity analogy useful for providing the intuition behind the
mean as a calculation.
Arithmetic means are by far the most commonly used statistic in the investments arena
and are generally viewed as a measure of the typical outcome.
Population (sample) arithmetic means are means calculated for a population (sample).
Note that we don’t normally expect the arithmetic mean to equal the value of any of the
observations.
Arithmetic means have a number of statistical properties, including sensitivity to outliers
(weakness) and use of all available data about the magnitude of the observations
(strength).
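A quick sketch of both properties noted above, using Python’s standard library; the sample values are hypothetical:

```python
from statistics import mean

sample = [2.0, 3.0, 4.0, 5.0]
print(mean(sample))  # 3.5: the mean need not equal any observation

# Sensitivity to outliers: a single extreme value pulls the mean sharply.
with_outlier = sample + [100.0]
print(mean(with_outlier))  # 22.8
```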
17
The MSCI EAFE Index is designed to represent the performance of large and mid‐
cap securities across 21 developed markets, including countries in Europe,
Australasia and the Far East, excluding the U.S. and Canada. The Index is available
for a number of regions, market segments/sizes and covers approximately 85% of
the free float‐adjusted market capitalization in each of the 21 countries.
18
This is a visualization of the mean. The fulcrum in this diagram is placed at the
mean as calculated on the prior slide. Note that the fulcrum sits slightly to the
left of the geometric center of the distribution (near Norway’s vertical bar, at
–29.72%) because some of the countries on the left had large negative returns.
19
20
Weighted mean point
With a market‐value strategy, or constant proportions strategy, you can compute the
mean first and then apply the weights, or apply the weights first and then compute
the mean.
Weighted averages occur throughout finance in all areas, including corporate finance
(weighted average cost of capital) and investments (portfolio returns).
A key feature of weighted averages is that the weights must sum to 1 (they can,
however, be negative, depending on the application). A weighted mean in which the
weights are probabilities is an expected value.
Geometric mean point
Geometric means are most commonly used with rates of return, rates of change over
time, growth rates, and so on. You can substitute R with either of these rates.
One problem we encounter with geometric returns (or one advantage of them over
arithmetic) lies in how we handle negative returns. In using geometric mean returns, we
normally add 1 to each return before taking the product and root and then subtract 1
from the resulting root.
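Both calculations can be sketched as follows; the return and weight values are hypothetical:

```python
from math import prod

def weighted_mean(values, weights):
    """Weighted mean; the weights must sum to 1 (they may be negative)."""
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(w * v for w, v in zip(weights, values))

def geometric_mean_return(returns):
    """Add 1 to each return, take the n-th root of the product, subtract 1."""
    n = len(returns)
    return prod(1 + r for r in returns) ** (1 / n) - 1

# Hypothetical 60/40 portfolio of two asset returns:
print(round(weighted_mean([0.10, 0.04], weights=[0.6, 0.4]), 3))  # 0.076
# A +10% then -10% sequence does not average to zero geometrically:
print(geometric_mean_return([0.10, -0.10]))  # about -0.005
```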
21
Note that the weights must sum to 1. If we have only long positions, the weights must
also all be positive, but with negative positions, the weights could also be negative.
22
23
Median
The largest advantage associated with medians is a lack of sensitivity to extremely large
values (outliers). If you suspect that the large values are a result of mismeasurement in
the data or the inclusion of nonrepresentative units of analysis (sample contamination),
then median is probably a more appropriate measure of centrality than mean. It will
almost always be more appropriate when you have skewed data.
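A minimal illustration of this robustness, with hypothetical data:

```python
from statistics import mean, median

returns = [1.0, 2.0, 3.0, 4.0, 90.0]  # one suspiciously large value
print(mean(returns))    # 20.0: dragged far upward by the outlier
print(median(returns))  # 3.0: unaffected by the extreme value
```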
24
25
26
Population variance and sample variance are the two most widely used measures of
dispersion; they have very nice mathematical and statistical properties, particularly in
large samples.
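The two measures differ only in the divisor (N for the population version, n − 1 for the sample version); a quick sketch with hypothetical data:

```python
from statistics import pvariance, variance

data = [2.0, 4.0, 6.0, 8.0]
print(pvariance(data))  # population variance: divides by N -> 5.0
print(variance(data))   # sample variance: divides by n - 1 -> about 6.67
```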
27
28
These measures arise in large part because it is difficult to compare means and standard
deviations across different samples or portfolios. They are both measures of relative
dispersion. Each expresses the magnitude of dispersion with respect to a common point.
In the case of the coefficient of variation, that point is the mean of the observations. In
the case of the Sharpe Ratio, that point is the mean of the returns above a risk‐free
return. BOTH ARE SCALE FREE, and thus provide ease of use in comparing dispersion
among datasets with different distributions.
The Sharpe Ratio plays a prominent role in much of investment analysis, including the
optimization of risky asset allocation in modern portfolio theory (more in Chapter 11). It
is named after William Sharpe, a Nobel prize–winning economist, and is often used as a
portfolio performance measurement tool.
Two cautions in using the Sharpe Ratio: Negative Sharpe Ratios have a counterintuitive
interpretation (increasing risk increases the Sharpe Ratio), so comparisons of negative
and positive Sharpe Ratios should be avoided. The Sharpe Ratio also focuses on only one
measure of risk: standard deviation. It will work well for portfolios with roughly
symmetrical returns, but not so well for portfolios without them, including those with
embedded options. Users of the Sharpe Ratio should ensure that it is an appropriate
tool to assess a specific strategy or manager.
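Both relative-dispersion measures can be sketched as follows; the return series and risk-free rate are hypothetical:

```python
from statistics import mean, stdev

def coefficient_of_variation(xs):
    """Standard deviation per unit of mean: a scale-free dispersion measure."""
    return stdev(xs) / mean(xs)

def sharpe_ratio(returns, risk_free):
    """Mean return in excess of the risk-free rate, per unit of volatility."""
    return (mean(returns) - risk_free) / stdev(returns)

# Hypothetical portfolio returns and risk-free rate:
rets = [0.10, 0.05, 0.15, 0.02]
print(round(coefficient_of_variation(rets), 3))     # about 0.714
print(round(sharpe_ratio(rets, risk_free=0.02), 3)) # about 1.05
```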
29
30
31
Often known as “higher moments,” skewness (third moment) and kurtosis (fourth
moment; see next slide) both appear in the finance literature. Skewness has a
decidedly larger presence; kurtosis, the degree of “peakedness,” is much less
prominent.
Skewness captures the degree of symmetry of the dispersion around the mean. If a
distribution has significantly more values on one side of the mean than the other, it is
said to be skewed. A distribution with most values close to or below the mean and a
few very large values far above the mean is positively skewed (skewed right). A
distribution with most values close to or above the mean and a few very small values
far below the mean is negatively skewed (skewed left). The direction of skewness
(left/right, negative/positive) always refers to the “long tail” of the distribution.
32
A leptokurtic distribution (positive excess kurtosis) is more peaked than the normal
distribution, with more observations both close to the mean and far out in the tails;
such distributions are often described as having “fat tails.”
A mesokurtic distribution has peakedness equal to the normal distribution.
A platykurtic distribution (negative excess kurtosis) is less peaked than the normal
distribution. It is more evenly distributed across the range of possible values.
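A sketch of both higher moments using the population-moment convention (skew = m3/m2^1.5, excess kurtosis = m4/m2^2 − 3; sample-adjusted formulas differ slightly); the data are hypothetical:

```python
from statistics import fmean

def higher_moments(xs):
    """Skewness and excess kurtosis via population central moments."""
    m = fmean(xs)
    n = len(xs)
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5               # > 0: long right tail
    excess_kurtosis = m4 / m2 ** 2 - 3  # > 0: leptokurtic ("fat tails")
    return skew, excess_kurtosis

# Most values sit at or below the mean, one far above: right (positive) skew.
skew, ek = higher_moments([1, 2, 2, 3, 3, 3, 10])
print(skew > 0)  # True
```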
33
The underlying foundation of statistically based quantitative analysis lies with the
concepts of a sample versus a population.
• We use sample statistics to describe the sample and to infer information
about its associated population.
• Descriptive statistics for samples and populations include measures of
centrality and dispersion, such as mean and variance, respectively.
• We can combine traditional measures of return (such as mean) and risk (such
as standard deviation) to measure the combined effects of risk and return
using the Sharpe Ratio.
The normal distribution is of central importance in investments, and as a result, we
often compare statistical properties, such as skewness and kurtosis, with those of the
normal distribution.
34
35
Two key assumptions underlie portfolio theory:
•Investors want to maximize returns for a given level of risk. If an investor is
given a choice of 2 assets with equal levels of risk, they will choose the asset
with the higher expected rate of return. Risk here is defined as the uncertainty
of future outcomes or the probability of an adverse outcome.
•Investors are generally risk averse. If an investor is given the choice of 2 assets
with equal expected rates of return, then risk aversion results in the investor
selecting the investment with the lower perceived level of risk. There are
investors who might not be risk averse, although usually some risk aversion is
combined with risk preference.
Historical evidence over the long run has shown that most investors are risk
averse, which means there is a positive relationship between expected return
and expected risk.
36
Mean–variance analysis is the fundamental implementation of modern portfolio theory
•Describes the optimal allocation of assets between risky and risk‐free assets
when the investor knows the expected return and standard deviation of those
assets.
37
Over half a century has passed since Professor Harry Markowitz established the
tenets of mean‐variance analysis, or capital market theory, the focal point of
which is the efficient frontier.
Several assumptions underlie mean‐variance analysis. The assumptions establish
a uniformity of investors, which greatly simplifies the analysis.
38
Mean‐variance analysis is used to identify optimal or efficient portfolios. Before
we can discuss the implications of efficient portfolios, however, we must first be
able to understand and calculate portfolio expected returns and standard
deviations.
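For the two-asset case, the portfolio expected return is the weighted average of the individual expected returns, while portfolio variance depends on the weights, the individual standard deviations, and their correlation. A sketch with hypothetical inputs:

```python
from math import sqrt

def portfolio_stats(w1, r1, r2, s1, s2, corr):
    """Expected return and standard deviation of a two-asset portfolio.
    w1 is the weight in asset 1; the remainder (1 - w1) goes to asset 2."""
    w2 = 1 - w1
    exp_ret = w1 * r1 + w2 * r2
    variance = (w1**2 * s1**2 + w2**2 * s2**2
                + 2 * w1 * w2 * corr * s1 * s2)
    return exp_ret, sqrt(variance)

# Hypothetical 40%/60% split between two stocks:
er, sd = portfolio_stats(0.4, r1=0.12, r2=0.18, s1=0.20, s2=0.30, corr=0.3)
print(round(er, 4), round(sd, 4))
```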
39
40
Positively Correlated items tend to move in the same direction
Negatively Correlated items tend to move in opposite directions
Perfectly Positively Correlated describes two positively correlated series having a
correlation coefficient of +1
Perfectly Negatively Correlated describes two negatively correlated series having a
correlation coefficient of ‐1
Uncorrelated describes two series that lack any relationship and have a correlation
coefficient of nearly zero
Assets that are less than perfectly positively correlated tend to offset each other’s
movements, thus reducing the overall risk in a portfolio.
The lower the correlation, the more the overall risk in a portfolio is reduced:
• Assets with +1 correlation eliminate no risk
• Assets with less than +1 correlation eliminate some risk
• Assets with less than 0 correlation eliminate more risk
• Assets with ‐1 correlation eliminate all risk
41
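These bullets can be illustrated with the two-asset portfolio variance formula; all inputs below are hypothetical:

```python
from math import sqrt

def two_asset_sd(w1, s1, s2, corr):
    """Standard deviation of a two-asset portfolio with weight w1 in asset 1."""
    w2 = 1 - w1
    var = w1**2 * s1**2 + w2**2 * s2**2 + 2 * w1 * w2 * corr * s1 * s2
    return sqrt(max(var, 0.0))  # guard tiny negative rounding at corr = -1

s1, s2 = 0.10, 0.20
# Holding a 50/50 split fixed, risk falls as correlation falls:
for corr in (1.0, 0.5, 0.0, -1.0):
    print(corr, round(two_asset_sd(0.5, s1, s2, corr), 4))

# With corr = -1, the weights w1 = s2/(s1 + s2) eliminate risk entirely:
print(two_asset_sd(s2 / (s1 + s2), s1, s2, -1.0))  # effectively 0
```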
Transcript
Hey guys, it’s MJ the Student, and in this video I want to very quickly explain
mean‐variance portfolio theory. I want to give a very quick overview, so we’re not
actually going to get into the mathematics; we’re just going to get a very high‐level
understanding. Now, the nice thing about finance is that the name kind of gives it
away: mean you can think of as return, and variance you can think of as risk. For
portfolio theory, think of two assets that we want to have in our portfolio. These can
be bonds, which are assumed to be low risk and low return, and equity, which is
assumed to be high return and high risk. The general idea was that you could have a
combination of both of these assets: this would be 100% equity, this would be 100%
bonds, and this yellow line indicates something in between, and the risk and return
relationship would follow this yellow line. However, the big breakthrough was that
this is actually not the case. Mean‐variance portfolio theory says that the curve
doesn’t actually look like that: it bends, and this is known as the efficient frontier.
Why are we getting this bend? That was the whole thing, why they won the Nobel
Prize and why everyone says this is such a cool theory: what they showed was that
the variance of the portfolio decreases with a thing known as diversification. By
introducing two different assets (and the important thing here is that they must be
uncorrelated; the less correlated they are, the more you’re going to get this bend, so
if the assets are very correlated you might just get a slight bend), you could reduce
risk and increase return. This went against the whole general understanding that in
order to get more return, you need to take on more risk. And that’s basically it, very
simply explained. Go check out a whole bunch of other videos on YouTube to get a
more in‐depth understanding of the maths behind it, but I hope this clears it up very
quickly for you guys. Thanks so much for watching.
42
43
44
45
46
In this AIA and Prudential example, we calculated the expected return and
standard deviation of one possible combination: 40% in AIA and 60% in
Prudential. An infinite number of combinations of the two stocks are possible,
however. We can plot these combinations on a graph with expected return on
the y‐axis and standard deviation on the x‐axis, commonly referred to as plotting
in risk/return “space.”
47
The plot represents all possible expected return and standard deviation
combinations attainable by investing in varying amounts of AIA and PRU.
48
There are several things to notice about this plot.
1) If 100% of the portfolio is allocated to Prudential, the portfolio will have the
expected return and standard deviation of Prudential (i.e., Prudential is the
portfolio), and the investment return and risk combination is at the uppermost
end of the curve.
49
2) As the investment in Prudential is decreased and the investment in AIA is
increased, the investment moves down the curve to the point where the
portfolio’s expected return is 15.3% and its standard deviation is 13.6%.
(Labelled as Point C.)
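The curve traced out by varying the weights can be sketched as follows; the expected returns, standard deviations, and correlation below are hypothetical stand-ins, not the AIA/Prudential figures from the slides:

```python
from math import sqrt

# Hypothetical inputs standing in for the two stocks (not the slide's data).
r_a, r_p = 0.12, 0.18      # expected returns
s_a, s_p = 0.20, 0.30      # standard deviations
corr = 0.3

curve = []
for i in range(101):       # weight in stock A swept from 0% to 100%
    w = i / 100
    er = w * r_a + (1 - w) * r_p
    var = w**2 * s_a**2 + (1 - w)**2 * s_p**2 + 2 * w * (1 - w) * corr * s_a * s_p
    curve.append((sqrt(var), er))   # (standard deviation, expected return)

# The left-most point of the curve is the minimum-variance combination.
min_risk_point = min(curve)
print(min_risk_point)
```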
50
Minimum‐variance portfolio
With multiple assets, there is an infinite number of possible weights that can be used to
achieve any specific level of return, but only one set of weights gives the smallest
possible variance for that level of return. The portfolio formed by that set of weights is
the minimum‐variance portfolio for that level of return.
Minimum‐variance frontier
Because our investors are risk averse, at any fixed level of return they will prefer to
hold the asset combination with those minimum‐variance weights. When we determine
the set of minimum‐variance weights for all possible levels of return, we have
determined the weights for all the portfolios on the minimum‐variance frontier.
51
Portfolios on the efficient frontier provide the highest possible level of return for a given
level of risk.
The efficient frontier is an extremely useful portfolio management tool. Once the
investor’s risk tolerance is determined and quantified in terms of variance or standard
deviation, the optimal portfolio for the investor can be easily identified.
52
In mean‐variance analysis, we use the expected returns, variances, and covariances of
individual investment returns to analyse the risk‐return tradeoff of combinations
(portfolios) of individual investments.
The minimum variance frontier is a graph drawn in risk‐return space of the set of
portfolios that have the lowest variance at each level of expected return.
The efficient frontier is the positively sloped portion of the minimum‐variance frontier.
Portfolios on the efficient frontier have the highest expected return at each given level
of risk.
53