Lesson 13 - hedge fund analysis

Last Update
11th May 2011
SESSION 39 & 40
Continuous Probability Distributions
Lecturer: Florian Boehlandt
University: University of Stellenbosch Business School
Domain: http://www.hedge-fundanalysis.net/pages/vega.php
Learning Objectives
1. Population and Samples
2. Point Estimates vs. Confidence Interval Estimates
3. Calculating Confidence Intervals
Normal Probabilities
Often it may be prohibitively expensive to obtain information on
all members of a population. Thus, market researchers usually
collect information from a sample or sub-set of the population.
The sample statistics (e.g. the sample mean) are calculated and
used to estimate the population parameters (e.g. the population
mean). This process is known as statistical inference.
Notation
The notation for sample statistics and population parameters is
given in the table below:
                     Population    Sample
                     Parameters    Statistics
Size                 N             n
Mean                 μ             x-bar
Standard Deviation   σ             s
Proportion           P             p
Inference
A point estimator draws inferences about the population by
estimating the value of an unknown parameter using a single
value or point.
Sample Statistic -> Point Estimate = Sample Statistic
An interval estimator draws inferences about the population by
estimating the value of an unknown parameter using an interval.
Confidence Interval Estimate -> Unknown Population Parameter
Common confidence intervals include:
- 90%: Weak statistical evidence
- 95%: Strong statistical evidence
- 99%: Overwhelming statistical evidence
Central Limit Theorem
The sampling distribution of the mean of a random sample
drawn from any population is approximately normal for
sufficiently large sample sizes. The larger the sample size, the
more closely the sampling distribution of x-bar will resemble the
normal distribution.
This is an important notion since it allows for using the normal
distribution to describe the dispersion of sample means.
Example: Tossing n dice and recording the average result
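The dice example can be tried out with a short simulation. The following is a minimal sketch, not part of the original slides: the function name sample_means, the number of trials and the seed are assumptions chosen for illustration. It tosses n fair dice many times and records the average of each toss.

```python
import random
import statistics


def sample_means(n_dice, n_trials, seed=0):
    """Toss n_dice fair dice n_trials times; return the average of each toss."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatable runs
    return [
        statistics.fmean(rng.randint(1, 6) for _ in range(n_dice))
        for _ in range(n_trials)
    ]


# A single die has mean 3.5 and standard deviation sqrt(35/12), about 1.708.
for n_dice in (1, 2, 10, 30):
    means = sample_means(n_dice, n_trials=20_000)
    print(
        n_dice,
        round(statistics.fmean(means), 3),   # stays close to 3.5
        round(statistics.pstdev(means), 3),  # shrinks roughly like 1.708 / sqrt(n_dice)
    )
```

As the number of dice grows, the averages should cluster more tightly around 3.5, and a histogram of them should look increasingly bell-shaped.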
Sampling Distribution
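It can be shown that the sampling distribution of X-bar is described as
follows:
$\mu_{\bar{X}} = \mu, \qquad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
If X is normal, X-bar is normal. If X is nonnormal, X-bar is
approximately normal for sufficiently large sample sizes.
So for the sampling distribution, the standardization
$Z = \frac{X - \mu}{\sigma}$
changes to:
$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$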
Example
Suppose that the amount of time to assemble a computer is
normally distributed with a mean μ = 50 minutes and a standard
deviation σ = 10 minutes.
a) What is the probability that one randomly selected computer
is assembled in a time less than 60 minutes?
b) What is the probability that four randomly selected
computers have a mean assembly time of less than 60
minutes?
Solution
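a) $Z = \frac{X - \mu}{\sigma / \sqrt{n}} = \frac{60 - 50}{10 / \sqrt{1}} = 1$
b) $Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{60 - 50}{10 / \sqrt{4}} = 2$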
The associated probabilities are P(Z < 1) = 0.8413 and P(Z < 2) =
0.9772 respectively.
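These two probabilities can be checked numerically. The sketch below uses NormalDist from the Python standard library in place of a printed table; the helper name prob_mean_below is an assumption made for this illustration.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 50.0, 10.0      # population mean and standard deviation (minutes)
standard_normal = NormalDist()  # mean 0, standard deviation 1


def prob_mean_below(limit, n):
    """P(X-bar < limit) for the mean of n assembly times."""
    z_score = (limit - mu) / (sigma / sqrt(n))
    return standard_normal.cdf(z_score)


p_one = prob_mean_below(60, 1)   # a) one computer, Z = 1
p_four = prob_mean_below(60, 4)  # b) mean of four computers, Z = 2
print(round(p_one, 4), round(p_four, 4))
```

The mean of four computers is less dispersed than a single observation, which is why the same 60-minute limit lies two standard errors above the mean in b) but only one standard deviation above it in a).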
Sampling Distribution and
Inference
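The 95% confidence interval (i.e. the area underneath the graph)
for the standard normal distribution is expressed algebraically:
$P(-1.96 < Z < 1.96) = 0.95$
With the definition of Z for the sampling distribution:
$P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} < 1.96\right) = 0.95$
Rearrangement yields:
$P\left(\bar{X} - 1.96 \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96 \frac{\sigma}{\sqrt{n}}\right) = 0.95$
Or for the general case:
$P\left(\bar{X} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$
The term to the left of μ is referred to as the Lower Confidence
Limit (LCL) and the term to the right of μ as the Upper Confidence
Limit (UCL).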
Example
Suppose that the average assembly time across n = 25
computers is X-bar = 50 minutes. In addition, we assume that the
population standard deviation is known and is equal to σ = 10
minutes. What is the 95% confidence interval?
Comment: α = 1 – CL. Here, α = 1 – 0.95 = 0.05 (or 5%). Thus, α/2
= 0.025.
Solution
LCL and UCL:
$\bar{X} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 50 \pm 1.96 \times \frac{10}{\sqrt{25}} = 50 \pm 3.92$
Thus, the LCL = 46.08 and the UCL = 53.92. The interpretation is
straightforward: For n = 25 with σ = 10, we can be 95% confident that the
true population mean μ falls between the LCL = 46.08 and the UCL =
53.92.
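The calculation for this example can be reproduced in a few lines. This is a minimal sketch assuming a known population standard deviation; the function name mean_confidence_interval is hypothetical, and the critical value comes from the standard library's NormalDist.inv_cdf rather than a printed table.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Return (LCL, UCL) for the population mean when sigma is known."""
    alpha = 1.0 - confidence
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for 95%
    margin = z_crit * sigma / sqrt(n)                 # z * standard error
    return x_bar - margin, x_bar + margin


lcl, ucl = mean_confidence_interval(x_bar=50, sigma=10, n=25)
print(round(lcl, 2), round(ucl, 2))
```

The lower limit is always the smaller of the two values, since the margin is subtracted from the sample mean for the LCL and added for the UCL.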
Finding zα/2
Since CL = 0.95, α = 1 – 0.95 = 0.05. Then α / 2 = 0.025. For one half of
the standard normal distribution table, this corresponds to 0.5 – 0.025 =
0.4750 = P(0 < Z < 1.96). Thus, zα/2 = 1.96.
Z      0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07
1.0    0.3413  0.3438  0.3461  0.3485  0.3508  0.3531  0.3554  0.3577
1.1    0.3643  0.3665  0.3686  0.3708  0.3729  0.3749  0.3770  0.3790
1.2    0.3849  0.3869  0.3888  0.3907  0.3925  0.3944  0.3962  0.3980
1.3    0.4032  0.4049  0.4066  0.4082  0.4099  0.4115  0.4131  0.4147
1.4    0.4192  0.4207  0.4222  0.4236  0.4251  0.4265  0.4279  0.4292
1.5    0.4332  0.4345  0.4357  0.4370  0.4382  0.4394  0.4406  0.4418
1.6    0.4452  0.4463  0.4474  0.4484  0.4495  0.4505  0.4515  0.4525
1.7    0.4554  0.4564  0.4573  0.4582  0.4591  0.4599  0.4608  0.4616
1.8    0.4641  0.4649  0.4656  0.4664  0.4671  0.4678  0.4686  0.4693
1.9    0.4713  0.4719  0.4726  0.4732  0.4738  0.4744  0.4750  0.4756
2.0    0.4772  0.4778  0.4783  0.4788  0.4793  0.4798  0.4803  0.4808
[Figure callout: Represents 2.5% of the area underneath the chart]
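The table lookup can also be done in software. The sketch below is an illustration rather than part of the original slides: the helper name z_critical is an assumption, and it relies on NormalDist.inv_cdf from the Python standard library to return the critical value for a given confidence level.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value z_(alpha/2) for a given confidence level."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for cl in (0.90, 0.95, 0.99):
    print(cl, round(z_critical(cl), 2))

# Half-table area between 0 and 1.96, matching the table entry for z = 1.96.
half_area = NormalDist().cdf(1.96) - 0.5
print(round(half_area, 4))
```

The half-table value is obtained by subtracting 0.5 from the cumulative probability, because the printed table only covers the area between 0 and z.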
Normal Approximation of the
Binomial Distribution
The binomial distribution may be approximated using the normal
distribution. A graphical derivation of this is included in most
statistics textbooks and is omitted here. The upside is that the
normal approximation allows us to calculate confidence intervals
for the binomial distribution. It can be shown that the sampling
distribution is described as follows:
$E(\hat{P}) = p, \qquad \sigma_{\hat{P}} = \sqrt{\frac{p(1 - p)}{n}}$
where p-hat is the proportion of successes in a Bernoulli trial
process estimated from the statistical sample and p is the
population proportion of successes.
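The quality of the approximation can be checked numerically. The following sketch compares an exact binomial probability with its normal approximation; the values n = 100, p = 0.5 and k = 55 are made up for illustration, and the continuity correction (adding 0.5) is a common refinement that is assumed here rather than taken from the slides.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_approx_cdf(k, n, p):
    """Normal approximation of P(X <= k) with a continuity correction."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return NormalDist(mean, sd).cdf(k + 0.5)


n, p, k = 100, 0.5, 55  # illustrative values, not from the slides
exact = binomial_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)
print(round(exact, 4), round(approx, 4))
```

For these inputs the two values should agree closely; the agreement generally worsens for small n or for p near 0 or 1.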
Confidence Interval Binomial
Distribution
Replacing μ with E(P-hat) and the standard error σ / √n with the
standard error of the proportion in the formula for the
confidence interval yields:
$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
Example
In a survey including 1000 people, a political candidate received
52% of the votes cast. What is the 95% confidence interval
associated with this result?
Solution
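LCL and UCL:
$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = 0.52 \pm 1.96 \times \sqrt{\frac{0.52 \times 0.48}{1000}} \approx 0.52 \pm 0.031$
Thus, the LCL = 0.489 and the UCL = 0.551. Note that the LCL is
below 0.5 (i.e. from the sample, there is not strong enough evidence at
the 95% level to infer that the candidate will win the election).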
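The interval for the survey can be recomputed with a short script. This is a minimal sketch under the normal approximation; the function name proportion_confidence_interval is hypothetical, and the critical value is taken from the standard library's NormalDist.inv_cdf.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(p_hat, n, confidence=0.95):
    """Return (LCL, UCL) for a population proportion (normal approximation)."""
    alpha = 1.0 - confidence
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for 95%
    std_error = sqrt(p_hat * (1.0 - p_hat) / n)       # standard error of the proportion
    margin = z_crit * std_error
    return p_hat - margin, p_hat + margin


lcl, ucl = proportion_confidence_interval(p_hat=0.52, n=1000)
print(round(lcl, 3), round(ucl, 3))
```

With 1000 respondents the standard error is roughly 0.016, so the 95% margin (1.96 standard errors) is roughly 0.031 on either side of the sample proportion.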