Uploaded by nancy.kimngoctran2010

Quant cfa

Interest rate
Meaning : a rate of return that reflects the relationship between differently dated cash flows
formula
Timeline
Formula
PV and FV
Lump-sum : single CF
Concept
Type of cash flow
Annuity : a series of equal CFs that occur at evenly spaced intervals
Perpetuity : infinite life
Unequal CF
Time
Ordinary annuity
Annuity due = Ordinary annuity x (1 + R)
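A minimal sketch of the relationship above, assuming the standard present-value formula for an ordinary annuity. The function names and the inputs (1,000 per period, 10%, 5 periods) are hypothetical and chosen only for illustration.

```python
def pv_ordinary_annuity(pmt: float, r: float, n: int) -> float:
    """Present value of n equal payments made at the END of each period."""
    return pmt * (1 - (1 + r) ** -n) / r


def pv_annuity_due(pmt: float, r: float, n: int) -> float:
    """Payments at the START of each period: ordinary annuity x (1 + r)."""
    return pv_ordinary_annuity(pmt, r, n) * (1 + r)


# Hypothetical inputs: 1,000 per year for 5 years at 10%.
ordinary = pv_ordinary_annuity(1000, 0.10, 5)
due = pv_annuity_due(1000, 0.10, 5)
print(round(ordinary, 2), round(due, 2))
```

The annuity due is worth more because every payment arrives one period earlier.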
Payment frequency
Discrete
Annually
Semi-annual
Quarterly
Monthly
etc.
Continuous
when, how?
Formula
How to compare and choose which term?
Effective Annual Rate (EAR) : annualized and easy to compare
Formula
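As a sketch of why EAR makes different payment frequencies comparable, the snippet below applies the usual formulas EAR = (1 + r/m)^m − 1 for discrete compounding and EAR = e^r − 1 for continuous compounding. The 8% stated rate is a hypothetical input.

```python
import math


def ear(stated_annual_rate: float, periods_per_year: int) -> float:
    """Effective annual rate for discrete compounding m times per year."""
    return (1 + stated_annual_rate / periods_per_year) ** periods_per_year - 1


def ear_continuous(stated_annual_rate: float) -> float:
    """Effective annual rate for continuous compounding."""
    return math.exp(stated_annual_rate) - 1


# Hypothetical stated annual rate of 8% under different frequencies.
for label, m in [("annual", 1), ("semi-annual", 2), ("quarterly", 4), ("monthly", 12)]:
    print(label, round(ear(0.08, m), 6))
print("continuous", round(ear_continuous(0.08), 6))
```

For the same stated rate, EAR rises with compounding frequency and is capped by the continuous case.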
No arbitrage opportunities concept
Calculator
Find PV
Find FV
Find I/Y
Find PMT
Find N
Amortization table
Other applications
Time index
Projected perpetuity CF at T<>0
Aggregated CF
Rate of compound growth
Number of periods for specific growth
Funding a future obligation
Investment decision
NPV
IRR
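NPV and IRR can be sketched as follows. The bisection search is only one way to find the IRR and assumes the NPV changes sign exactly once on the search bracket. The cash flows and function names are hypothetical.

```python
def npv(rate: float, cash_flows: list[float]) -> float:
    """Net present value; cash_flows[0] occurs at t = 0."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows: list[float], low: float = -0.99, high: float = 1.0,
        tol: float = 1e-10) -> float:
    """Rate at which NPV = 0, found by bisection on [low, high]."""
    f_low = npv(low, cash_flows)
    if f_low * npv(high, cash_flows) > 0:
        raise ValueError("NPV does not change sign on the bracket")
    while high - low > tol:
        mid = (low + high) / 2
        f_mid = npv(mid, cash_flows)
        if f_low * f_mid <= 0:
            high = mid                 # root lies in the lower half
        else:
            low, f_low = mid, f_mid    # root lies in the upper half
    return (low + high) / 2


# Hypothetical project: invest 1,000 today, receive 400 for three years.
flows = [-1000, 400, 400, 400]
project_irr = irr(flows)
print(round(npv(0.09, flows), 2), round(project_irr, 4))
```

The usual decision rule is to accept when NPV > 0 at the required rate, or equivalently when IRR exceeds that rate for a conventional cash-flow pattern.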
Prepared in July 2021 by Quân Phạm. Users are permitted to modify and copy this material for study and teaching purposes.
Why do we learn?
Application?
Regression equation from sample:
Y=aX+b
Level 1 and Level 2 (Major)
Correlation X and Y?
Need Data?
How to organize and visualize
How to describe
How to estimate
Probability concept
Probability distribution
Sampling and estimation
Level 1
Hypothesis testing
How data is analyzed
Quantitative data (aka numerical data)
Discrete : countable
Continuous : fractional values
Qualitative data (aka categorical data)
Nominal : labels that cannot be placed in a logical order
Ordinal : can be logically ordered or ranked
Data types
How data is collected
Cross-sectional : multiple observational units at a given point in time
Time series : a single observational unit of a specific variable collected over time
Panel : a mix of the 2 types
How data is organized
Structured : organized in a pre-defined manner
Unstructured : does not follow any conventionally organized form
Organizing data for quantitative analysis
Dimension
1-dimensional table
2-dimensional table
Frequency
Types
Absolute frequency
Relative frequency
Cumulative absolute frequency
Cumulative relative frequency
Count by interval : use bins
Contingency table
Joint frequencies
Marginal frequencies
Visualizing
Histogram and Frequency polygon
Bar chart
Single bar
Grouped bar : shows joint frequencies
Clustered bar
Tree map
Word cloud
Scatter plot : visualizes the joint variation in two numerical variables
Dependent variable
Independent variable
Heat map : organizes and summarizes data in a tabular format and represents them using a color spectrum
Descriptive statistics
Data types
Organizing data for quantitative analysis
Visualizing
Measures of central tendency
Mean
Arithmetic mean : Population mean vs Sample mean
Weighted mean (portfolio return)
Trimmed mean
5% trimmed mean = arithmetic mean of the values remaining between the 2.5th percentile and the 97.5th percentile
Winsorized mean
95% winsorized data = use the 2.5th percentile instead of any value <= 2.5th percentile, use the 97.5th percentile instead of any value >= 97.5th percentile, and keep the remaining values
Geometric mean
Geometric mean return
Harmonic mean
Median : the midpoint of a data set when the data is arranged in ascending or descending order
Obs = 2n+1
Obs = 2n
Mode : the value that occurs most frequently in a data set
No mode : all the values in a dataset are different
Unimodal, bimodal, trimodal : has one/two/three values that occur most frequently
Modal interval : interval (possibly more than one) with the highest frequency
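The arithmetic, geometric, and harmonic means listed above can be compared with Python's standard `statistics` module. This is only an illustrative sketch; the returns and prices are hypothetical.

```python
import statistics

returns = [0.10, -0.05, 0.20]  # hypothetical annual returns

arithmetic = statistics.fmean(returns)

# Geometric mean return: compound the growth factors (1 + r),
# take the n-th root, then subtract 1.
geometric = statistics.geometric_mean([1 + r for r in returns]) - 1

# Harmonic mean, e.g. the average cost per share when the same amount
# of money is invested at each of these (hypothetical) prices.
prices = [10.0, 20.0]
harmonic = statistics.harmonic_mean(prices)

print(round(arithmetic, 4), round(geometric, 4), round(harmonic, 4))
```

For returns that are not all equal, the geometric mean return is below the arithmetic mean.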
Measure of location
Quantile
Quartile (4)
Quintile (5)
Decile (10)
Percentile (100)
Location of p-th percentile = (Obs + 1) x p%
Interpolate or extrapolate
Application: Calculate VaR
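A sketch of the location formula above with linear interpolation between neighbouring observations. The function name and data are hypothetical, and extrapolation beyond the data is deliberately not handled.

```python
def percentile(data: list[float], p: float) -> float:
    """p-th percentile using location L = (n + 1) * p / 100,
    interpolating linearly when L is not a whole number."""
    values = sorted(data)
    n = len(values)
    loc = (n + 1) * p / 100
    if loc < 1 or loc > n:
        raise ValueError("location falls outside the data; extrapolation not handled")
    lower = int(loc)      # 1-based position of the observation at or below L
    frac = loc - lower    # fractional distance to the next observation
    if lower == n:
        return values[-1]
    return values[lower - 1] + frac * (values[lower] - values[lower - 1])


data = [3, 7, 8, 12, 13, 14, 18, 21, 22, 25, 30]  # 11 hypothetical observations
print(percentile(data, 25), percentile(data, 50), percentile(data, 40))
```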
Descriptive statistics
1 variable
Measure of dispersion
Absolute
Range
MAD
Variance & Standard deviation
Formula
Population
Sample (use n-1)
Downside risk
Target semideviation
Relative
Coefficient of Variation
Derivatives
Sharpe ratio
SFRatio
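The dispersion measures above can be sketched with the standard library. The returns, the target of 0, and the risk-free rate are hypothetical. The target semideviation here uses the n − 1 divisor, which is an assumption about the sample convention.

```python
import math
import statistics

returns = [0.02, -0.01, 0.03, 0.05, -0.04]  # hypothetical periodic returns
target = 0.0                                # hypothetical target return
risk_free = 0.002                           # hypothetical risk-free rate

n = len(returns)
mean = statistics.fmean(returns)

value_range = max(returns) - min(returns)
mad = sum(abs(r - mean) for r in returns) / n   # mean absolute deviation

population_sd = statistics.pstdev(returns)      # divides by n
sample_sd = statistics.stdev(returns)           # divides by n - 1

# Target semideviation: only observations below the target contribute.
semidev = math.sqrt(sum((r - target) ** 2 for r in returns if r < target) / (n - 1))

cv = sample_sd / mean                    # risk per unit of mean return
sharpe = (mean - risk_free) / sample_sd  # excess return per unit of risk

print(round(sample_sd, 4), round(semidev, 4), round(cv, 2), round(sharpe, 2))
```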
Shape of distribution
Skewness
Symmetrical
Asymmetrical
Skewed to the left ~ negative skew ~ skewness < 0
Skewed to the right ~ positive skew ~ skewness > 0
Kurtosis
Excess kurtosis
2 variables
Covariance
Formula
Correlation
Formula
Limitations
Non-linear?
Spurious correlation?
Outliers : unreliable measure
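A small sketch of sample covariance and correlation, including the non-linearity limitation noted above: a perfect but non-linear relationship can still produce a correlation of zero. The data are hypothetical.

```python
import math


def sample_covariance(x: list[float], y: list[float]) -> float:
    """Sample covariance with the n - 1 divisor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def correlation(x: list[float], y: list[float]) -> float:
    """Covariance scaled by the two sample standard deviations."""
    sd_x = math.sqrt(sample_covariance(x, x))
    sd_y = math.sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (sd_x * sd_y)


x = [1, 2, 3, 4, 5]
y_linear = [2 * v + 1 for v in x]     # exact linear relationship
y_curve = [(v - 3) ** 2 for v in x]   # exact, but non-linear, relationship

print(correlation(x, y_linear), correlation(x, y_curve))
```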
Terms
Random variable : a quantity whose future outcomes are uncertain
Outcome : a possible value of a random variable
Event : a specified set of outcomes
Mutually exclusive events : only one event can occur at a time
Exhaustive events : the events cover all possible outcomes
Probability
Properties of probability
0 <= P(E) <= 1
Sum of all P(E) = 1 if the events are mutually exclusive & exhaustive
Types
Subjective prob. : personal judgment
Objective prob.
Empirical : past data
A priori : logical analysis
Odds for & odds against
Odds for E = P(E)/[1 − P(E)]
Odds against E = [1 − P(E)]/P(E)
Convert prob. to odds and vice versa
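The conversion between probabilities and odds can be sketched with exact fractions. The helper names are hypothetical.

```python
from fractions import Fraction


def odds_for(p: Fraction) -> Fraction:
    """Odds for E = P(E) / [1 - P(E)]."""
    return p / (1 - p)


def odds_against(p: Fraction) -> Fraction:
    """Odds against E = [1 - P(E)] / P(E)."""
    return (1 - p) / p


def prob_from_odds_for(a: int, b: int) -> Fraction:
    """Odds for E of 'a to b' imply P(E) = a / (a + b)."""
    return Fraction(a, a + b)


p = Fraction(1, 5)  # hypothetical P(E) = 0.20
print(odds_for(p), odds_against(p), prob_from_odds_for(1, 4))
```

A probability of 1/5 corresponds to odds for of 1 to 4 and odds against of 4 to 1.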
Unconditional probabilities : P(A)
Tree diagram
Conditional probabilities : P(A | B)
Joint probability : P(AB) (both A and B happening)
Case
Rules
Multiplication rule : P(AB) = P(A | B)P(B)
Addition rule : P(A or B) = P(A) + P(B) – P(AB) (A or B occurs, or both occur)
Total probability rule
Expected mean and variance with prob.
Application in Portfolio management
Port. return
Port. variance and standard deviation
Calculate covariance given a joint probability function
Bayes's formula
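Bayes' formula, P(A | B) = P(B | A)P(A) / P(B), combines the multiplication rule with the total probability rule for P(B). The sketch below uses hypothetical probabilities for two mutually exclusive and exhaustive scenarios, A and not-A.

```python
def bayes(prior: float, likelihood: float, likelihood_if_not: float) -> float:
    """Updated probability P(A | B).

    prior             = P(A)
    likelihood        = P(B | A)
    likelihood_if_not = P(B | not A)
    """
    # Total probability rule: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
    p_b = likelihood * prior + likelihood_if_not * (1 - prior)
    return likelihood * prior / p_b


# Hypothetical: P(A) = 0.4, P(B | A) = 0.7, P(B | not A) = 0.2
posterior = bayes(0.4, 0.7, 0.2)
print(round(posterior, 4))
```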
Counting
Factorial n! = n(n – 1)(n – 2)(n – 3)…1
Labeling
Combination
Permutation
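The counting rules can be checked with Python's `math` module. The `labeling` helper is a hypothetical name for the multinomial formula n! / (n1! n2! … nk!), and the numbers are illustrative.

```python
import math


def labeling(*group_sizes: int) -> int:
    """Ways to assign k labels to n items: n! / (n1! * n2! * ... * nk!)."""
    ways = math.factorial(sum(group_sizes))
    for size in group_sizes:
        ways //= math.factorial(size)
    return ways


n, r = 5, 2
print(math.factorial(n))  # orderings of all n items
print(math.comb(n, r))    # choose r of n, order does not matter
print(math.perm(n, r))    # choose r of n, order matters
print(labeling(2, 2, 1))  # label 5 items with group sizes 2, 2 and 1
```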
Terms
Probability distribution : describes the probabilities of all the possible outcomes for a random variable
Random variables
Discrete : takes on at most a countable number of possible values
Continuous
Functions
Probability function P(X=a) : specifies the probability that the random variable takes on a specific value
PDF - Probability density function f(a)
CDF - Cumulative distribution function F(a)=P(X<=a)
Uniform distribution
Discrete uniform : p(x) = 1/n
Continuous uniform
Binomial distribution
Bernoulli
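A binomial random variable counts successes in n independent Bernoulli trials. As a sketch, its probability function and CDF can be written directly from the combination formula. The parameters are hypothetical.

```python
import math


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for n independent Bernoulli trials with success probability p."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


def binomial_cdf(x: int, n: int, p: float) -> float:
    """F(x) = P(X <= x)."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))


n, p = 4, 0.5  # hypothetical: 4 trials, 50% chance of an up move each
print(binomial_pmf(2, n, p), binomial_cdf(1, n, p))
```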
Univariate distribution : distribution of a single random variable
vs.
Multivariate distribution : probabilities associated with a group of random variables; meaningful only when the behavior of each random variable in the group is in some way dependent on the behavior of the others
Normal distribution
Standard normal distribution
Standardizing a random variable
Z(-x) = 1 - Z(x)
Confidence intervals (number of standard deviations around the mean)
50% -> 2/3
68% -> 1
95% -> 2 or 1.96
99% -> 3 or 2.58
Applications
Safety-first rule
Shortfall risk = Probability that (return < min acceptable level)
SFRatio = [E(Rp) − RL]/σp, where RL is the minimum acceptable level
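A sketch of the safety-first rule, assuming normally distributed returns: pick the portfolio with the highest SFRatio, whose shortfall risk is then N(−SFRatio). The two portfolios and the 3% minimum acceptable return are hypothetical.

```python
from statistics import NormalDist


def sf_ratio(expected_return: float, threshold: float, sd: float) -> float:
    """Safety-first ratio: (E(Rp) - RL) / sigma_p."""
    return (expected_return - threshold) / sd


def shortfall_risk(expected_return: float, threshold: float, sd: float) -> float:
    """P(return < threshold) under a normal-returns assumption."""
    return NormalDist().cdf(-sf_ratio(expected_return, threshold, sd))


threshold = 0.03                                     # hypothetical minimum acceptable return
portfolios = {"A": (0.12, 0.15), "B": (0.08, 0.06)}  # name: (E(R), sigma), hypothetical

best = max(portfolios, key=lambda name: sf_ratio(portfolios[name][0], threshold,
                                                 portfolios[name][1]))
for name, (mu, sd) in portfolios.items():
    print(name, round(sf_ratio(mu, threshold, sd), 3),
          round(shortfall_risk(mu, threshold, sd), 4))
print("preferred:", best)
```

The portfolio with the higher expected return is not necessarily preferred. The rule looks at the distance to the threshold measured in standard deviations.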
Lognormal distribution
Continuously compounded
Student's t-distribution
symmetrical
less peaked, fatter tails than normal
degrees of freedom : n – 1
Chi-square distribution
1. is asymmetrical
2. this distribution does not take on negative values
3. k degrees of freedom
Exhibit 16. CFA curriculum 2022
F-distribution
1. is asymmetrical
2. this distribution does not take on negative values
3. F-distribution with m numerator and n denominator degrees of freedom
Exhibit 16. CFA curriculum 2022
Application
Monte Carlo simulation : based on the repeated generation of one or more risk factors that affect security values, in order to generate a distribution of security values
Pros : Value complex securities; Simulate the PnL; Estimate VaR; Value portfolios of assets that have nonnormal returns distributions
Cons : Complex; GIGO; provides only statistical estimates, not exact results; provides no more insight into cause-and-effect relationships
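A minimal Monte Carlo sketch: repeatedly generate one risk factor (a daily return, assumed normal here purely for simplicity), build a distribution of end-of-horizon profit and loss, and read VaR off its lower tail. All parameters and the function name are hypothetical. The result is a statistical estimate, not an exact value.

```python
import random


def simulate_var(mu: float, sigma: float, horizon_days: int, n_paths: int,
                 confidence: float = 0.95, seed: int = 42) -> float:
    """Estimate VaR (as a positive fraction of portfolio value) by simulation."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    pnl = []
    for _ in range(n_paths):
        value = 1.0
        for _ in range(horizon_days):
            value *= 1 + rng.gauss(mu, sigma)  # one simulated daily return
        pnl.append(value - 1.0)
    pnl.sort()
    cutoff = pnl[int((1 - confidence) * n_paths)]  # lower-tail quantile of PnL
    return -cutoff


# Hypothetical inputs: 0.05% mean and 1% volatility per day, 10-day horizon.
var_95 = simulate_var(mu=0.0005, sigma=0.01, horizon_days=10, n_paths=10_000)
print(round(var_95, 4))
```

Changing the seed or the number of paths changes the estimate slightly, which illustrates the "statistical estimates, not exact results" limitation.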
Sampling
Sampling methods
Probability sampling
Simple random sample : each element of the population has an equal probability of being selected to the subset
Systematic sample : if we cannot code (or even identify) all the members of a population, select every kth member until we have a sample of the desired size
Stratified random sample : population is divided into subpopulations (strata) based on criteria; samples are then drawn from each stratum in sizes proportional to the relative size of each stratum in the population
Cluster sample : assuming that each subset (cluster) is representative of the overall population with respect to the item we are sampling
Non-probability sampling
(1) Convenience sample : selected if it is accessible to a researcher or on how easy it is for a researcher to access the element
(2) Judgmental sample : based on a researcher's knowledge and professional judgment
Sampling error : difference between a sample statistic and the corresponding population parameter
Sampling distribution : a probability distribution of all possible sample statistics computed from a set of equal-size samples
Population vs sample
Sample mean distribution
Unbiasedness of the sample mean
Central limit theorem
Standard error of the sample mean : σ/√n (known population standard deviation) or s/√n (unknown, use sample standard deviation)
Estimate population mean
Point estimation ~E(X)
Confidence interval estimation
Degree of confidence vs significance level
Known population variance : use Normal dist.
Unknown population variance : use Student's t-dist when n<30; either can be used when n>=30
3 desirable properties of an estimator
Unbiasedness : expected value (the mean of its sampling distribution) = the parameter it is intended to estimate
Efficiency : small variance
Consistency : accuracy increases as sample size increases
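A sketch of a confidence interval for the population mean: point estimate ± reliability factor × standard error. It uses the normal distribution, which per the notes above is acceptable for n >= 30 even when the population variance is unknown. A small sample would need Student's t, which the standard library does not provide. The sample is hypothetical.

```python
import math
from statistics import NormalDist, fmean, stdev


def z_confidence_interval(sample: list[float], confidence: float = 0.95):
    """Two-sided interval: mean +/- z * s / sqrt(n), assuming a large sample."""
    n = len(sample)
    mean = fmean(sample)
    standard_error = stdev(sample) / math.sqrt(n)     # s / sqrt(n)
    alpha = 1 - confidence                            # significance level
    z = NormalDist().inv_cdf(1 - alpha / 2)           # about 1.96 for 95%
    return mean - z * standard_error, mean + z * standard_error


sample = [float(i % 6) for i in range(36)]  # 36 hypothetical observations
low, high = z_confidence_interval(sample)
print(round(low, 3), round(high, 3))
```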
If sample size is too big
More info cost
Possible to pick obs from other populations
Bias
Data snooping bias : statistical significance of the pattern is overestimated because the results were found through data snooping
out-of-sample test to reduce
Sample selection bias : when data availability leads to certain obs being excluded from the analysis, we call the resulting problem sample selection bias
Survivorship bias : contains only survivor observations
Implicit selection bias : a threshold enabling self-selection
Backfill bias : a fund's past performance may be backfilled into the index's database
Look-ahead bias : using sample data that was not available on the test date
Time-period bias : if the time period over which the data is gathered is either too short or too long
Resampling method
Exhibit 18 CFA curriculum 2022
Bootstrap : repeatedly draws samples from the original observed data sample (the identical element is put back into the group so that it can be drawn more than once)
Jackknife : each data set = original sample - 1 obs
sample size = n -> n repetitions
reduce bias of estimator
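The two resampling methods can be sketched as follows: the bootstrap draws with replacement from the observed sample, while the jackknife leaves out one observation at a time, giving exactly n repetitions. The sample, the number of resamples, and the function names are hypothetical.

```python
import random
from statistics import fmean, stdev


def bootstrap_se(sample: list[float], n_resamples: int = 2000, seed: int = 0) -> float:
    """Standard error of the sample mean estimated from bootstrap resamples."""
    rng = random.Random(seed)
    n = len(sample)
    # rng.choices draws WITH replacement, so an element can appear more than once.
    means = [fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)]
    return stdev(means)


def jackknife_means(sample: list[float]) -> list[float]:
    """Mean of each leave-one-out data set (n data sets for n observations)."""
    total, n = sum(sample), len(sample)
    return [(total - x) / (n - 1) for x in sample]


sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.6]  # hypothetical observations
print(round(bootstrap_se(sample), 3))
print([round(m, 3) for m in jackknife_means(sample)])
```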
Objectives : to address questions: = , > , <
Process
1. State the hypotheses
Null hypothesis H0: = ; >= ; <=
Alternative hypothesis H1: <> , < , >
2-sided & 1-sided hypotheses
2. Identify the appropriate test statistic
Formula
Distribution of test statistic : Student's t, Chi-square, F
3. Specify the significance level
Find significance level
Errors
Type I (alpha) : reject null when it's true
Type II (beta) : do not reject null when it's false
The power of a test = 1 - beta
4. State the decision rule
Determine critical value
Decision rule
5. Collect data and calculate test statistic
6. Make a decision
Statistical decision
Economic decision
P-value
How to use : ~ minimum alpha to reject null; estimated from test statistic
Compare with significance level
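The six steps can be sketched for a two-sided test of a single mean. This example assumes a large sample so that a z-statistic is reasonable. With a small sample and unknown variance the statistic would follow Student's t instead. All inputs are hypothetical.

```python
import math
from statistics import NormalDist


def z_test_single_mean(sample_mean: float, hypothesized_mean: float,
                       sd: float, n: int, alpha: float = 0.05):
    """H0: mean = hypothesized_mean vs H1: mean <> hypothesized_mean."""
    # Test statistic: (sample mean - hypothesized mean) / standard error
    z = (sample_mean - hypothesized_mean) / (sd / math.sqrt(n))
    # Decision rule: reject H0 if |z| exceeds the two-sided critical value
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    # P-value: smallest significance level at which H0 would be rejected
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, critical, p_value, abs(z) > critical


# Hypothetical data: mean return 1.5%, sd 2%, n = 100, H0: mean = 1%
z, critical, p_value, reject = z_test_single_mean(0.015, 0.01, 0.02, 100)
print(round(z, 2), round(critical, 2), round(p_value, 4), reject)
```

Comparing the p-value with alpha gives the same statistical decision as comparing the test statistic with the critical value.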
Type
Parametric test : concerned with parameters (i.e., mean and variance)
Non-parametric test : used (1) when the data we use do not meet distributional assumptions, (2) when there are outliers, (3) when the data are given in ranks or use an ordinal scale, or (4) when the hypotheses we are addressing do not concern a parameter.
Test
Parametric test
Mean
Single mean
Differences between means with independent samples : test difference of means
Differences between means with dependent samples : test mean of differences / paired comparison
Variance
Single variance : Chi-square test
Equality of 2 variances with independent samples : F-test
Correlation
Spearman rank correlation coefficient : when we believe that the population under consideration meaningfully departs from normality
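The Spearman rank correlation replaces each observation with its rank and then applies r_s = 1 − 6Σd² / [n(n² − 1)], where d is the difference in ranks. This sketch assumes there are no tied values. The data are hypothetical.

```python
def ranks(values: list[float]) -> list[int]:
    """Rank 1 = smallest value. Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x: list[float], y: list[float]) -> float:
    """Spearman rank correlation for untied data."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # monotonic but non-linear in x
print(spearman(x, y), spearman(x, list(reversed(y))))
```

Because only ranks are used, a monotonic but non-linear relationship still yields a rank correlation of 1.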
Assumptions
Linearity
Homoskedasticity : the variance of the residual term is constant for all observations
Independence : the residual term is independently distributed; that is, the residual for one observation is not correlated with that of another observation
Normality : residuals are normally distributed
Regression line
Cross-sectional and time-series regression
Regression equation
Dependent variable
Independent variable
Regression coefficient : b1
Intercept : b0
Error term : e
ANOVA
Sum of squares
SST = SSR + SSE
SSR
SSE
degree of freedom
Mean squares
MSR
MSE
SEE : Standard error of estimate (min is good)
Measure of model fit
F-test
R-squared : Coefficient of determination
Meaning? the percentage of the total variation in the dependent variable explained by the independent variable
Limitation
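The ANOVA quantities for a simple linear regression with one independent variable can be sketched from first principles. Degrees of freedom are 1 for the regression and n − 2 for the error. The data and the function name are hypothetical.

```python
import math


def simple_regression(x: list[float], y: list[float]) -> dict:
    """Ordinary least squares fit of y = b0 + b1 * x with its ANOVA quantities."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    b1 = sxy / sxx                 # slope coefficient
    b0 = mean_y - b1 * mean_x      # intercept
    fitted = [b0 + b1 * a for a in x]
    sst = sum((b - mean_y) ** 2 for b in y)               # total variation
    ssr = sum((f - mean_y) ** 2 for f in fitted)          # explained variation
    sse = sum((b - f) ** 2 for b, f in zip(y, fitted))    # unexplained variation
    msr = ssr / 1                  # regression df = 1
    mse = sse / (n - 2)            # error df = n - 2
    return {"b0": b0, "b1": b1, "sst": sst, "ssr": ssr, "sse": sse,
            "r_squared": ssr / sst, "f": msr / mse, "see": math.sqrt(mse)}


x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]  # hypothetical observations
result = simple_regression(x, y)
print({k: round(v, 4) for k, v in result.items()})
```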
Hypothesis tests
Test of correlation
Test of slope coefficient
Standard error of slope coefficient
Test of slope for dummy variable
Test of intercept
Prediction
Point prediction
Interval prediction
Standard error of the forecast
Formula
Functional form (Y-X)
Log-Lin model
Lin-Log model
Log-Log model