Interest rate
  Meaning: a rate of return that reflects the relationship between differently dated cash flows
  Formula
Concept
  Timeline
  PV and FV: formula
  Type of cash flow
    Lump sum: a single cash flow
    Annuity: a series of equal cash flows that occur at evenly spaced intervals
    Perpetuity: infinite life
    Unequal cash flows
  Time
    Ordinary annuity
    Annuity due = ordinary annuity x (1 + R)
  Payment frequency
    Continuous: when, how? Formula
    Discrete: annually, semi-annually, quarterly, monthly, etc.
    How to compare and choose which term? Effective annual rate (EAR): annualized and easy to compare; formula
  No-arbitrage opportunities concept
Calculator
  Find PV
  Find FV
  Find I/Y
  Find PMT
  Find N
  Amortization table
Other applications
  Time index
  Projected perpetuity: CF at T <> 0
  Aggregated CF
  Rate of compound growth
  Number of periods for specific growth
  Funding a future obligation
  Investment decision: NPV, IRR

Compiled in July 2021 by Quân Phạm. Users may modify and copy this material for study and teaching.

Why do we learn? Application?
  Regression equation from sample: Y = aX + b (Level 1 and Level 2, major)
  Correlation of X and Y?
  Need data?
  How to organize and visualize
  How to describe
  How to estimate
  Probability concept
  Probability distribution
  Sampling and estimation
  Level 1
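The time-value-of-money outline above marks several nodes only as "Formula". As a rough illustration, the Python sketch below fills them with the standard textbook relationships (lump-sum FV and PV, ordinary annuity and annuity due, perpetuity, EAR, NPV). The function names and the sample inputs are assumptions made for this example, not part of the original notes.

```python
import math


def future_value(pv, rate, periods):
    """FV of a lump sum compounded once per period."""
    return pv * (1 + rate) ** periods


def present_value(fv, rate, periods):
    """PV of a lump sum discounted once per period."""
    return fv / (1 + rate) ** periods


def pv_ordinary_annuity(pmt, rate, periods):
    """PV of equal payments made at the END of each period."""
    return pmt * (1 - (1 + rate) ** -periods) / rate


def pv_annuity_due(pmt, rate, periods):
    """Annuity due = ordinary annuity x (1 + R), as stated in the outline."""
    return pv_ordinary_annuity(pmt, rate, periods) * (1 + rate)


def pv_perpetuity(pmt, rate):
    """PV of a level perpetuity whose first payment is one period away."""
    return pmt / rate


def ear_discrete(stated_annual_rate, periods_per_year):
    """Effective annual rate with discrete compounding."""
    return (1 + stated_annual_rate / periods_per_year) ** periods_per_year - 1


def ear_continuous(stated_annual_rate):
    """Effective annual rate with continuous compounding."""
    return math.exp(stated_annual_rate) - 1


def npv(rate, cash_flows):
    """NPV where cash_flows[0] occurs at t = 0."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the functions.
    print(future_value(100, 0.10, 2))
    print(ear_discrete(0.12, 12), ear_continuous(0.12))
    print(npv(0.10, [-100, 60, 60]))
```

Comparing `ear_discrete(0.12, 12)` with `ear_continuous(0.12)` shows why EAR is the common yardstick for quotes with different compounding frequencies.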
  Hypothesis testing
Descriptive statistics
  Data types
    How data is analyzed
      Quantitative data (aka numerical data)
        Discrete: countable
        Continuous: fractional values
      Qualitative data (aka categorical data)
        Nominal: labels that cannot be placed in a logical order
        Ordinal: can be logically ordered or ranked
    How data is collected
      Cross-sectional: multiple observational units at a given point in time
      Time series: a single observational unit of a specific variable collected over time
      Panel: a mix of the two types
    How data is organized
      Structured: organized in a pre-defined manner
      Unstructured: does not follow any conventionally organized form
  Organizing data for quantitative analysis
    Dimension
      One-dimensional table
      Two-dimensional table
    Frequency
      Types
        Absolute frequency
        Relative frequency
        Cumulative absolute frequency
        Cumulative relative frequency
      Count by interval: use bins
    Contingency table
      Joint frequencies
      Marginal frequencies
  Visualizing
    Histogram and frequency polygon
    Bar chart
      Single bar
      Grouped (clustered) bar: shows joint frequencies
    Tree map
    Word cloud
    Scatter plot: visualizes the joint variation in two numerical variables (dependent variable, independent variable)
    Heat map: organizes and summarizes data in a tabular format and represents them using a color spectrum
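The frequency nodes above (absolute, relative, cumulative absolute, cumulative relative, and counting by interval with bins) can be sketched in a few lines of Python. The helper names `frequency_table` and `bin_counts` and the sample data are hypothetical; equal-width bins are an assumption of this sketch.

```python
from collections import Counter
from itertools import accumulate


def frequency_table(values):
    """Return rows of (value, absolute, relative, cumulative abs, cumulative rel)."""
    counts = Counter(values)
    n = len(values)
    keys = sorted(counts)
    absolute = [counts[k] for k in keys]
    relative = [c / n for c in absolute]
    cum_abs = list(accumulate(absolute))
    cum_rel = [c / n for c in cum_abs]
    return list(zip(keys, absolute, relative, cum_abs, cum_rel))


def bin_counts(values, n_bins):
    """Count observations in equal-width bins spanning min..max.

    Assumes the values are not all identical (bin width must be positive).
    The maximum value is placed in the last bin.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in values:
        index = min(int((x - lo) / width), n_bins - 1)
        counts[index] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical categorical sample (e.g. ratings).
    for row in frequency_table(["A", "B", "A", "C", "A", "B"]):
        print(row)
    # Hypothetical numerical sample counted in three bins.
    print(bin_counts([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 3))
```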
Descriptive statistics
  Data types
  Organizing data for quantitative analysis
  Visualizing
  1 variable
    Measures of central tendency
      Mean
        Arithmetic mean: population mean vs. sample mean
        Weighted mean (portfolio return)
        Trimmed mean: 5% trimmed mean = arithmetic mean of the values remaining between the 2.5th percentile and the 97.5th percentile
        Winsorized mean: 95% winsorized data = use the 2.5th percentile in place of any value <= the 2.5th percentile, use the 97.5th percentile in place of any value >= the 97.5th percentile, and keep the remaining values
        Geometric mean: geometric mean return
        Harmonic mean
      Median: the midpoint of a data set when the data are arranged in ascending or descending order
        Obs = 2n + 1
        Obs = 2n
      Mode: the value that occurs most frequently in a data set
        No mode: all the values in the data set are different
        Unimodal, bimodal, trimodal: has one, two, or three values that occur most frequently
        Modal interval: the interval (possibly more than one) with the highest frequency
    Measures of location
      Quantile: quartile (4), quintile (5), decile (10), percentile (100)
      Location of the p-th percentile = (Obs + 1) x p%
      Interpolate or extrapolate
      Application: calculate VaR
    Measures of dispersion
      Absolute
        Range
        MAD
        Variance and standard deviation: formula; population; sample (use n - 1)
        Derivatives (downside risk): target semideviation
      Relative
        Coefficient of variation
        Sharpe ratio
        SFRatio
    Shape of distribution
      Skewness
        Symmetrical
        Asymmetrical
          Skewed to the left ~ negative skew ~ skewness < 0
          Skewed to the right ~ positive skew ~ skewness > 0
      Kurtosis: excess kurtosis; formula
  2 variables
    Covariance: formula
    Correlation: formula
      Limitations: non-linear relationships? Spurious correlation? Outliers make it an unreliable measure
Random variable: a quantity whose future outcomes are uncertain.
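For the descriptive-statistics measures outlined above, a small Python sketch may help make the placeholders concrete: the percentile location rule (Obs + 1) x p% with linear interpolation, the sample variance with n - 1, the geometric mean return, and the coefficient of variation. All function names and the sample data are hypothetical, and the sketch clamps to the smallest or largest observation instead of extrapolating.

```python
import math


def percentile(values, p):
    """p-th percentile using location L = (n + 1) * p / 100 (1-based).

    Interpolates linearly between the two neighbouring observations.
    Assumption of this sketch: locations outside 1..n are clamped to the
    minimum or maximum rather than extrapolated.
    """
    data = sorted(values)
    n = len(data)
    location = (n + 1) * p / 100
    lower = math.floor(location)
    if lower < 1:
        return data[0]
    if lower >= n:
        return data[-1]
    fraction = location - lower
    return data[lower - 1] + fraction * (data[lower] - data[lower - 1])


def sample_variance(values):
    """Sample variance: squared deviations divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def geometric_mean_return(returns):
    """Geometric mean of periodic returns given as decimals (0.10 = 10%)."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1


def coefficient_of_variation(values):
    """Relative dispersion: sample standard deviation divided by the mean."""
    mean = sum(values) / len(values)
    return math.sqrt(sample_variance(values)) / mean


if __name__ == "__main__":
    data = [2, 4, 6, 8, 10, 12, 14]  # hypothetical observations
    print(percentile(data, 25), percentile(data, 60))
    print(sample_variance(data), coefficient_of_variation(data))
    print(geometric_mean_return([0.10, -0.10]))
```

The last call illustrates why the geometric mean return sits below the arithmetic mean: a gain of 10% followed by a loss of 10% leaves the investor slightly below the starting value.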
Terms
  Outcome: a possible value of a random variable
  Event: a specified set of outcomes
    Mutually exclusive events: only one event can occur at a time
    Exhaustive events: the events cover all possible outcomes
Probability
  Properties of probability
    0 <= P(E) <= 1
    Sum of all P(E) = 1 if the events are mutually exclusive and exhaustive
  Types
    Subjective probability: personal judgment
    Objective probability
      Empirical: past data
      A priori: logical analysis
  Odds for and odds against (convert probability to odds and vice versa)
    Odds for E = P(E)/[1 − P(E)]
    Odds against E = [1 − P(E)]/P(E)
  Case
    Unconditional probability: P(A)
    Conditional probability: P(A | B)
    Joint probability: P(AB) (both A and B happening)
    Tree diagram
  Rules
    Multiplication rule: P(AB) = P(A | B)P(B)
    Addition rule: P(A or B) = P(A) + P(B) – P(AB) (A or B occurs, or both occur)
    Total probability rule
  Expected mean and variance with probabilities
  Application in portfolio management
    Portfolio return
    Portfolio variance and standard deviation
    Calculate covariance given a joint probability function
  Bayes' formula
  Counting
    Factorial: n! = n(n – 1)(n – 2)(n – 3)…1
    Combination
    Permutation
    Labeling

Probability distribution
  Terms
    Random variables
      Discrete: takes on at most a countable number of possible values
      Continuous
    Probability distribution: describes the probabilities of all the possible outcomes for a random variable
  Functions
    Discrete: probability function P(X = a), which specifies the probability that the random variable takes on a specific value
    Continuous: PDF (probability density function) f(a)
    CDF (cumulative distribution function): F(a) = P(X <= a)
  Uniform distribution
    Discrete uniform: p(x) = 1/n
    Continuous uniform
  Binomial distribution; Bernoulli
  Univariate distribution vs. multivariate distribution
    Univariate distribution: the distribution of a single random variable
    Multivariate distribution: the probabilities associated with a group of random variables; meaningful only when the behavior of each random variable in the group is in some way dependent on the behavior of the others
  Normal distribution
    Standard normal distribution
    Confidence intervals (share of observations within the stated number of standard deviations of the mean)
      50% -> 2/3
      68% -> 1
      95% -> 1.96 (roughly 2)
      99% -> 2.58 (roughly 3)
    Standardizing a random variable
    Z(-x) = 1 - Z(x) (cumulative standard normal probabilities)
    Applications
      Safety-first rule
        Shortfall risk = probability that (return < minimum acceptable level)
        SFRatio = [E(Rp) − RL]/σp, where RL is the minimum acceptable level
  Lognormal distribution: continuously compounded
  Student's t-distribution: symmetrical; less peaked, with fatter tails than the normal; degrees of freedom: n – 1
  Chi-square distribution (Exhibit 16, CFA curriculum 2022)
    1. is asymmetrical
    2. does not take on negative values
    3. k degrees of freedom
  F-distribution (Exhibit 16, CFA curriculum 2022)
    1. is asymmetrical
    2. does not take on negative values
    3. m numerator and n denominator degrees of freedom
  Application
    Monte Carlo simulation: based on the repeated generation of one or more risk factors that affect security values, in order to generate a distribution of security values
      Pros: value complex securities; simulate the PnL; estimate VaR; value portfolios of assets that have non-normal return distributions
      Cons: complex; GIGO; provides only statistical estimates, not exact results; does not provide more insight into cause-and-effect relationships
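The safety-first rule and the Monte Carlo idea above can be tied together in one small Python sketch: shortfall risk is computed analytically as N(−SFRatio) under a normality assumption, and then re-estimated by repeatedly generating returns. The function names, the portfolio inputs, the trial count, and the seed are all hypothetical choices for illustration.

```python
import math
import random


def normal_cdf(z):
    """Cumulative standard normal probability, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def sf_ratio(expected_return, threshold, stdev):
    """Safety-first ratio: excess of expected return over the minimum
    acceptable level, per unit of standard deviation."""
    return (expected_return - threshold) / stdev


def shortfall_risk_normal(expected_return, threshold, stdev):
    """P(return < threshold), assuming normally distributed returns."""
    return normal_cdf(-sf_ratio(expected_return, threshold, stdev))


def shortfall_risk_simulated(expected_return, threshold, stdev,
                             trials=100_000, seed=42):
    """Monte Carlo estimate of the same probability.

    Only a statistical estimate: it converges toward the analytical value
    as the number of trials grows, but is never exact.
    """
    rng = random.Random(seed)
    below = sum(
        1 for _ in range(trials)
        if rng.gauss(expected_return, stdev) < threshold
    )
    return below / trials


if __name__ == "__main__":
    # Hypothetical portfolio: 8% expected return, 12% standard deviation,
    # 2% minimum acceptable return.
    print(sf_ratio(0.08, 0.02, 0.12))
    print(shortfall_risk_normal(0.08, 0.02, 0.12))
    print(shortfall_risk_simulated(0.08, 0.02, 0.12))
```

The gap between the two shortfall figures illustrates the "statistical estimates, not exact results" limitation listed under the cons.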
Sampling
  Sampling methods
    Probability sampling
      Simple random sample: each element of the population has an equal probability of being selected to the subset
      Systematic sample: if we cannot code (or even identify) all the members of a population, select every kth member until we have a sample of the desired size
      Stratified random sample: the population is divided into subpopulations (strata) based on criteria; samples are then drawn from each stratum in sizes proportional to the relative size of each stratum in the population
      Cluster sample: assumes that each subset (cluster) is representative of the overall population with respect to the item we are sampling
    Non-probability sampling
      Convenience sample: an element is selected if it is accessible to a researcher, or based on how easy it is for a researcher to access the element
      Judgmental sample: based on a researcher's knowledge and professional judgment
  Sampling error: the difference between a sample statistic and the corresponding population value
  Sampling distribution: a probability distribution of all possible sample statistics computed from a set of equal-size samples
Population vs. sample
Sample mean distribution
  Unbiasedness of the sample mean
  Central limit theorem
  Standard error of the sample mean: σ/√n, or s/√n when the population standard deviation is unknown
Estimate population mean
  Point estimation ~ E(X)
  Confidence interval estimation
    Degree of confidence vs. significance level
    Known population variance: use the normal distribution
    Unknown population variance: use Student's t-distribution when n < 30; use either when n >= 30
3 desirable properties of an estimator
  Unbiasedness: the expected value (the mean of its sampling distribution) equals the parameter it is intended to estimate.
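As an illustration of the standard error and confidence-interval nodes above, the Python sketch below computes s/√n and a two-sided interval for the population mean. It uses the normal reliability factor, which the outline allows when n >= 30; for a small sample with unknown variance a Student's t factor would be needed instead. The function names and the sample are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def standard_error(sample):
    """Standard error of the sample mean: s / sqrt(n), with s using n - 1."""
    return stdev(sample) / math.sqrt(len(sample))


def z_confidence_interval(sample, confidence=0.95):
    """Two-sided confidence interval for the population mean.

    Assumption of this sketch: the sample is large enough (n >= 30) for the
    normal reliability factor to be acceptable.
    """
    reliability = NormalDist().inv_cdf(0.5 + confidence / 2)
    centre = mean(sample)
    margin = reliability * standard_error(sample)
    return centre - margin, centre + margin


if __name__ == "__main__":
    sample = list(range(1, 37))  # hypothetical sample of 36 observations
    print(standard_error(sample))
    print(z_confidence_interval(sample, 0.95))
```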
  Efficiency: small variance
  Consistency: accuracy increases as the sample size increases
Resampling method
  Bootstrap: repeatedly draws samples from the original observed data sample (the identical element is put back into the group so that it can be drawn more than once). Exhibit 18, CFA curriculum 2022
  Jackknife: each data set = original sample − 1 observation; sample size = n -> n repetitions; reduces the bias of an estimator
If the sample size is too big
  More information, more cost
  Possible to pick observations from other populations
Bias
  Data snooping bias: the statistical significance of the pattern is overestimated because the results were found through data snooping; use out-of-sample tests to reduce it
  Sample selection bias: when data availability leads to certain observations being excluded from the analysis, we call the resulting problem sample selection bias
    Survivorship bias: the sample contains only surviving observations
    Implicit selection bias: a threshold enabling self-selection
    Backfill bias: a fund's past performance may be backfilled into the index's database
  Look-ahead bias: using sample data that was not available on the test date
  Time-period bias: the time period over which the data is gathered is either too short or too long

Hypothesis testing
  Objectives: to address questions: =, >, <
  Process
    1. State the hypotheses
      Null hypothesis H0: =; >=; <=
      Alternative hypothesis H1: <>, <, >
      Two-sided and one-sided hypotheses
    2. Identify the appropriate test statistic
      Distribution of the test statistic: Student's t, chi-square, F
    3. Specify the significance level
      Find the significance level
      Errors
        Type I (alpha): reject the null when it is true
        Type II (beta): do not reject the null when it is false
      The power of a test = 1 − beta
    4. State the decision rule
      Determine the critical value
      Decision rule
    5. Collect data and calculate the test statistic: formula
    6. Make a decision
      Statistical decision
      Economic decision
  P-value: ~ the minimum alpha at which the null can be rejected; estimated from the test statistic
    How to use: compare with the significance level
  Type
    Parametric test: concerned with parameters (i.e., mean and variance)
    Non-parametric test: used (1) when the data we use do not meet distributional assumptions, (2) when there are outliers, (3) when the data are given in ranks or use an ordinal scale, or (4) when the hypotheses we are addressing do not concern a parameter
  Test
    Parametric test
      Mean
        Single mean
        Differences between means with independent samples: test the difference of means
        Differences between means with dependent samples: test the mean of differences (paired comparison)
      Variance
        Single variance test: chi-square
        Equality of two variances with independent samples: F-test
      Correlation
    Spearman rank correlation coefficient: when we believe that the population under consideration meaningfully departs from normality

Regression
  Assumptions
    Linearity
    Homoskedasticity: the variance of the residual term is constant for all observations
    Independence: the residual term is independently distributed; that is, the residual for one observation is not correlated with that of another observation
    Normality: residuals are normally distributed
  Regression line
    Cross-sectional and time-series regression
    Regression equation
      Dependent variable
      Independent variable
      Regression coefficient: b1
      Intercept: b0
      Error term: e
  ANOVA
    Sum of squares: SST = SSR + SSE
    Mean squares: MSR, MSE
    Degrees of freedom
    SEE: standard error of estimate (smaller is better)
  Measure of model fit
    F-test
    R-squared (coefficient of determination)
      Meaning: the percentage of the total variation in the dependent variable explained by the independent variable
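The ANOVA and model-fit items above (SST = SSR + SSE, MSR, MSE, SEE, F, R-squared) can be checked numerically with a short Python sketch for a regression with one independent variable. The function name, the dictionary keys, and the five data points are hypothetical; the degrees of freedom used are 1 for the regression and n − 2 for the error.

```python
import math


def simple_regression_anova(x, y):
    """Least-squares fit of Y = b0 + b1 * X plus the ANOVA quantities."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # regression (slope) coefficient
    b0 = y_bar - b1 * x_bar             # intercept
    fitted = [b0 + b1 * xi for xi in x]

    sst = sum((yi - y_bar) ** 2 for yi in y)                # total
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)           # explained
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained

    msr = ssr / 1          # one independent variable
    mse = sse / (n - 2)    # n - 2 degrees of freedom for the error
    return {
        "b0": b0, "b1": b1,
        "SST": sst, "SSR": ssr, "SSE": sse,
        "MSR": msr, "MSE": mse,
        "SEE": math.sqrt(mse),
        "F": msr / mse,
        "R2": ssr / sst,
    }


if __name__ == "__main__":
    # Hypothetical observations.
    result = simple_regression_anova([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
    for name, value in result.items():
        print(name, round(value, 4))
```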
      Limitation
  Hypothesis tests
    Test of correlation
    Test of slope coefficient: standard error of the slope coefficient
    Test of slope for a dummy variable
    Test of intercept
  Prediction
    Point prediction
    Interval prediction: standard error of the forecast; formula
  Functional form Y-X
    Log-lin model
    Lin-log model
    Log-log model
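The prediction nodes above leave the standard error of the forecast as a "Formula" placeholder. A minimal Python sketch, assuming the usual single-variable expression s_f = SEE x sqrt(1 + 1/n + (X_f − X̄)² / Σ(X_i − X̄)²), is shown below. The function name and data are hypothetical, and the critical t-value is supplied by the caller (looked up for n − 2 degrees of freedom) because the standard library has no t-distribution.

```python
import math


def forecast_interval(x, y, x_new, t_critical):
    """Point prediction and prediction interval for Y at x_new.

    t_critical is an input assumed to come from a t-table with
    n - 2 degrees of freedom at the chosen significance level.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    # Standard error of estimate from the regression residuals.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    see = math.sqrt(sse / (n - 2))

    # Standard error of the forecast grows as x_new moves away from x_bar.
    s_f = see * math.sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / sxx)

    y_hat = b0 + b1 * x_new
    return y_hat, y_hat - t_critical * s_f, y_hat + t_critical * s_f


if __name__ == "__main__":
    # Hypothetical data; 3.182 is the two-sided 5% t-value for 3 degrees
    # of freedom.
    print(forecast_interval([1, 2, 3, 4, 5], [2, 4, 5, 4, 5], 6, 3.182))
```

Calling the function with an `x_new` close to the sample mean of X gives a narrower interval than one far from it, which is the practical reading of the third term under the square root.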