Time value of money
  Interest rate
  PV and FV of cash flows
  PV of unequal cash flows
  Non-annual and continuous compounding
  Solving for I/Y, CAGR, PMT, N
  Additivity principle

Probability Concepts
  Random variable, outcome, and event
  Types of probability
    Subjective
    Objective
      Empirical (observation)
      A priori (logical analysis)
  Probability and odds
    Odds for/against
  Conditional and joint probability
    Addition rule
    Multiplication rule
  Total probability rule
  Expected value and variance
    Conditional expected value and variance
    Tree diagram
  Covariance given a joint probability function
  Bayes' formula
  Principles of counting
    Combination
    Permutation

Organizing, Visualizing, and Describing Data
  Data types
    Qualitative (categorical)
      Nominal (non-ranked)
      Ordinal (ranked)
    Quantitative (numerical)
      Discrete
      Continuous
    Cross-sectional / time-series / panel data
    Structured (numbers) / unstructured (text, audio, video, images)
  Organizing data
    Frequency distributions
    Contingency table
  Visualization
  Measures of Central Tendency & Dispersion
    Sample mean (arithmetic)
    Median
    Mode
    Outliers
    Other means
      Weighted mean
      Geometric mean
      Harmonic mean
      Trimmed mean
      Winsorized mean
    Quantiles
      Quartiles / quintiles / deciles / percentiles
      Location (L) of the y-th percentile with n data entries: L = (n + 1)y/100
      Interquartile range (IQR)
      Box and whisker plot
    Range
    Mean absolute deviation (MAD)
    Sample variance and standard deviation
    Target downside deviation (target semideviation) and coefficient of variation
  Shape of distributions
    Skewness
      Normal
      Negatively (left) skewed
      Positively (right) skewed

Common Probability Distributions
  Random variables
    Discrete => probability mass function: p(x) = P(X = x)
    Continuous => probability density function (pdf): f(x); probabilities are areas under f(x), such as P(a ≤ X ≤ b), and P(X = x) = 0 for any single value x
    Cumulative distribution function (cdf): F(x) = P(X ≤ x)
  Discrete and continuous uniform
  Binomial
    Bernoulli trial/random variable
    Binomial random variable (x): number of successes in n Bernoulli trials
    Probability of x successes in n trials
  Normal
    Univariate/multivariate distribution
    Multivariate normal distribution for n variables
      n means
      n variances
      n(n - 1)/2 pairwise correlations
    Suitable for modelling quarterly/yearly returns; NOT suitable for modelling asset prices
    Standardizing a random variable
    Application of normal distribution
      Modern portfolio theory (MPT): the value of investment opportunities can be meaningfully measured in terms of mean return and variance of return.
      Shortfall risk: the risk that portfolio value or portfolio return will fall below some minimum acceptable level over some time horizon.
      Roy's safety-first ratio
      Stress testing: a specific type of scenario analysis that estimates losses in rare and extremely unfavorable combinations of events or scenarios.
      Scenario analysis: a technique for exploring the performance and risk of investment strategies in different structural regimes.
  Lognormal distribution
    Suitable for modelling asset prices
    Formulas
      Mean (μL) of a lognormal random variable = e^(μ + 0.5σ^2)
      Variance (σL^2) of a lognormal random variable = e^(2μ + σ^2) × [e^(σ^2) − 1]
  Continuously compounded rate of return
    The continuously compounded return to time T is the sum of the one-period continuously compounded returns.
    Volatility measures the standard deviation of the continuously compounded returns on the underlying asset. By convention, it is stated as an annualized measure, typically on the basis of 250 days in a year (the approximate number of days markets are open for trading).
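The lognormal mean and variance formulas and the additivity of continuously compounded returns noted above can be checked numerically. The following is a minimal Python sketch, not a reference implementation: the function names and the price series are hypothetical, and scaling daily volatility by the square root of 250 assumes independent, identically distributed daily returns.

```python
import math


def lognormal_mean(mu, sigma):
    """Mean of a lognormal variable whose logarithm has mean mu and std dev sigma."""
    return math.exp(mu + 0.5 * sigma ** 2)


def lognormal_variance(mu, sigma):
    """Variance of the same lognormal variable."""
    return math.exp(2 * mu + sigma ** 2) * (math.exp(sigma ** 2) - 1)


def cc_return(start_price, end_price):
    """Continuously compounded return between two prices."""
    return math.log(end_price / start_price)


def annualized_volatility(daily_cc_returns, trading_days=250):
    """Sample std dev of daily returns scaled by sqrt(trading_days).

    Assumption: daily returns are independent and identically distributed.
    """
    n = len(daily_cc_returns)
    mean = sum(daily_cc_returns) / n
    variance = sum((r - mean) ** 2 for r in daily_cc_returns) / (n - 1)
    return math.sqrt(variance) * math.sqrt(trading_days)


# Hypothetical daily closing prices, for illustration only.
prices = [100.0, 102.0, 99.0, 104.0]
one_period = [cc_return(a, b) for a, b in zip(prices, prices[1:])]
total = cc_return(prices[0], prices[-1])

# The one-period returns sum to the return over the whole horizon.
print(sum(one_period), total)
print(annualized_volatility(one_period))
```

Because each one-period return is a difference of log prices, the intermediate prices cancel in the sum, which is why the two printed values agree up to floating-point rounding.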
Describing data: shape and association
  Kurtosis
  Correlation

Quantitative methods

t-, chi-square, and F-distributions (used in the t-test, chi-square test, and F-test)
  The t-distribution has fatter tails than the normal distribution => more reliable and conservative downside risk estimate
  The chi-square and F-distributions are bounded below by 0

Monte Carlo simulation
  Strengths: can price complex securities for which no analytic expression is available.
  Weaknesses: provides only statistical estimates, not exact results. Analytic methods, where available, provide more insight into cause-and-effect relationships.

Process of hypothesis testing
  Appropriate test statistics
  Level of significance
  Decision rule
    1-tailed test
    2-tailed test

Sampling and Estimation
  Sampling methods
    Probability
      Simple random
      Systematic: selecting every k-th observation
      Stratified random: the population is divided into strata based on classification criteria; simple random samples are then drawn from each stratum in proportion to the relative size of each stratum in the population and pooled to form the overall sample.
      Cluster: divides a population into clusters, each representative of the population, and then randomly draws certain clusters to form a sample. Relatively less accurate but more time-efficient and cost-efficient.
    Non-probability
      Convenience: small-scale pilot studies
      Judgmental: auditing
  Sampling error: the difference between the observed value of a statistic and the quantity it is intended to estimate as a result of sampling.
  Sampling distribution of a statistic: the distribution of all the distinct possible values that the statistic can assume when computed from samples of the same size randomly drawn from the same population.
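The systematic and stratified random sampling methods described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, strata, and sizes are hypothetical, and the proportional allocation uses simple rounding, so the stratum sample sizes may not always add up exactly to the requested total.

```python
import random


def systematic_sample(population, k, seed=0):
    """Pick a random start in the first k items, then take every k-th item."""
    rng = random.Random(seed)
    start = rng.randrange(k)
    return population[start::k]


def stratified_sample(strata, sample_size, seed=0):
    """Draw a simple random sample from each stratum, sized in proportion
    to the stratum's share of the population, and return them by stratum.

    Assumption: rounding is used for allocation, so totals can be off by a
    few units when the shares are not whole numbers.
    """
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        n_draw = round(sample_size * len(members) / total)
        sample[name] = rng.sample(members, n_draw)
    return sample


# Hypothetical population of 1,000 bonds split into three strata.
strata = {
    "government": list(range(0, 600)),
    "corporate": list(range(600, 900)),
    "municipal": list(range(900, 1000)),
}
picked = stratified_sample(strata, sample_size=50)
print({name: len(items) for name, items in picked.items()})
print(systematic_sample(list(range(100)), k=10))
```

With strata of 600, 300, and 100 members and a target of 50, the allocation is 30, 15, and 5 observations, matching each stratum's weight in the population.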
Central limit theorem (CLT) & distribution of the sample mean
  CLT: given a population described by any probability distribution having mean μ and finite variance σ^2, the sampling distribution of the sample mean computed from random samples from this population will be approximately normal with mean μ (the population mean) and variance σ^2/n (the population variance divided by n) when the sample size n is large (n ≥ 30), regardless of the population's distribution.
  Standard error of the sample mean: the standard deviation of the sampling distribution of the sample mean, σ/√n; it measures how much imprecision in the estimate of the population mean comes from sampling.

Point estimates of population mean
  Unbiased: an estimator whose expected value (the mean of its sampling distribution) equals the parameter it is intended to estimate.
  Efficient: an unbiased estimator is efficient if no other unbiased estimator of the same parameter has a sampling distribution with smaller variance.
  Consistent: an estimator for which the probability of estimates getting close to the value of the population parameter increases as sample size increases.

Hypothesis Testing
  Making a decision
    Statistical significance ≠ economic significance
  p-value
    The p-value is the smallest level of significance at which the null can be rejected; the smaller the p-value, the stronger the evidence against the null.
    p-value below significance level => null is rejected
  z-statistics
  Multiple Tests and Significance Interpretation
    Multiple testing problem: the risk of getting statistically significant test results when performing a test multiple times. If you run 100 tests of true null hypotheses at a 5% level of significance, you get 5 false positives, on average.
    False discovery rate (FDR): the rate of Type I errors in testing a null hypothesis multiple times for a given level of significance.
    False discovery approach: an adjustment in the p-values for tests performed multiple times.
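The multiple testing arithmetic above (100 tests at a 5% level of significance giving 5 false positives on average) can be reproduced with a short simulation. This is a hedged sketch with hypothetical function names. It assumes every null hypothesis is true and the tests are independent, so each test rejects with probability equal to the significance level.

```python
import random


def expected_false_positives(n_tests, alpha):
    """Expected number of Type I errors when all nulls are true."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests, alpha):
    """Chance of one or more Type I errors, assuming independent tests."""
    return 1 - (1 - alpha) ** n_tests


def simulate_false_positives(n_tests, alpha, n_repeats=2000, seed=0):
    """Average count of rejections across repeated batches of tests.

    Assumption: under a true null the p-value is uniform on [0, 1), so a
    uniform draw below alpha stands in for a (false) rejection.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(n_repeats):
        total += sum(1 for _ in range(n_tests) if rng.random() < alpha)
    return total / n_repeats


print(expected_false_positives(100, 0.05))
print(prob_at_least_one_false_positive(100, 0.05))
print(simulate_false_positives(100, 0.05))
```

The simulated average should sit close to the expected count, while the probability of at least one false positive is close to 1, which is why unadjusted p-values are hard to interpret across many tests.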
t-statistics
Tests for a single mean
Tests for differences between means
  With independent samples
  With dependent samples
Tests of variances
  Single variance
  2 variances
Parametric vs non-parametric tests
  Usage of non-parametric tests
    When the data we use do NOT meet distributional assumptions.
    When there are outliers.
    When the data are given in ranks or use an ordinal scale.
    When the hypotheses we are addressing do NOT concern a parameter.
  However, if the assumptions of the parametric test are met, the parametric test is preferred because it may have a greater ability to reject a false null hypothesis.

Confidence Intervals for the Population Mean and Sample Size Selection
  Confidence interval = Point estimate ± Reliability factor × Standard error
  Sample size selection
    Trade-offs of a larger sample: risk of sampling from more than one population and additional expense.

Resampling
  Bootstrap: repeatedly draws samples, with replacement of the selected elements, from the original observed sample, treating the original sample as a new population (often used to find the standard error or construct confidence intervals of population parameters).
  Jackknife: repeatedly draws samples by taking the original observed data sample and leaving out one observation at a time (without replacement) from the set.

Sampling biases
  Data snooping: determining a model by extensive searching through a dataset for statistically significant patterns.
    Out-of-sample test
  Sample selection bias: bias introduced by systematically excluding some members of the population according to a particular attribute; for example, the bias introduced when data availability leads to certain observations being excluded from the analysis.
    Implicit selection bias: selection bias introduced through the presence of a threshold that filters out some unqualified members.
  Survivorship bias: the exclusion of poorly performing or defunct companies from an index or database, biasing the index or database toward financially healthy companies.
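The bootstrap and jackknife resampling procedures described above can be sketched as follows, here applied to the standard error of the sample mean. This is an illustrative Python sketch under stated assumptions: the function names and the return data are hypothetical, and the number of bootstrap resamples (1,000) is an arbitrary choice.

```python
import math
import random
import statistics


def bootstrap_standard_error(sample, n_resamples=1000, seed=0):
    """Std dev of the means of resamples drawn WITH replacement."""
    rng = random.Random(seed)
    n = len(sample)
    means = [
        statistics.mean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    ]
    return statistics.stdev(means)


def jackknife_standard_error(sample):
    """Standard error from leave-one-out means (no randomness involved)."""
    n = len(sample)
    total = sum(sample)
    loo_means = [(total - x) / (n - 1) for x in sample]
    center = sum(loo_means) / n
    return math.sqrt((n - 1) / n * sum((m - center) ** 2 for m in loo_means))


# Hypothetical monthly returns in percent, for illustration only.
returns = [1.2, -0.4, 0.8, 2.1, -1.5, 0.3, 1.7, -0.2, 0.9, 0.5]
analytic = statistics.stdev(returns) / math.sqrt(len(returns))

print(analytic)
print(jackknife_standard_error(returns))
print(bootstrap_standard_error(returns))
```

For the sample mean, the jackknife estimate coincides with the usual s/√n formula, while the bootstrap estimate varies with the random seed and the number of resamples. The two methods become more useful for statistics that lack a simple analytic standard error.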
Backfill bias: certain surviving hedge funds are added to databases and various hedge fund indexes only after they are initially successful and start to report their returns.
Look-ahead bias: bias caused by using information that was unavailable on the test date.
Time-period bias: a statistical conclusion may be sensitive to the starting and ending dates of the sample.

Tests of correlation
  Pearson correlation
  Spearman rank correlation coefficient
Tests of independence using contingency table

Introduction to Linear Regression
  Simple linear regression (SLR)
    Intercept, slope coefficient, and error term
    Estimating parameters
      Sum of squared errors
    Measures of variation
  Assumptions
    Linearity
    Homoskedasticity
    Independence
    Normality
  Analysis of variance (ANOVA)
    Total sum of squares (SST): measures variation of observed values of the dependent variable around their mean
    Sum of squares regression (SSR): measures variation in observed values attributable to the relationship between the dependent and independent variables
    Sum of squared errors (SSE): measures variation in observed values NOT attributable to the relationship between the dependent and independent variables
    Coefficient of determination (R-squared)
    Standard error of the estimate
  Hypothesis testing of linear regression coefficients
    Slope coefficient
    Intercept
  Prediction using SLR
  Functional forms of SLR
    Log-lin
    Lin-log
    Log-log: useful in calculating elasticities because the slope coefficient is the relative change in the dependent variable for a relative change in the independent variable.
    Selecting functional forms: examine the goodness-of-fit measures (R-squared, F-statistic, and the standard error of the estimate) and whether there are patterns in the residuals.
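The ANOVA quantities listed above (SST, SSR, SSE, R-squared, and the standard error of the estimate) can be computed directly from a fitted simple linear regression. The following is a minimal Python sketch with a hypothetical function name and made-up data; it estimates the intercept and slope by ordinary least squares, which minimizes the sum of squared errors.

```python
import math


def simple_linear_regression(x, y):
    """Fit y = intercept + slope * x by least squares and return ANOVA terms."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * xi for xi in x]

    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)          # explained variation
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained variation
    return {
        "intercept": intercept,
        "slope": slope,
        "SST": sst,
        "SSR": ssr,
        "SSE": sse,
        "R2": ssr / sst,
        "SEE": math.sqrt(sse / (n - 2)),  # n - 2 degrees of freedom in SLR
    }


# Made-up observations, for illustration only.
result = simple_linear_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
for name, value in result.items():
    print(name, round(value, 4))
```

In this example SST equals SSR plus SSE, and R-squared is the share of total variation explained by the regression.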