Lecture 01, 02

1. Statistics
Descriptive statistics: collect, organize, summarize, and present data (sample size n, population size N).
• The average number of students in a class at White Oak University is 22.6
• Last year's total attendance at Long Run High School's football games was 8,235
Inferential statistics: predict, forecast, and verify conclusions under uncertainty.
• A recent study showed that eating garlic can lower blood pressure
• It is predicted that the average number of automobiles each household owns will increase next year

2. Types of variable
Qualitative (categorical)
• Nominal: categories cannot be compared or calculated with; e.g. name, address, career. Binary: only 2 values (male/female, yes/no, ...)
• Ordinal: categories can be compared (ranked); e.g. rank, size
Quantitative (scale)
• Discrete: countable; supports +, −; e.g. score, "the number of ..."
• Continuous: uncountable; supports +, −, ×, ÷; e.g. height, temperature

3. Levels of measurement
Qualitative (categorical)
• Nominal: names/labels only
• Ordinal: nominal + rank/order
Quantitative (scale)
• Interval: ordinal + equal intervals between steps; e.g. score, temperature
• Ratio: interval + the ratio of two values is meaningful; e.g. height, time

4. Tabular methods
Qualitative (categorical): nominal / ordinal
• Frequency distribution table
• Relative frequency distribution
• Percent frequency distribution (%)
• Cross-tabulation (rows and columns), ...
Quantitative (scale)
• Frequency distribution table
• Relative / percent frequency distribution (%)
• Cumulative frequency distribution
• Cumulative relative frequency distribution
• Cross-tabulation (rows and columns), ...

5. Graphics
Qualitative (categorical)
• Pie chart
• Bar/column chart
• Clustered bar/column chart
• Stacked bar/column chart
Quantitative (scale)
• Bar/column chart; clustered and stacked bar/column charts
• Group chart
• Histogram / histogram with cumulative frequency
• Dot plot; line chart; ...
• Scatter plot (→ correlation)
• Bubble chart

6. Data
• Population: (x_1, x_2, ..., x_N); sample: (x_1, x_2, ..., x_n)
• Data may be listed (raw values), given as a frequency table (each value x_i with frequency n_i, i = 1, ..., k), or grouped (class intervals a_0-a_1, a_1-a_2, ..., a_{K-1}-a_K, each represented by its midpoint (a_{i-1} + a_i)/2 with frequency n_i)

7. Measures of location (central tendency)
Mean
• Arithmetic mean: population $\mu = \frac{\sum x_i}{N}$ (or $\frac{\sum x_i n_i}{N}$ for a frequency table); sample $\bar{x} = \frac{\sum x_i}{n}$ (or $\frac{\sum x_i n_i}{n}$)
  ▪ Same unit as X
  ▪ "Sensitive" to any change in an element's value
• Geometric mean: $GM = (x_1 \times x_2 \times \dots \times x_n)^{1/n}$ (average return/increase over n years)
Median (population Md, sample md)
• The middle value of the ordered data
• The value at position (n + 1) × 0.5
• Not affected by extreme values
• Useful for comparing data that contain outliers
Mode
• The most frequent value
• A data set may have 0, 1, or more than 1 mode
Quantile
• For ordered data (the i-th value is at quantile level i/(n + 1))
• Sort the data from smallest to largest
• The quantile at level β is q_β; q_0 = x_min; q_1 = x_max
• The position of q_β is (n + 1)β = int + dec (integer and decimal parts)
• $q_\beta = x_{int} + dec \times (x_{int+1} - x_{int})$
Quartiles
• The 3 quartiles divide the data into 4 equal parts
• Q1 (lower fourth) = q_0.25; Q2 (median) = q_0.5; Q3 (upper fourth) = q_0.75
• 5 key points: x_min = q_0; Q1 = q_0.25; Q2 = median = q_0.5; Q3 = q_0.75; x_max = q_1
• IQR = Q3 − Q1
• Outlier: outside (Q1 − 1.5 × IQR, Q3 + 1.5 × IQR)
• Extreme outlier: outside (Q1 − 3 × IQR, Q3 + 3 × IQR)
Quintiles: the 4 quintiles divide the data into 5 equal parts
Deciles: the 9 deciles divide the data into 10 equal parts
Percentiles: the 99 percentiles divide the data into 100 equal parts
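The quantile rule above (position (n + 1)β, then linear interpolation between the two neighbouring order statistics) is easy to check in code. Below is a minimal Python sketch, applied to the Example 1 population data that appears later in these notes; the function name `quantile` and the printed values are illustrative assumptions, not part of the lecture.

```python
from collections import Counter

def quantile(data, beta):
    """q_beta via position (n + 1) * beta = int + dec, then linear interpolation."""
    x = sorted(data)
    n = len(x)
    pos = (n + 1) * beta
    i = int(pos)          # integer part of the position (1-based)
    dec = pos - i         # decimal part
    if i < 1:             # below the first order statistic -> x_min
        return x[0]
    if i >= n:            # beyond the last order statistic -> x_max
        return x[-1]
    return x[i - 1] + dec * (x[i] - x[i - 1])   # x_int + dec * (x_{int+1} - x_int)

data = [2, 2, 3, 3, 3, 4, 4, 4, 4, 5]           # Example 1 population
mean = sum(data) / len(data)
median = quantile(data, 0.5)
mode = Counter(data).most_common(1)[0][0]
q1, q3 = quantile(data, 0.25), quantile(data, 0.75)
iqr = q3 - q1
fences = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)       # values outside are outliers

print(mean, median, mode, q1, q3, iqr)          # 3.4 3.5 4 2.75 4.0 1.25
print(fences)                                   # (0.875, 5.875)
```

Note that statistical software implements several different quantile conventions, so built-in functions may give slightly different Q1/Q3 values than this (n + 1)β rule.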
8. Measures of variability
Range: $R = x_{max} - x_{min}$, the width of the interval that covers 100% of the values.
Interquartile range: $IQR = Q_3 - Q_1$, the width of the interval that covers the middle 50% of the values (also called the fourth spread, $f_s$).
Variance
• Population: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} = \frac{\sum x_i^2}{N} - \mu^2$
• Sample: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1} = \frac{n}{n-1}\left(\frac{\sum x_i^2}{n} - \bar{x}^2\right)$
• An absolute measure of variability; its unit is the squared unit of X
• Var(X) > Var(Y) → X is more variable, more dispersed, and fluctuates more than Y; Y is more stable and concentrated than X
Standard deviation (S.D.)
• Population: $\sigma = \sqrt{\sigma^2}$; sample: $s = \sqrt{s^2}$
• Same unit as X; an absolute measure of variability
Coefficient of variation
• Population: $CV = \frac{\sigma}{\mu} \times 100\%$; sample: $CV = \frac{s}{\bar{x}} \times 100\%$
• Unit: %; a relative measure of variability
• CV is used to compare the variability of variables with different units

9. Measures of shape
Skewness
• Population: $Skew = \frac{\sum (x_i - \mu)^3 / N}{\sigma^3}$; sample: $Skew = \frac{\sum (x_i - \bar{x})^3 / n}{s^3}$
• Skew < 0: mean < median; Skew = 0: mean = median; Skew > 0: median < mean
Kurtosis
• Population: $Kurt = \frac{\sum (x_i - \mu)^4 / N}{\sigma^4}$; sample: $Kurt = \frac{\sum (x_i - \bar{x})^4 / n}{s^4}$

10. Standardized value
$z_i = \dfrac{x_i - \text{mean}}{\text{standard deviation}}$

Example 1 (population result listed first, then sample result)
• Data: population (2, 2, 3, 3, 3, 4, 4, 4, 4, 5); sample (2, 3, 2, 4, 5)
• Frequency tables: population values 2, 3, 4, 5 with frequencies 2, 3, 4, 1; sample values 2, 3, 4, 5 with frequencies 2, 1, 1, 1
• Sums: N = 10, Σx_i = 34, Σx_i² = 124; n = 5, Σx_i = 16, Σx_i² = 58
• Mean: μ = 34/10 = 3.4; x̄ = 16/5 = 3.2
• Median: Md = (x_5 + x_6)/2 = (3 + 4)/2 = 3.5; sorted sample 2, 2, 3, 4, 5 → md = x_3 = 3
• Mode: 4; 2
• Quartiles (population): (N + 1) × 0.25 = 11 × 0.25 = 2.75 → Q1 = x_2 + 0.75(x_3 − x_2) = 2 + 0.75(3 − 2) = 2.75; Q2 = q_0.5 = Md = 3.5; (N + 1) × 0.75 = 11 × 0.75 = 8.25 → Q3 = x_8 + 0.25(x_9 − x_8) = 4 + 0.25(4 − 4) = 4
• Quartiles (sample): (n + 1) × 0.25 = 6 × 0.25 = 1.5 → Q1 = x_1 + 0.5(x_2 − x_1) = 2 + 0.5(2 − 2) = 2; Q2 = q_0.5 = md = 3; (n + 1) × 0.75 = 6 × 0.75 = 4.5 → Q3 = x_4 + 0.5(x_5 − x_4) = 4 + 0.5(5 − 4) = 4.5
• Range: R = 5 − 2 = 3; r = 5 − 2 = 3
• Interquartile range: IQR = Q3 − Q1 = 4 − 2.75 = 1.25; IQR = Q3 − Q1 = 4.5 − 2 = 2.5
• Variance: σ² = 124/10 − 3.4² = 0.84; s² = (5/4)(58/5 − 3.2²) = 1.7
• Standard deviation: σ = √0.84 = 0.92; s = √1.7 ≈ 1.3
• Coefficient of variation: CV = 0.92/3.4 × 100% = 27.06%; CV = 1.3/3.2 × 100% = 40.6%
• Standardized values: (−1.52, −1.52, −0.43, −0.43, −0.43, 0.65, 0.65, 0.65, 0.65, 1.74); (−0.92, −0.92, −0.15, 0.62, 1.38)
• Skewness: Skew = (−0.72/10)/0.92³ = −0.092; Skew = (2.88/5)/1.3³ = 0.26
• Kurtosis: Kurt = (14.832/10)/0.92⁴ = 2.07; Kurt = (15.056/5)/1.3⁴ = 1.05

Example 2 (grouped data; compute the same measures as an exercise)
• Population: classes 2-4, 4-6, 6-8, 8-10 with midpoints 3, 5, 7, 9 and frequencies 2, 5, 7, 1; listed as (3, 3, 5, 5, 5, 5, 5, 7, 7, 7, 7, 7, 7, 7, 9)
• Sample: classes 2-4, 4-6, 6-8 with midpoints 3, 5, 7 and frequencies 1, 3, 2; listed as (3, 5, 5, 5, 7, 7)
• Measures to compute: mean, median, mode, quartiles, range, interquartile range, variance, standard deviation, CV, standardized values, skewness, kurtosis

11. Measures of relationship
Covariance
• Population: $Cov(X,Y) = \frac{\sum (x_i - \mu_X)(y_i - \mu_Y)}{N} = \overline{xy} - \mu_X \mu_Y$
• Sample: $Cov(X,Y) = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n-1} = \frac{n}{n-1}\left(\overline{xy} - \bar{x}\,\bar{y}\right)$
Correlation
• Population: $\rho_{XY} = \frac{Cov(X,Y)}{\sigma_X \sigma_Y}$; sample: $r_{XY} = \frac{Cov(X,Y)}{s_X s_Y}$
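As a cross-check of Example 1, the minimal Python sketch below recomputes the sample measures of variability and shape following sections 8-10 (sample variance with the n − 1 denominator; skewness and kurtosis moment sums divided by n). The small differences from the worked values above (40.7% vs 40.6%, 1.04 vs 1.05) come only from the notes rounding s to 1.3 before dividing.

```python
from math import sqrt

x = [2, 3, 2, 4, 5]                                  # Example 1 sample
n = len(x)
xbar = sum(x) / n                                    # 3.2
s2 = sum((v - xbar) ** 2 for v in x) / (n - 1)       # sample variance, n-1 denominator
s = sqrt(s2)                                         # standard deviation, same unit as X
cv = s / xbar * 100                                  # coefficient of variation, in %
z = [(v - xbar) / s for v in x]                      # standardized values
skew = (sum((v - xbar) ** 3 for v in x) / n) / s ** 3
kurt = (sum((v - xbar) ** 4 for v in x) / n) / s ** 4

print(round(s2, 2), round(s, 2), round(cv, 1))       # 1.7 1.3 40.7
print([round(v, 2) for v in z])                      # [-0.92, -0.15, -0.92, 0.61, 1.38]
print(round(skew, 2), round(kurt, 2))                # 0.26 1.04
```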
Lecture 03

1. Population ≡ random variable X
Population mean μ = E(X); population variance σ² = V(X).

2. Sample: random & observed
X = (X_1, X_2, ..., X_n) is a random sample ⇔ X_1, ..., X_n are independent (or nearly so) and identically distributed with X.
Observed sample: (x_1, x_2, ..., x_n).

3. Statistic
A statistic is a function of the random sample: G = G(X_1, X_2, ..., X_n). Plugging in the observed sample gives its observed value g = G_stat = G(x_1, x_2, ..., x_n).
Example: (X_1, ..., X_10) is a random sample and $G = \frac{X_1 + X_2 + X_3}{3}$. Then $E(G) = \frac{E(X_1) + E(X_2) + E(X_3)}{3} = \mu$ and $V(G) = \frac{V(X_1) + V(X_2) + V(X_3)}{3^2} = \frac{\sigma^2}{3}$.
For the observed sample (8; 7; 4; 9; 10; 2; 6; 3; 5; 6): $g = G_{stat} = \frac{8 + 7 + 4}{3} = 6.3333$.

4. Common statistics
Sample mean
• Statistic: $\bar{X} = \frac{\sum X_i}{n}$; observed value (Lecture 02): $\bar{x} = \frac{\sum x_i}{n}$ or $\bar{x} = \frac{\sum x_i n_i}{n}$
• Expectation: $E(\bar{X}) = \mu$; variance: $V(\bar{X}) = \frac{\sigma^2}{n}$
• Related distributions:
  ▪ If X ~ N(μ, σ²), then $\bar{X} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)$ and, with known σ², $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$
  ▪ Unknown σ²: $\frac{\bar{X} - \mu}{S/\sqrt{n}} \sim T(n-1)$
  ▪ n > 30: $\frac{\bar{X} - \mu}{S/\sqrt{n}} \approx N(0, 1)$
• Interval for the sample mean (given μ, σ):
  ▪ Two-tailed: $\mu - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \bar{X} < \mu + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
  ▪ Right-tailed: $\mu - z_{\alpha}\frac{\sigma}{\sqrt{n}} < \bar{X}$
  ▪ Left-tailed: $\bar{X} < \mu + z_{\alpha}\frac{\sigma}{\sqrt{n}}$
Mean square about μ (MS; requires known μ)
• Statistic: $MS = \frac{\sum (X_i - \mu)^2}{n}$; observed: $\frac{\sum (x_i - \mu)^2}{n}$
• Expectation: $E(MS) = \sigma^2$; variance: $V(MS) = \frac{2\sigma^4}{n}$
• If X ~ N(μ, σ²) with μ known: $\frac{n \cdot MS}{\sigma^2} \sim \chi^2(n)$
Sample variance
• Statistic: $S^2 = \frac{n}{n-1}\left(\overline{X^2} - \bar{X}^2\right)$; observed: $s^2 = \frac{n}{n-1}\left(\overline{x^2} - \bar{x}^2\right)$
• Expectation: $E(S^2) = \sigma^2$; variance: $V(S^2) = \frac{2\sigma^4}{n-1}$
• If X ~ N(μ, σ²) with μ, σ² unknown: $\frac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1)$
• Interval for the sample variance:
  ▪ Two-tailed: $\frac{\sigma^2}{n-1}\chi^2_{(n-1)1-\alpha/2} < S^2 < \frac{\sigma^2}{n-1}\chi^2_{(n-1)\alpha/2}$
  ▪ Right-tailed: $\frac{\sigma^2}{n-1}\chi^2_{(n-1)1-\alpha} < S^2$
  ▪ Left-tailed: $S^2 < \frac{\sigma^2}{n-1}\chi^2_{(n-1)\alpha}$
Sample proportion
• X ~ B(1, p); statistic: $\hat{p} = \bar{X}$; expectation: $E(\hat{p}) = p$; variance: $V(\hat{p}) = \frac{p(1-p)}{n}$
• Large sample (n ≥ 100): $\hat{p} \approx N\!\left(p, \frac{p(1-p)}{n}\right)$
• Interval for the sample proportion, with $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$:
  ▪ Two-tailed: $p - z_{\alpha/2}\,\sigma_{\hat{p}} < \hat{p} < p + z_{\alpha/2}\,\sigma_{\hat{p}}$
  ▪ Right-tailed: $p - z_{\alpha}\,\sigma_{\hat{p}} < \hat{p}$
  ▪ Left-tailed: $\hat{p} < p + z_{\alpha}\,\sigma_{\hat{p}}$

Lecture 04

• An estimator (e.g. $\hat{\theta}$, $S^2$) is a statistic on the random sample, i.e. a random variable.
• An estimate (e.g. $\bar{x}$, $s^2$) is the observed value of that statistic computed from the observed sample.
• Criteria for estimators:
  o $\hat{\theta}$ is an unbiased estimator of θ ⇔ $E(\hat{\theta}) = \theta$ ⇔ $Bias = |E(\hat{\theta}) - \theta| = 0$
  o If $E(\hat{\theta}_1) = E(\hat{\theta}_2) = \theta$ and $V(\hat{\theta}_1) < V(\hat{\theta}_2)$, then $\hat{\theta}_1$ is more efficient than $\hat{\theta}_2$
  o If $E(\hat{\theta}) = \theta$ and $V(\hat{\theta})$ is minimal among all unbiased estimators, then $\hat{\theta}$ is the efficient estimator: the MVUE (minimum variance unbiased estimator), also called the BUE (best unbiased estimator)
  o Consistent estimator
• Methods for finding estimators:
  o Percentile (quantile) matching: use when the parameter can be computed from percentiles/quantiles. From the population distribution, express the quantiles as functions of the parameters; estimate the population quantiles by the sample quantiles (q_2, q_1 or q_3, ...) and solve for the parameters.
  o Method of moments: use when the parameter can be computed from moments. Estimate k parameters from the first k moments: estimate E(X) by $\bar{X}$, E(X²) by $\overline{X^2}$, ...; estimate the moments, then solve for the parameters.
  o Maximum likelihood estimation (MLE):
    Likelihood function: for a random variable X with parameter θ and random sample X = (X_1, ..., X_n),
    $L(X, \theta) = \prod_{i=1}^{n} p(X_i, \theta)$ (discrete) or $L(X, \theta) = \prod_{i=1}^{n} f(X_i, \theta)$ (continuous).
    The MLE of θ is the value $\hat{\theta}$ that maximizes the likelihood (or its logarithm):
    $L(X, \theta) \to \max \iff \frac{dL(X,\theta)}{d\theta} = 0$ at $\theta = \hat{\theta}$ and $\left.\frac{d^2 L(X,\theta)}{d\theta^2}\right|_{\theta=\hat{\theta}} < 0$;
    equivalently $\ln L(X, \theta) \to \max \iff \frac{d\ln L(X,\theta)}{d\theta} = 0$ at $\theta = \hat{\theta}$ and $\left.\frac{d^2 \ln L(X,\theta)}{d\theta^2}\right|_{\theta=\hat{\theta}} < 0$
• Fisher information:
  o Bernoulli distribution: $\hat{p}$ is the MVUE of p ⇒ $I_n(p) = \frac{1}{V(\hat{p})} = \frac{n}{p(1-p)}$
  o Normal distribution: $\bar{X}$ is the MVUE of μ ⇒ $I_n(\mu) = \frac{1}{V(\bar{X})} = \frac{n}{\sigma^2}$
  o Normal distribution: $S^2$ is not the MVUE of σ² ⇒ $I_n(\sigma^2) > \frac{1}{V(S^2)} = \frac{n-1}{2\sigma^4}$
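A quick way to make the Lecture 03 table of statistics concrete is to simulate it. The sketch below is an illustrative assumption, not part of the lecture: it draws many normal samples and checks numerically that E(X̄) = μ, V(X̄) = σ²/n, and that S² with the n − 1 denominator is unbiased for σ².

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 10.0, 2.0, 25, 100_000

samples = rng.normal(mu, sigma, size=(reps, n))
xbar = samples.mean(axis=1)                  # one Xbar per simulated sample
s2 = samples.var(axis=1, ddof=1)             # sample variance, n-1 denominator

print(xbar.mean(), mu)                       # ~10.0 : E(Xbar) = mu
print(xbar.var(), sigma**2 / n)              # ~0.16 : V(Xbar) = sigma^2/n
print(s2.mean(), sigma**2)                   # ~4.0  : E(S^2) = sigma^2
```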
Lecture 05

• Confidence interval (C.I.) = interval estimate.
• Prediction interval (P.I.): an interval for a single random observation, with prediction level 1 − α.
• P(Lower limit < θ < Upper limit) = 1 − α, where 1 − α is the confidence level.

1. Mean
a. Normal distribution, known σ²
• Two-sided C.I.: $\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$, shortened to $\bar{x} \pm ME$
• Confidence width $w = UL - LL = 2 z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$; margin of error $ME = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
b. Normal distribution, unknown σ²
• Two-sided C.I.: $\bar{x} - t_{(n-1)\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{(n-1)\alpha/2}\frac{s}{\sqrt{n}}$, shortened to $\bar{x} \pm ME$, with $w = 2 t_{(n-1)\alpha/2}\frac{s}{\sqrt{n}}$ and $ME = t_{(n-1)\alpha/2}\frac{s}{\sqrt{n}}$
• Right-sided C.I.: $\bar{x} - t_{(n-1)\alpha}\frac{s}{\sqrt{n}} < \mu$
• Left-sided C.I.: $\mu < \bar{x} + t_{(n-1)\alpha}\frac{s}{\sqrt{n}}$
• P.I.: $\bar{x} \pm t_{(n-1)\alpha/2} \cdot s \cdot \sqrt{1 + \frac{1}{n}}$

2. Variance: normal distribution, unknown μ
• Two-sided C.I.: $\frac{(n-1)s^2}{\chi^2_{(n-1)\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{(n-1)1-\alpha/2}}$
• Right-sided C.I.: $\frac{(n-1)s^2}{\chi^2_{(n-1)\alpha}} < \sigma^2$
• Left-sided C.I.: $\sigma^2 < \frac{(n-1)s^2}{\chi^2_{(n-1)1-\alpha}}$

3. Proportion
a. Large sample (n ≥ 100)
• Two-sided C.I.: $\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$, shortened to $\hat{p} \pm ME$, with $w = 2 z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ and $ME = z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
• Right-sided C.I.: $\hat{p} - z_{\alpha}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} < p$
• Left-sided C.I.: $p < \hat{p} + z_{\alpha}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
b. Small sample (n < 100): two-sided C.I. …

Lecture 06

1. Test for a normal mean, known σ
Statistic: $\bar{X} \to \bar{x}$; $Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \to z_{stat}$ (if H_0 is true, Z ~ N(0, 1)). β = P(Type II error); power of the test = 1 − β.
• H_0: μ = μ_0 (or μ ≤ μ_0) vs H_1: μ > μ_0
  ▪ Reject H_0 if $\bar{x} > \mu_0 + z_{\alpha}\frac{\sigma}{\sqrt{n}}$ ⇔ $z_{stat} > z_{\alpha}$
  ▪ P-value = P(Z > z_stat)
  ▪ β at μ = μ_1 > μ_0: with $c = \mu_0 + z_{\alpha}\frac{\sigma}{\sqrt{n}}$, $\beta = P\!\left[\bar{X} \le c \mid \bar{X} \sim N\!\left(\mu_1, \tfrac{\sigma^2}{n}\right)\right] = P\!\left[Z \le z_{\alpha} + \tfrac{\mu_0 - \mu_1}{\sigma/\sqrt{n}}\right]$
• H_0: μ = μ_0 (or μ ≥ μ_0) vs H_1: μ < μ_0
  ▪ Reject H_0 if $\bar{x} < \mu_0 - z_{\alpha}\frac{\sigma}{\sqrt{n}}$ ⇔ $z_{stat} < -z_{\alpha}$
  ▪ P-value = P(Z < z_stat) = P(Z > −z_stat)
  ▪ β at μ = μ_1 < μ_0: with $c = \mu_0 - z_{\alpha}\frac{\sigma}{\sqrt{n}}$, $\beta = P\!\left[\bar{X} \ge c \mid \bar{X} \sim N\!\left(\mu_1, \tfrac{\sigma^2}{n}\right)\right] = P\!\left[Z \ge -z_{\alpha} + \tfrac{\mu_0 - \mu_1}{\sigma/\sqrt{n}}\right]$
• H_0: μ = μ_0 vs H_1: μ ≠ μ_0
  ▪ Reject H_0 if $|\bar{x} - \mu_0| > z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ ⇔ $|z_{stat}| > z_{\alpha/2}$, i.e. $\bar{x} > \mu_0 + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = c_1$ or $\bar{x} < \mu_0 - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = c_2$
  ▪ P-value = 2 P(Z > |z_stat|)
  ▪ β at μ = μ_1: $\beta = P[c_2 \le \bar{X} \le c_1 \mid \mu = \mu_1]$

2. Test for a normal mean, unknown σ
Statistic: $\bar{X} \to \bar{x}$; $T = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \to t_{stat}$ (if H_0 is true, T ~ T(n − 1)). For n > 30, T(n − 1) ≈ N(0, 1); for n < 30 use Excel or R for the p-value.
• H_0: μ = μ_0 (or μ ≤ μ_0) vs H_1: μ > μ_0: reject if $t_{stat} > t_{(n-1)\alpha}$; P-value = P[T(n − 1) > t_stat]; β at μ = μ_1 > μ_0: $\beta = P\!\left[\bar{X} < c \mid \bar{X} \sim N\!\left(\mu_1, \tfrac{\sigma^2}{n}\right)\right]$, where c is the rejection boundary
• H_0: μ = μ_0 (or μ ≥ μ_0) vs H_1: μ < μ_0: reject if $t_{stat} < -t_{(n-1)\alpha}$; P-value = P[T(n − 1) < t_stat] = P[T(n − 1) > −t_stat]; β = ?
• H_0: μ = μ_0 vs H_1: μ ≠ μ_0: reject if $|t_{stat}| > t_{(n-1)\alpha/2}$; P-value = 2 × P[T(n − 1) > |t_stat|]; β = ?

3. Test for a normal variance
Statistic: $\chi^2 = \frac{(n-1)S^2}{\sigma_0^2} \to \chi^2_{stat}$ (if H_0 is true, χ² ~ χ²(n − 1)); use Excel or R for the p-value.
• H_0: σ² = σ_0² (or σ² ≤ σ_0²) vs H_1: σ² > σ_0²: reject if $\chi^2_{stat} > \chi^2_{(n-1)\alpha}$; P-value = P[χ²(n − 1) > χ²_stat]
• H_0: σ² = σ_0² (or σ² ≥ σ_0²) vs H_1: σ² < σ_0²: reject if $\chi^2_{stat} < \chi^2_{(n-1)1-\alpha}$; P-value = P[χ²(n − 1) < χ²_stat]
• H_0: σ² = σ_0² vs H_1: σ² ≠ σ_0²: reject if $\chi^2_{stat} > \chi^2_{(n-1)\alpha/2}$ or $\chi^2_{stat} < \chi^2_{(n-1)1-\alpha/2}$; P-value = 2 × P[χ²(n − 1) > χ²_stat] if s² > σ_0², or 2 × P[χ²(n − 1) < χ²_stat] if s² < σ_0²
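The sketch below shows how the Lecture 05 interval formulas and the Lecture 06 one-sample t test look in Python with scipy. The data set and μ_0 are made-up assumptions used only to illustrate the calls.

```python
import numpy as np
from scipy import stats

x = np.array([9.8, 10.2, 10.4, 9.9, 10.1, 10.6, 9.7, 10.3])   # assumed sample
n, alpha, mu0 = len(x), 0.05, 10.0
xbar, s = x.mean(), x.std(ddof=1)

# Mean, sigma unknown: xbar +/- t_(n-1, alpha/2) * s / sqrt(n)
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
me = t_crit * s / np.sqrt(n)
print("CI for mu:", (xbar - me, xbar + me))

# Variance: (n-1)s^2 / chi2_(alpha/2) < sigma^2 < (n-1)s^2 / chi2_(1-alpha/2)
chi2_hi = stats.chi2.ppf(1 - alpha / 2, df=n - 1)   # upper alpha/2 critical value
chi2_lo = stats.chi2.ppf(alpha / 2, df=n - 1)       # lower alpha/2 critical value
print("CI for sigma^2:", ((n - 1) * s**2 / chi2_hi, (n - 1) * s**2 / chi2_lo))

# Test H0: mu = mu0 vs H1: mu != mu0 (sigma unknown)
t_stat, p_value = stats.ttest_1samp(x, mu0)
print("t_stat:", t_stat, "p-value:", p_value, "reject H0:", p_value < alpha)
```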
4. Test for a population proportion, large sample (n ≥ 100) (slide 157)
Statistic: $\hat{p} \to \hat{p}_{stat}$; $Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)}/\sqrt{n}} \to z_{stat}$ (if H_0 is true, Z ~ N(0, 1)). β = P(Type II error); power of the test = 1 − β.
• H_0: p = p_0 (or p ≤ p_0) vs H_1: p > p_0
  ▪ Reject H_0 if $\hat{p} > p_0 + z_{\alpha}\frac{\sqrt{p_0(1-p_0)}}{\sqrt{n}}$ ⇔ $z_{stat} > z_{\alpha}$
  ▪ P-value = P(Z > z_stat)
  ▪ β at p = p_1 > p_0: with $c = p_0 + z_{\alpha}\frac{\sqrt{p_0(1-p_0)}}{\sqrt{n}}$, $\beta = P\!\left[\hat{p} \le c \mid \hat{p} \sim N\!\left(p_1, \tfrac{p_1(1-p_1)}{n}\right)\right]$
• H_0: p = p_0 (or p ≥ p_0) vs H_1: p < p_0
  ▪ Reject H_0 if $\hat{p} < p_0 - z_{\alpha}\frac{\sqrt{p_0(1-p_0)}}{\sqrt{n}}$ ⇔ $z_{stat} < -z_{\alpha}$
  ▪ P-value = P(Z < z_stat) = P(Z > −z_stat)
  ▪ β at p = p_1 < p_0: with $c = p_0 - z_{\alpha}\frac{\sqrt{p_0(1-p_0)}}{\sqrt{n}}$, $\beta = P\!\left[\hat{p} \ge c \mid \hat{p} \sim N\!\left(p_1, \tfrac{p_1(1-p_1)}{n}\right)\right]$
• H_0: p = p_0 vs H_1: p ≠ p_0
  ▪ Reject H_0 if $|\hat{p} - p_0| > z_{\alpha/2}\frac{\sqrt{p_0(1-p_0)}}{\sqrt{n}}$ ⇔ $|z_{stat}| > z_{\alpha/2}$
  ▪ P-value = 2 P(Z > |z_stat|); β = ?

5. Test for a population proportion, small sample
Statistic: the observed number of successes $x_{obs}$, compared against the exact binomial distribution X ~ B(n, p = p_0), where $P(X = x) = C_n^x\, p_0^x (1-p_0)^{n-x}$ for x = 0, 1, ..., n.
• H_0: p = p_0 (or p ≤ p_0) vs H_1: p > p_0: reject H_0 if $x_{obs} \ge x_{crit}$, where $P(X \ge x_{crit} \mid p = p_0) < \alpha$; P-value = $P(X \ge x_{obs})$
• H_0: p = p_0 (or p ≥ p_0) vs H_1: p < p_0: reject H_0 if $x_{obs} \le x_{crit}$, where $P(X \le x_{crit} \mid p = p_0) < \alpha$; P-value = $P(X \le x_{obs})$
• H_0: p = p_0 vs H_1: p ≠ p_0: reject H_0 if $x_{obs} \ge x_{crit1}$ or $x_{obs} \le x_{crit2}$, where $P(X \ge x_{crit1} \mid p = p_0) < \alpha/2$ and $P(X \le x_{crit2} \mid p = p_0) < \alpha/2$

Lecture 07

Inference for 2 means: $X_1 \sim N(\mu_1, \sigma_1^2)$, $X_2 \sim N(\mu_2, \sigma_2^2)$; H_0: μ_1 = μ_2. First ask: are the samples paired?

Paired samples: work with the differences D = X_1 − X_2; from the n pairs compute $\bar{d}$ and $s_d$.
Statistic: $T = \frac{\bar{D} - 0}{S_d/\sqrt{n}} \to t_{stat}$ (if H_0 is true, T ~ T(n − 1))
• H_0: μ_d = 0 vs H_1: μ_d > 0: reject if $t_{stat} > t_{(n-1)\alpha}$; P-value = P[T(n − 1) > t_stat]
• H_0: μ_d = 0 vs H_1: μ_d < 0: reject if $t_{stat} < -t_{(n-1)\alpha}$; P-value = P[T(n − 1) < t_stat]
• H_0: μ_d = 0 vs H_1: μ_d ≠ 0: reject if $|t_{stat}| > t_{(n-1)\alpha/2}$; P-value = 2 × P[T(n − 1) > |t_stat|]
If H_0 is rejected, the C.I. for μ_1 − μ_2 is $\bar{d} \pm t_{(n-1)\alpha/2}\frac{s_d}{\sqrt{n}}$.

Independent samples: are σ_1² and σ_2² known?

Known σ_1², σ_2²:
Statistic: $Z = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \sim N(0, 1)$
• H_0: μ_1 = μ_2 vs H_1: μ_1 > μ_2: reject if $z_{stat} > z_{\alpha}$; P-value = P(Z > z_stat)
• H_1: μ_1 < μ_2: reject if $z_{stat} < -z_{\alpha}$; P-value = P(Z < z_stat)
• H_1: μ_1 ≠ μ_2: reject if $|z_{stat}| > z_{\alpha/2}$; P-value = 2 × P(Z > |z_stat|)
If H_0 is rejected, build the C.I. for μ_1 − μ_2: …

Unknown σ_1², σ_2²: first test whether σ_1² = σ_2².

Inference for 2 variances
Statistic: $F_{stat} = \dfrac{s_1^2}{s_2^2}$
• H_0: σ_1² = σ_2² vs H_1: σ_1² ≠ σ_2²: reject if $F_{stat} > f_{(n_1-1,\,n_2-1)\alpha/2}$ or $F_{stat} < f_{(n_1-1,\,n_2-1)1-\alpha/2}$
• H_1: σ_1² > σ_2²: reject if $F_{stat} > f_{(n_1-1,\,n_2-1)\alpha}$
• H_1: σ_1² < σ_2²: reject if $F_{stat} < f_{(n_1-1,\,n_2-1)1-\alpha}$
If H_0 is rejected, build the C.I. for σ_1²/σ_2²: …

Equal variances (σ_1² = σ_2²): pooled t test
Statistic: $T = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}} \sim T(n_1 + n_2 - 2)$, with pooled variance $s_p^2 = \dfrac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1 + n_2 - 2}$
• H_0: μ_1 = μ_2 vs H_1: μ_1 > μ_2: reject if $t_{stat} > t_{(n_1+n_2-2)\alpha}$; P-value = P(T > t_stat)
• H_1: μ_1 < μ_2: reject if $t_{stat} < -t_{(n_1+n_2-2)\alpha}$; P-value = P(T < t_stat)
• H_1: μ_1 ≠ μ_2: reject if $|t_{stat}| > t_{(n_1+n_2-2)\alpha/2}$; P-value = 2 × P(T > |t_stat|)

Unequal variances (σ_1² ≠ σ_2²): Welch t test
Statistic: $T = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \sim T(df)$, with $df = \dfrac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$
• H_0: μ_1 = μ_2 vs H_1: μ_1 > μ_2: reject if $t_{stat} > t_{(df)\alpha}$; P-value = P(T > t_stat)
• H_1: μ_1 < μ_2: reject if $t_{stat} < -t_{(df)\alpha}$; P-value = P(T < t_stat)
• H_1: μ_1 ≠ μ_2: reject if $|t_{stat}| > t_{(df)\alpha/2}$; P-value = 2 × P(T > |t_stat|)
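The two-sample branch of the Lecture 07 decision tree maps directly onto scipy: an F ratio for the variances, then `ttest_ind` with `equal_var=True` (pooled) or `equal_var=False` (Welch), and `ttest_rel` for the paired case. All numbers below are made-up assumptions for illustration.

```python
import numpy as np
from scipy import stats

x1 = np.array([12.1, 11.8, 12.5, 12.0, 11.6, 12.3, 12.2])
x2 = np.array([11.2, 11.9, 11.5, 11.0, 11.7, 11.4])

# Variance ratio test: F = s1^2 / s2^2 ~ F(n1-1, n2-1) under H0
f_stat = x1.var(ddof=1) / x2.var(ddof=1)
p_f = 2 * min(stats.f.cdf(f_stat, len(x1) - 1, len(x2) - 1),
              stats.f.sf(f_stat, len(x1) - 1, len(x2) - 1))   # two-sided p-value
print("F:", f_stat, "p:", p_f)

# Equal variances not rejected -> pooled t; otherwise Welch (equal_var=False)
t_pooled, p_pooled = stats.ttest_ind(x1, x2, equal_var=True)
t_welch, p_welch = stats.ttest_ind(x1, x2, equal_var=False)
print("pooled:", t_pooled, p_pooled, "| Welch:", t_welch, p_welch)

# Paired design (same units measured twice) -> paired t on the differences
before = np.array([80, 75, 90, 68, 72])
after = np.array([78, 74, 85, 70, 69])
t_d, p_d = stats.ttest_rel(before, after)
print("paired:", t_d, p_d)
```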
Inference for 2 proportions
Statistic: $Z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\bar{p}(1-\bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \sim N(0, 1)$, with pooled proportion $\bar{p} = \dfrac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$
• H_0: p_1 = p_2 vs H_1: p_1 > p_2: reject if $z_{stat} > z_{\alpha}$; P-value = P(Z > z_stat)
• H_1: p_1 < p_2: reject if $z_{stat} < -z_{\alpha}$; P-value = P(Z < z_stat)
• H_1: p_1 ≠ p_2: reject if $|z_{stat}| > z_{\alpha/2}$; P-value = 2 × P(Z > |z_stat|)
C.I.: $p_1 - p_2 \in (\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$

Correlation test (formula and table)

Lecture 08: ANOVA (test for equality of means)
1. One-factor ANOVA
2. Two-factor ANOVA without interaction
3. Two-factor ANOVA with interaction

Lecture 09
1. Chi-squared test
2. Independence test (easy and common)
It can be shown that the test statistic simplifies to $\chi^2_{stat} = n\left(\sum_i \sum_j \frac{F_{ij}^2}{R_i C_j} - 1\right)$, where $F_{ij}$ is the observed count in cell (i, j) and $R_i$, $C_j$ are the row and column totals.
3. Rank test (critical value table)
4. Normality test: Jarque-Bera test (easy and common)
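For Lectures 08 and 09, scipy provides the one-factor ANOVA and the chi-squared independence test directly. The sketch below also checks the shortcut formula $\chi^2_{stat} = n\left(\sum\sum F_{ij}^2/(R_i C_j) - 1\right)$ on a small contingency table; the group data and counts are made-up assumptions for illustration.

```python
import numpy as np
from scipy import stats

# One-factor ANOVA: H0: mu_A = mu_B = mu_C
group_a = [23, 25, 21, 24, 22]
group_b = [27, 29, 26, 30, 28]
group_c = [24, 23, 25, 26, 22]
f_stat, p_anova = stats.f_oneway(group_a, group_b, group_c)
print("ANOVA F:", f_stat, "p:", p_anova)

# Chi-squared independence test on an r x c contingency table of counts
table = np.array([[30, 20, 10],
                  [20, 25, 15]])
chi2, p_chi2, dof, expected = stats.chi2_contingency(table)
print("chi2:", chi2, "df:", dof, "p:", p_chi2)

# Shortcut formula from the notes: chi2 = n * (sum F_ij^2 / (R_i * C_j) - 1)
n = table.sum()
r = table.sum(axis=1)            # row totals R_i
c = table.sum(axis=0)            # column totals C_j
chi2_shortcut = n * ((table**2 / np.outer(r, c)).sum() - 1)
print("shortcut:", chi2_shortcut)   # matches chi2 (no continuity correction here)
```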