
Actuarial Mathematical Statistics – Lectures 1–9

Lecture 01, 02
1. Statistics
Descriptive statistics: organize, summarize, and present collected data (sample size n, population size N).
• The average number of students in a class at White Oak University is 22.6.
• Last year's total attendance at Long Run High School's football games was 8235.
2. Types of Variable
Qualitative (Categorical)
• Nominal: incomparable categories; cannot be used in calculations; e.g. name, address, career, …
• Ordinal: comparable (can be ranked); e.g. rank, size, …
• Binary: 2 values; e.g. male/female, yes/no, …
3. Level of Measurement
Qualitative (Categorical)
• Nominal: names/labels only
• Ordinal: nominal + rank/order
Inferential statistics: predict, forecast, and verify under uncertainty.
• A recent study showed that eating garlic can lower blood pressure.
• It is predicted that the average number of automobiles each household owns will increase next year.
Quantitative (Scale)
• Discrete: countable; can calculate (+, …); e.g. score, the number of …
• Continuous: uncountable; can calculate (+, −, ×, ÷, …); e.g. height, temperature, …
Quantitative (Scale)
• Interval: ordinal + the interval between 2 consecutive steps is equal; e.g. score, temperature, …
• Ratio: interval + the ratio of 2 values is meaningful; e.g. height, time, …
4. Tabular Methods
Qualitative (Categorical): nominal / ordinal
• Frequency distribution table
• Relative frequency distribution
• Percent frequency distribution (%)
• Cross-tabulation (row and column) …
Quantitative (Scale)
• Frequency distribution table
• Relative / percent frequency distribution (%)
• Cumulative frequency distribution
• Cumulative relative frequency distribution
• Cross-tabulation (row and column) …
5. Graphics
Qualitative (Categorical)
• Pie chart
• Bar/column chart
• Clustered bar/column chart
• Stacked bar/column chart
Quantitative (Scale)
• Bar/column chart
• Clustered bar/column chart
• Stacked bar/column chart
• Group chart
• Histogram / histogram with cumulative frequency
• Dot Plot; Line; …
• Scatter Plot (→ Correlation)
• Bubble chart
6. Data
Population: (x_1, x_2, …, x_N); Sample: (x_1, x_2, …, x_n)
Data forms:
• Listed data: the raw values.
• Frequency table:
  Value: x_1, x_2, …, x_K
  Freq.: f_1, f_2, …, f_K
• Grouped data: classes a_0–a_1, a_1–a_2, …, a_{K−1}–a_K with frequencies f_1, f_2, …, f_K; each class is represented by its midpoint x_i = (a_{i−1} + a_i)/2.

7. Measures of Location
Central tendency
• Arithmetic mean: Mean = (Sum of values) / Size
  ▪ Population: μ = Σ_1^N x_i / N; Sample: x̄ = Σ_1^n x_i / n
  ▪ Same unit as X; "sensitive" to any change in an element's value
• Geometric mean: (x_1 × x_2 × … × x_Size)^(1/Size) (average return/increase … over … years)
• Median:
  ▪ the middle value of ordinal data: the value at position (n + 1) × 0.5
  ▪ Population: m_e; Sample: x̃
  ▪ not affected by extreme values → useful to compare data with outliers
• Mode:
  ▪ the most frequent value; a data set may have 0, 1, or more than 1 mode
Quantile
▪ Sort the data smallest → largest
▪ The quantile of level β is q_β; q_0 = x_min; q_1 = x_max
▪ Position of q_β is (n + 1)β = {int, dec} (integer part and decimal part)
▪ q_β = x_int + dec × (x_{int+1} − x_int)
▪ (For ordinal data, the k-th value is the quantile of level k/(n + 1))
Quartiles
▪ 3 quartiles divide the data into 4 equal parts:
  Q_1 (lower fourth) = q_0.25; Q_2 (median) = q_0.5; Q_3 (upper fourth) = q_0.75
▪ 5 key points: x_min = q_0; Q_1 = q_0.25; Q_2 = median = q_0.5; Q_3 = q_0.75; x_max = q_1
▪ IQR = Q_3 − Q_1
▪ Outlier: ∉ (Q_1 − 1.5 × IQR, Q_3 + 1.5 × IQR)
▪ Extreme outlier: ∉ (Q_1 − 3 × IQR, Q_3 + 3 × IQR)
Quintiles: 4 quintiles divide the data into 5 equal parts
Deciles: 9 deciles divide the data into 10 equal parts
Percentiles: 99 percentiles divide the data into 100 equal parts
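The quantile rule above — position (n + 1)β split into integer and decimal parts, then linear interpolation — can be sketched in Python (a minimal sketch; the function name `quantile` and the boundary handling for q_0 and q_1 are ours):

```python
def quantile(data, beta):
    """q_beta via the (n+1)*beta position rule with linear interpolation."""
    xs = sorted(data)            # sort smallest -> largest
    n = len(xs)
    pos = (n + 1) * beta         # position = {int, dec}
    i = int(pos)                 # integer part
    dec = pos - i                # decimal part
    if i < 1:
        return xs[0]             # q_0 = x_min
    if i >= n:
        return xs[-1]            # q_1 = x_max
    # q_beta = x_int + dec * (x_{int+1} - x_int); notes use 1-based indexing
    return xs[i - 1] + dec * (xs[i] - xs[i - 1])
```

On the population of Example 1, `quantile([2,2,3,3,3,4,4,4,4,5], 0.25)` gives 2.75 and `quantile(..., 0.75)` gives 4, matching the worked values below.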
8. Measures of Variability
Range: R = W = x_max − x_min: width of the interval covering 100% of the values.
Interquartile range: IQR = Q_3 − Q_1: width of the interval covering the middle 50% of the values. Also called the fourth spread f_s.
Variance:
• Population: σ² = Σ_1^N (x_i − μ)² / N = S_XX / N = Σ_1^N x_i² / N − μ²
• Sample: s² = Σ_1^n (x_i − x̄)² / (n − 1) = S_xx / (n − 1) = n/(n − 1) × (Σ_1^n x_i² / n − x̄²) = n·ms / (n − 1), where ms = Σ_1^n x_i² / n − x̄²
▪ Absolute variability; the unit of variance is the squared unit of X
▪ Var(X) > Var(Y) → X is more variable (dispersed, fluctuating) than Y // Y is more stable, concentrated than X
Standard deviation (S.D.):
• Population: σ = √σ²; Sample: s = √s²
▪ Same unit as X; absolute variability
Coefficient of variation:
• Population: CV = σ/μ × 100(%); Sample: CV = s/x̄ × 100(%)
▪ Unit: %
▪ CV is used to compare the variability of variables with different units
▪ Relative variability
9. Measures of Shape
Skewness:
• Population: Skew = (Σ_1^N (x_i − μ)³ / N) / σ³
• Sample: Skew = (Σ_1^n (x_i − x̄)³ / n) / s³
• Skew < 0: Mean < Median; Skew = 0: Mean = Median; Skew > 0: Median < Mean
Kurtosis:
• Population: Kurt = (Σ_1^N (x_i − μ)⁴ / N) / σ⁴
• Sample: Kurt = (Σ_1^n (x_i − x̄)⁴ / n) / s⁴

10. Standardized value: z_i = (x_i − mean) / (standard deviation)
Example 1
Population: (2, 2, 3, 3, 3, 4, 4, 4, 4, 5)
  Frequency table — Value: 2, 3, 4, 5; Freq.: 2, 3, 4, 1
  N = 10; Σ x_i = 34; Σ x_i² = 124
Sample: (2, 3, 2, 4, 5); sorted: 2, 2, 3, 4, 5
  Frequency table — Value: 2, 3, 4, 5; Freq.: 2, 1, 1, 1
  n = 5; Σ x_i = 16; Σ x_i² = 58

Population results:
• Mean: μ = Σ_1^10 x_i / N = 34/10 = 3.4
• Median: m_e = (x_5 + x_6)/2 = (3 + 4)/2 = 3.5
• Mode = 4
• Quartiles:
  (N + 1) × 0.25 = 11 × 0.25 = 2.75 → Q_1 = q_0.25 = x_2 + 0.75(x_3 − x_2) = 2 + 0.75(3 − 2) = 2.75
  Q_2 = q_0.5 = m_e = 3.5
  (N + 1) × 0.75 = 11 × 0.75 = 8.25 → Q_3 = q_0.75 = x_8 + 0.25(x_9 − x_8) = 4 + 0.25(4 − 4) = 4
• Range: R = 5 − 2 = 3
• Interquartile range: IQR = f_s = Q_3 − Q_1 = 4 − 2.75 = 1.25
• Variance: σ² = 124/10 − 3.4² = 0.84
• S.D.: σ = √0.84 = 0.92
• CV = 0.92/3.4 × 100(%) = 27.06%
• Standardized values: (−1.52; −1.52; −0.43; −0.43; −0.43; 0.65; 0.65; 0.65; 0.65; 1.74)
• Skewness: Skew = (−0.72/10) / 0.92³ = −0.092
• Kurtosis: Kurt = (14.832/10) / 0.92⁴ = 2.07

Sample results:
• Mean: x̄ = Σ_1^5 x_i / n = 16/5 = 3.2
• Median: x̃ = x_3 = 3
• Mode = 2
• Quartiles:
  (n + 1) × 0.25 = 6 × 0.25 = 1.5 → Q_1 = q_0.25 = x_1 + 0.5(x_2 − x_1) = 2 + 0.5(2 − 2) = 2
  Q_2 = q_0.5 = x̃ = 3
  (n + 1) × 0.75 = 6 × 0.75 = 4.5 → Q_3 = q_0.75 = x_4 + 0.5(x_5 − x_4) = 4 + 0.5(5 − 4) = 4.5
• Range: R = 5 − 2 = 3
• Interquartile range: IQR = f_s = Q_3 − Q_1 = 4.5 − 2 = 2.5
• Variance: s² = (5/4)(58/5 − 3.2²) = 1.7
• S.D.: s = √1.7 = 1.3
• CV = 1.3/3.2 × 100(%) = 40.6%
• Standardized values: (−0.92; −0.92; −0.15; 0.62; 1.38)
• Skewness: Skew = (2.88/5) / 1.3³ = 0.26
• Kurtosis: Kurt = (15.056/5) / 1.3⁴ = 1.05
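The headline numbers of Example 1 can be checked in a few lines of Python (variable names are ours; the small CV differences against the table come from it rounding σ to 0.92 and s to 1.3 before dividing):

```python
import math

pop = [2, 2, 3, 3, 3, 4, 4, 4, 4, 5]   # the population of Example 1
sam = [2, 3, 2, 4, 5]                  # the sample of Example 1

N, n = len(pop), len(sam)
mu = sum(pop) / N                       # population mean
xbar = sum(sam) / n                     # sample mean
var_pop = sum(x * x for x in pop) / N - mu ** 2                    # sigma^2
var_sam = n / (n - 1) * (sum(x * x for x in sam) / n - xbar ** 2)  # s^2
cv_pop = math.sqrt(var_pop) / mu * 100   # coefficient of variation, %
cv_sam = math.sqrt(var_sam) / xbar * 100
```

This reproduces μ = 3.4, x̄ = 3.2, σ² = 0.84, s² = 1.7; the unrounded CVs are ≈ 26.96% and ≈ 40.74%.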
Example 2 (grouped data)
Population:
  Group: 2–4, 4–6, 6–8, 8–10; Freq.: 2, 5, 7, 1; midpoints x_i: 3, 5, 7, 9
  Equivalent listed data: (3, 3, 5, 5, 5, 5, 5, 7, 7, 7, 7, 7, 7, 7, 9)
Sample:
  Group: 2–4, 4–6, 6–8; Freq.: 1, 3, 2; midpoints x_i: 3, 5, 7
  Equivalent listed data: (3, 5, 5, 5, 7, 7)
Compute (as in Example 1): mean, median, mode, quartiles, range, interquartile range, variance, standard deviation (S.D.), CV, standardized values, skewness, kurtosis.
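As a start on Example 2, the grouped-data mean uses the class midpoints (a_{i−1} + a_i)/2 weighted by frequency (a minimal sketch; the tuple layout is ours):

```python
# (lower, upper, freq) triples for the population and sample of Example 2
groups_pop = [(2, 4, 2), (4, 6, 5), (6, 8, 7), (8, 10, 1)]
groups_sam = [(2, 4, 1), (4, 6, 3), (6, 8, 2)]

def grouped_mean(groups):
    """Mean of grouped data: sum of midpoint * freq over total frequency."""
    total = sum(f for _, _, f in groups)
    return sum((lo + hi) / 2 * f for lo, hi, f in groups) / total

mean_pop = grouped_mean(groups_pop)   # (3*2 + 5*5 + 7*7 + 9*1) / 15 = 89/15
mean_sam = grouped_mean(groups_sam)   # (3*1 + 5*3 + 7*2) / 6 = 32/6
```

This gives a population mean of 89/15 ≈ 5.93 and a sample mean of 32/6 ≈ 5.33.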
11. Measures of Relationship
Covariance:
• Population: Cov(X, Y) = Σ(x_i − μ_X)(y_i − μ_Y) / N = Σ x_i y_i / N − μ_X μ_Y
• Sample: Cov(X, Y) = Σ(x_i − x̄)(y_i − ȳ) / (n − 1) = n/(n − 1) × (Σ x_i y_i / n − x̄·ȳ)
Correlation:
• Population: ρ_XY = Cov(X, Y) / (σ_X σ_Y)
• Sample: r_XY = Cov(X, Y) / (s_X s_Y)
Lecture 03
1. Population ≡ random variable X
Population mean μ = E(X); population variance σ² = V(X)
2. Sample: random & observed
X = (X_1, X_2, …, X_n) is a random sample ⇔ X_1, X_2, …, X_n are independent and identically distributed with X (the X_i are i.i.d.)
Observed sample: (x_1, x_2, …, x_n)
3. A statistic is a function of the random sample: G = G(X_1, X_2, …, X_n)
Observed sample → g = G_stat = G(x_1, x_2, …, x_n): the observed value
Ex: (X_1, X_2, …, X_10) is a random sample; G = (X_1 + X_2 + X_3)/3
→ E(G) = [E(X_1) + E(X_2) + E(X_3)]/3 = μ; V(G) = [V(X_1) + V(X_2) + V(X_3)]/3² = σ²/3
Observed (8; 7; 4; 9; 10; 2; 6; 3; 5; 6) → g = G_stat = (8 + 7 + 4)/3 = 6.3333
4. Common statistics and their sampling distributions

Sample mean
• Statistic: X̄ = Σ_1^n X_i / n; observed (Lec 02): x̄ = Σ_1^n x_i / n (or x̄ = Σ_1^n x_i n_i / n for frequency data)
• Expectation: E(X̄) = μ; Variance: V(X̄) = σ²/n
• Related distributions:
  – Known σ², or n > 30: (X̄ − μ)/(σ/√n) ∼ N(0, 1); if X ∼ N(μ, σ²) then X̄ ∼ N(μ, σ²/n)
  – X ∼ N(μ, σ²), unknown σ²: (X̄ − μ)/(S/√n) ∼ T(n − 1)
  – X ∼ N(μ, σ²), known μ: with S_μ² = Σ_1^n (X_i − μ)²/n (observed Σ(x_i − μ)²/n), E(S_μ²) = σ², V(S_μ²) = 2σ⁴/n, and n·S_μ²/σ² ∼ χ²(n); (X̄ − μ)/(S_μ/√n) ∼ T(n)
• Interval for the sample mean:
  – Two-tailed: μ − z_{α/2}·σ/√n < X̄ < μ + z_{α/2}·σ/√n
  – Right-tailed: μ − z_α·σ/√n < X̄
  – Left-tailed: X̄ < μ + z_α·σ/√n

Sample variance
• Statistics: MS = Σ_1^n X_i²/n − X̄² = mean(X²) − X̄²; S² = n·MS/(n − 1)
  Observed: ms = Σ_1^n x_i²/n − x̄²; s² = n·ms/(n − 1)
• Expectation: E(MS) = (n − 1)σ²/n; E(S²) = σ²
• Variance (X ∼ N): V(MS) = 2(n − 1)σ⁴/n²; V(S²) = 2σ⁴/(n − 1)
• Related distribution (X ∼ N(μ, σ²); unknown μ, σ²): n·MS/σ² = (n − 1)S²/σ² ∼ χ²(n − 1)
• Interval for S²:
  – Two-tailed: σ²/(n − 1) · χ²_{(n−1),1−α/2} < S² < σ²/(n − 1) · χ²_{(n−1),α/2}
  – Right-tailed: σ²/(n − 1) · χ²_{(n−1),1−α} < S²
  – Left-tailed: S² < σ²/(n − 1) · χ²_{(n−1),α}

Sample proportion
• X ∼ B(1, p); statistic: p̂ = X̄
• E(p̂) = p; V(p̂) = p(1 − p)/n
• n ≥ 100: p̂ ∼ N(p, p(1 − p)/n); σ_p̂ = √(p(1 − p)/n)
• Interval for p̂:
  – Two-tailed: p − z_{α/2}·σ_p̂ < p̂ < p + z_{α/2}·σ_p̂
  – Right-tailed: p − z_α·√(p(1 − p)/n) < p̂
  – Left-tailed: p̂ < p + z_α·√(p(1 − p)/n)
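The facts E(S²) = σ² and V(X̄) = σ²/n can be checked by simulation (a sketch under illustrative, assumed parameters μ = 5, σ = 2, n = 10; the seed pins down the run):

```python
import random
import statistics

random.seed(0)
mu, sigma, n, reps = 5.0, 2.0, 10, 20000

xbars, s2s = [], []
for _ in range(reps):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbars.append(statistics.fmean(xs))       # one draw of X-bar
    s2s.append(statistics.variance(xs))      # the (n-1)-divisor statistic S^2

# Over many replications: mean(xbars) ~ mu, variance(xbars) ~ sigma^2/n,
# and mean(s2s) ~ sigma^2 (unbiasedness of S^2).
```

With 20000 replications the three averages land very close to μ = 5, σ²/n = 0.4, and σ² = 4.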
Lecture04
• Estimator (example: 𝑋̅, 𝑆 2 ) is random statistic on random sample, is a random variable
• Estimate (example: π‘₯Μ… , 𝑠 2 ) is observed value of statistic from observed sample
• Criteria for estimator
o πœƒΜ‚ is unbiased estimator of πœƒ ⇔ 𝐸(πœƒΜ‚) = πœƒ ⇔ π‘π‘–π‘Žπ‘  = |𝐸(πœƒΜ‚) − πœƒ| = 0
𝐸(πœƒΜ‚1 ) = 𝐸(πœƒΜ‚2 ) = πœƒ
o {
⇒ πœƒΜ‚1 is more efficient than πœƒΜ‚2
𝑉(πœƒΜ‚1 ) < 𝑉(πœƒΜ‚2 )
𝐸(πœƒΜ‚ ) = πœƒ
o {
⇒ πœƒΜ‚ is efficient estimator
𝑉(πœƒΜ‚) is minimum among every unbiased estimator
MVUE: minimum variance unbiased estimator > BUE: best unbiased estimator
o Consistent estimator
• Find method:
o Percentile matching estimator
Using: when parameter could be calculated from percentile / quantile
From the population distribution, find the quantile formulas that are expression of parameters
Estimate population quantiles by sample quantiles → estimate parameters
𝑄2 , 𝑄1 (π‘œπ‘Ÿ 𝑄3 ), …
o Moment estimator
Using: when parameter could be calculated from moments
Estimate k parameters by first k moments: estimate 𝐸(𝑋) by 𝑋̅, estimate 𝐸(𝑋 2 ) by Μ…Μ…Μ…Μ…
𝑋2, …
Estimate moment → estimate parameter
o Maximum likelihood estimator
Likelihood function
Random variable 𝑋 with parameter πœƒ, random sample 𝑿 = (𝑋1 , 𝑋2 , … , 𝑋𝑛 ), then likelihood
function is:
𝑛
∏ 𝑃(𝑋𝑖 , πœƒ)
∢ π‘‘π‘–π‘ π‘π‘Ÿπ‘’π‘‘π‘’
𝑖=1
𝑛
𝐿(𝑿, πœƒ) =
∏ 𝑓(𝑋𝑖 , πœƒ) ∢ π‘π‘œπ‘›π‘‘π‘–π‘›π‘’π‘œπ‘’π‘ 
{ 𝑖=1
Maximum likelihood estimator (MLE)
MLE of πœƒ is πœƒΜ‚ that maximize Likelihood function or logarithm of Likelihood function
𝐿(𝑿, πœƒ) → π‘šπ‘Žπ‘₯ or ln (𝐿(𝑿, πœƒ) → π‘šπ‘Žπ‘₯
πœ•πΏ(𝑿, πœƒ)
= 0 ⇔ πœƒ = πœƒΜ‚
πœ•πœƒ
𝐿(𝑿, πœƒ) → π‘šπ‘Žπ‘₯ ⇔ πœ• 2 𝐿(𝑿, πœƒ)
|
<0
2
Μ‚
πœƒ=πœƒ
{ πœ•πœƒ
πœ• ln(𝐿(𝑿, πœƒ))
= 0 ⇔ πœƒ = πœƒΜ‚
πœ•πœƒ
ln (𝐿(𝑿, πœƒ) → π‘šπ‘Žπ‘₯ ⇔
πœ• 2 ln(𝐿(𝑿, πœƒ))
|
<0
2
πœ•πœƒ
Μ‚
{
πœƒ=πœƒ
8
•
Fisher information:
1
𝑛
o Bernoulli distribution: 𝑝̂ is MVUE of p ⇒ 𝐼𝑛 (𝑝) = 𝑉(𝑝̂) = 𝑝(1−𝑝)
1
𝑛
o Normality distribution: 𝑋̅ is MVUE of πœ‡ ⇒ 𝐼𝑛 (πœ‡) = 𝑉(𝑋̅) = 𝜎2
o Normality distribution: 𝑆 2 is not MVUE of 𝜎 2 ⇒ 𝐼𝑛 (𝜎 2 ) =? >
9
𝑛−1
2𝜎2
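For the Bernoulli case the log-likelihood is ln L(p) = k·ln p + (n − k)·ln(1 − p) with k = Σx_i, and the maximizing p is p̂ = x̄ = k/n. A numeric grid search confirms this (a sketch; the sample below is illustrative and ours):

```python
import math

xs = [1, 0, 1, 1, 0, 1, 0, 1]   # an illustrative Bernoulli sample (ours)
n, k = len(xs), sum(xs)

def loglik(p):
    # ln L(p) = k ln p + (n - k) ln(1 - p)
    return k * math.log(p) + (n - k) * math.log(1 - p)

# maximize over a fine grid on (0, 1); theory says the max is at p-hat = k/n
grid = [i / 10000 for i in range(1, 10000)]
p_mle = max(grid, key=loglik)
```

Here k/n = 5/8 = 0.625, and the grid maximum lands exactly there.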
Lecture 05
• Confidence interval (C.I.) = interval estimate: P(Lower limit < θ < Upper limit) = 1 − α (confidence level 1 − α)
• Prediction interval (P.I.): interval for a single random observation, with prediction level 1 − α
1. Mean
a. Normal distribution – known σ²
• Two-sided C.I.: X̄ − z_{α/2}·σ/√n < μ < X̄ + z_{α/2}·σ/√n → shorten: X̄ ± ME
  → Margin of error: ME = z_{α/2}·σ/√n
  → Confidence width: w = UL − LL = 2z_{α/2}·σ/√n
b. Normal distribution – unknown σ²
• Two-sided C.I.: X̄ − t_{(n−1),α/2}·S/√n < μ < X̄ + t_{(n−1),α/2}·S/√n → shorten: X̄ ± ME
  → ME = t_{(n−1),α/2}·S/√n; w = 2·ME
• Right-sided C.I.: X̄ − t_{(n−1),α}·S/√n < μ
• Left-sided C.I.: μ < X̄ + t_{(n−1),α}·S/√n
• P.I.: X̄ ± t_{(n−1),α/2}·S·√(1 + 1/n)
2. Variance: normal distribution – unknown μ
• Two-sided C.I.: (n − 1)S²/χ²_{(n−1),α/2} < σ² < (n − 1)S²/χ²_{(n−1),1−α/2}
• Right-sided C.I.: (n − 1)S²/χ²_{(n−1),α} < σ²
• Left-sided C.I.: σ² < (n − 1)S²/χ²_{(n−1),1−α}
3. Proportion
a. n ≥ 100
• Two-sided C.I.: p̂ − z_{α/2}·√(p̂(1 − p̂)/n) < p < p̂ + z_{α/2}·√(p̂(1 − p̂)/n) → shorten: p̂ ± ME
  → ME = z_{α/2}·√(p̂(1 − p̂)/n); w = 2·ME
• Right-sided C.I.: p̂ − z_α·√(p̂(1 − p̂)/n) < p
• Left-sided C.I.: p < p̂ + z_α·√(p̂(1 − p̂)/n)
b. n < 100
• Two-sided C.I.: …
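The known-σ interval X̄ ± z_{α/2}·σ/√n can be computed with the standard library's `statistics.NormalDist` (a minimal sketch; the helper name and the illustrative numbers in the test are ours):

```python
from statistics import NormalDist

def ci_mean_known_sigma(xbar, sigma, n, conf=0.95):
    """Two-sided C.I. xbar +/- z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    me = z * sigma / n ** 0.5                      # margin of error
    return xbar - me, xbar + me
```

For x̄ = 50, σ = 10, n = 25 at 95% confidence this gives roughly (46.08, 53.92), i.e. ME ≈ 1.96 × 2.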
Lecture 06
1. Test for a normal mean – known σ
Hypothesis pairs:
• {H_0: μ = μ_0; H_1: μ > μ_0} or {H_0: μ ≤ μ_0; H_1: μ > μ_0}
• {H_0: μ = μ_0; H_1: μ < μ_0} or {H_0: μ ≥ μ_0; H_1: μ < μ_0}
• {H_0: μ = μ_0; H_1: μ ≠ μ_0}
Statistic:
• X̄ → X̄_stat = x̄
• Z = (X̄ − μ_0)/(σ/√n) → z_stat (H_0 true → Z ∼ N(0, 1))
Right-tailed (H_1: μ > μ_0):
• Reject region: x̄ > μ_0 + z_α·σ/√n ⇔ z_stat > z_α
• P-value: P(Z > z_stat)
• β = P(Error Type 2) at μ = μ_1 > μ_0: with c = μ_0 + z_α·σ/√n,
  β = P[X̄ ≤ c | X̄ ∼ N(μ_1, σ²/n)] = P[Z ≤ z_α + (μ_0 − μ_1)/(σ/√n)]
Left-tailed (H_1: μ < μ_0):
• Reject region: x̄ < μ_0 − z_α·σ/√n ⇔ z_stat < −z_α
• P-value: P(Z < z_stat) = P(Z > −z_stat)
• β at μ = μ_1 < μ_0: with c = μ_0 − z_α·σ/√n,
  β = P[X̄ ≥ c | X̄ ∼ N(μ_1, σ²/n)] = P[Z ≥ −z_α + (μ_0 − μ_1)/(σ/√n)]
Two-tailed (H_1: μ ≠ μ_0):
• Reject region: |x̄ − μ_0| > z_{α/2}·σ/√n ⇔ |z_stat| > z_{α/2}
• P-value: 2P(Z > |z_stat|)
• β at μ = μ_1: with c_1 = μ_0 + z_{α/2}·σ/√n and c_2 = μ_0 − z_{α/2}·σ/√n,
  β = P[c_2 ≤ X̄ ≤ c_1 | μ = μ_1]
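The right-tailed test above — z statistic, P-value, and β at an alternative μ_1 — can be sketched with `statistics.NormalDist` (the helper name and the illustrative numbers in the test are ours):

```python
import math
from statistics import NormalDist

def ztest_right(xbar, mu0, sigma, n, alpha=0.05, mu1=None):
    """Right-tailed test of H0: mu = mu0 vs H1: mu > mu0.

    Returns (z_stat, p_value, beta), where beta = P(Type II error) at mu = mu1
    (None when no mu1 is given).
    """
    nd = NormalDist()
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    p = 1 - nd.cdf(z)                    # P(Z > z_stat)
    beta = None
    if mu1 is not None:
        z_a = nd.inv_cdf(1 - alpha)      # z_alpha
        # beta = P[Z <= z_alpha + (mu0 - mu1) / (sigma / sqrt(n))]
        beta = nd.cdf(z_a + (mu0 - mu1) / (sigma / math.sqrt(n)))
    return z, p, beta
```

For x̄ = 52, μ_0 = 50, σ = 10, n = 100 this gives z_stat = 2, P-value ≈ 0.0228, and at μ_1 = 52 a Type II error probability β ≈ 0.361.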
2. Test for a normal mean – unknown σ
Hypothesis pairs: the same three pairs on μ as above.
Statistic:
• X̄ → X̄_stat = x̄
• T = (X̄ − μ_0)/(S/√n) → T_stat (H_0 true → T ∼ T(n − 1))
Right-tailed: reject if T_stat > t_{(n−1),α}; P-value: P[T(n − 1) > T_stat]
• β = P(ET.2) at μ = μ_1 > μ_0: with c = μ_0 + t_{(n−1),α}·S/√n, β = P[X̄ < c | X̄ ∼ N(μ_1, σ²/n)]
Left-tailed: reject if T_stat < −t_{(n−1),α}; P-value: P[T(n − 1) < T_stat] = P[T(n − 1) > −T_stat]; β = ?
Two-tailed: reject if |T_stat| > t_{(n−1),α/2}; P-value: 2 × P[T(n − 1) > |T_stat|]; β = ?
Note: if n > 30, T(n − 1) ≈ N(0, 1); for n < 30 use Excel or R.
3. Test for a normal variance
Hypothesis pairs:
• {H_0: σ² = σ_0²; H_1: σ² > σ_0²} or {H_0: σ² ≤ σ_0²; H_1: σ² > σ_0²}
• {H_0: σ² = σ_0²; H_1: σ² < σ_0²} or {H_0: σ² ≥ σ_0²; H_1: σ² < σ_0²}
• {H_0: σ² = σ_0²; H_1: σ² ≠ σ_0²}
Statistic: χ² = (n − 1)S²/σ_0² → χ²_stat (H_0 true → χ² ∼ χ²(n − 1))
Right-tailed: reject if χ²_stat > χ²_{(n−1),α}; P-value: P[χ²(n − 1) > χ²_stat] (Excel, R)
Left-tailed: reject if χ²_stat < χ²_{(n−1),1−α}; P-value: P[χ²(n − 1) < χ²_stat]
Two-tailed: reject if χ²_stat > χ²_{(n−1),α/2} or χ²_stat < χ²_{(n−1),1−α/2}
• P-value: if s² > σ_0²: 2 × P[χ²(n − 1) > χ²_stat]; if s² < σ_0²: 2 × P[χ²(n − 1) < χ²_stat]
4. Test for a population proportion, n ≥ 100 (large sample) (slide 157)
Hypothesis pairs:
• {H_0: p = p_0; H_1: p > p_0} or {H_0: p ≤ p_0; H_1: p > p_0}
• {H_0: p = p_0; H_1: p < p_0} or {H_0: p ≥ p_0; H_1: p < p_0}
• {H_0: p = p_0; H_1: p ≠ p_0}
Statistic:
• p̂ → p̂_stat
• Z = (p̂ − p_0)/√(p_0(1 − p_0)/n) → z_stat (H_0 true → Z ∼ N(0, 1))
Right-tailed: reject if p̂ > p_0 + z_α·√(p_0(1 − p_0)/n) ⇔ z_stat > z_α; P-value: P(Z > z_stat)
• β = P(Error Type 2) at p = p_1 > p_0: with c = p_0 + z_α·√(p_0(1 − p_0)/n),
  β = P[p̂ ≤ c | p̂ ∼ N(p_1, p_1(1 − p_1)/n)]
Left-tailed: reject if p̂ < p_0 − z_α·√(p_0(1 − p_0)/n) ⇔ z_stat < −z_α; P-value: P(Z < z_stat) = P(Z > −z_stat)
• β at p = p_1 < p_0: with c = p_0 − z_α·√(p_0(1 − p_0)/n),
  β = P[p̂ ≥ c | p̂ ∼ N(p_1, p_1(1 − p_1)/n)]
Two-tailed: reject if |p̂ − p_0| > z_{α/2}·√(p_0(1 − p_0)/n) ⇔ |z_stat| > z_{α/2}; P-value: 2P(Z > |z_stat|); β = ?

5. Test for a population proportion, small sample
Hypothesis pairs: the same three pairs on p.
Statistic: freq. (the observed number of successes; under H_0, X ∼ B(n, p_0))
Right-tailed: reject H_0 if freq. ≥ crit., where P(X ≥ crit. | p = p_0) < α; P-value: P(X ≥ freq._stat)
Left-tailed: reject H_0 if freq. ≤ crit., where P(X ≤ crit. | p = p_0) < α; P-value: P(X ≤ freq._stat)
Two-tailed: reject H_0 if freq. ≥ crit_1 or freq. ≤ crit_2, where P(X ≥ crit_1 | p = p_0) < α/2 and P(X ≤ crit_2 | p = p_0) < α/2
Binomial B(n, p = p_0) table:
x: 0, 1, …, i, …, n − 1, n
P(X = x): C_n^0 p_0^0 (1 − p_0)^n; C_n^1 p_0^1 (1 − p_0)^(n−1); …; C_n^i p_0^i (1 − p_0)^(n−i); …; C_n^(n−1) p_0^(n−1) (1 − p_0)^1; C_n^n p_0^n (1 − p_0)^0
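The small-sample P-value is just a binomial tail sum, computable directly with `math.comb` (a sketch; the helper name and the illustrative numbers are ours):

```python
from math import comb

def binom_pvalue_right(n, p0, freq):
    """Right-tailed small-sample p-value: P(X >= freq) for X ~ B(n, p0)."""
    return sum(comb(n, x) * p0 ** x * (1 - p0) ** (n - x)
               for x in range(freq, n + 1))
```

For example, observing 8 successes in n = 10 trials under H_0: p = 0.5 gives P(X ≥ 8) = (45 + 10 + 1)/1024 ≈ 0.0547, so H_0 is not rejected at α = 0.05.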
Lecture 07
Inference for 2 means
X_1 ∼ N(μ_1, σ_1²), X_2 ∼ N(μ_2, σ_2²); H_0: μ_1 = μ_2

Paired sample? If yes → work with d = X_1 − X_2 and test μ_d; sample: n, d̄, s_d
• Statistic: T_stat = (d̄ − 0)/(s_d/√n)
• {H_0: μ_d = 0; H_1: μ_d > 0}: reject if T_stat > t_{(n−1),α}; P-value: P[T(n − 1) > T_stat]
• {H_0: μ_d = 0; H_1: μ_d < 0}: reject if T_stat < −t_{(n−1),α}; P-value: P[T(n − 1) < T_stat]
• {H_0: μ_d = 0; H_1: μ_d ≠ 0}: reject if |T_stat| > t_{(n−1),α/2}; P-value: 2 × P[T(n − 1) > |T_stat|]
If H_0 is rejected → C.I. for μ_1 − μ_2: d̄ ± t_{(n−1),α/2}·s_d/√n

Not paired: known σ_1², σ_2²?
If known:
• Statistic: Z = (X̄_1 − X̄_2)/√(σ_1²/n_1 + σ_2²/n_2) ∼ N(0, 1)
• H_1: μ_1 > μ_2: reject if z_stat > z_α; P-value: P(Z > z_stat)
• H_1: μ_1 < μ_2: reject if z_stat < −z_α; P-value: P(Z < z_stat)
• H_1: μ_1 ≠ μ_2: reject if |z_stat| > z_{α/2}; P-value: 2 × P(Z > |z_stat|)
If H_0 is rejected → C.I. for μ_1 − μ_2: …

If σ_1², σ_2² unknown, first test σ_1² = σ_2² (inference for 2 variances):
• Statistic: F_stat = S_1²/S_2²
• {H_0: σ_1² = σ_2²; H_1: σ_1² ≠ σ_2²}: reject if F_stat > f_{(n_1−1,n_2−1),α/2} or F_stat < f_{(n_1−1,n_2−1),1−α/2}
• H_1: σ_1² > σ_2²: reject if F_stat > f_{(n_1−1,n_2−1),α}
• H_1: σ_1² < σ_2²: reject if F_stat < f_{(n_1−1,n_2−1),1−α}
If H_0 is rejected → C.I. for σ_1²/σ_2²: …

If σ_1² = σ_2² (pooled t-test):
• Statistic: T = (X̄_1 − X̄_2)/√(S_p²/n_1 + S_p²/n_2) ∼ T(n_1 + n_2 − 2),
  where S_p² = [(n_1 − 1)S_1² + (n_2 − 1)S_2²]/(n_1 + n_2 − 2)
• H_1: μ_1 > μ_2: reject if T_stat > t_{(n_1+n_2−2),α}; P-value: P(T > T_stat)
• H_1: μ_1 < μ_2: reject if T_stat < −t_{(n_1+n_2−2),α}; P-value: P(T < T_stat)
• H_1: μ_1 ≠ μ_2: reject if |T_stat| > t_{(n_1+n_2−2),α/2}; P-value: 2 × P(T > |T_stat|)

If σ_1² ≠ σ_2²:
• Statistic: T = (X̄_1 − X̄_2)/√(S_1²/n_1 + S_2²/n_2) ∼ T(df),
  where df = (S_1²/n_1 + S_2²/n_2)² / [(S_1²/n_1)²/(n_1 − 1) + (S_2²/n_2)²/(n_2 − 1)]
• H_1: μ_1 > μ_2: reject if T_stat > t_{(df),α}; P-value: P(T > T_stat)
• H_1: μ_1 < μ_2: reject if T_stat < −t_{(df),α}; P-value: P(T < T_stat)
• H_1: μ_1 ≠ μ_2: reject if |T_stat| > t_{(df),α/2}; P-value: 2 × P(T > |T_stat|)

Inference for 2 proportions
• Statistic: Z = (p̂_1 − p̂_2)/√(p̄(1 − p̄)(1/n_1 + 1/n_2)) ∼ N(0, 1), where p̄ = (n_1·p̂_1 + n_2·p̂_2)/(n_1 + n_2)
• H_1: p_1 > p_2: reject if z_stat > z_α; P-value: P(Z > z_stat)
• H_1: p_1 < p_2: reject if z_stat < −z_α; P-value: P(Z < z_stat)
• H_1: p_1 ≠ p_2: reject if |z_stat| > z_{α/2}; P-value: 2 × P(Z > |z_stat|)
• C.I.: p_1 − p_2 ∈ (p̂_1 − p̂_2) ± z_{α/2}·√(p̂_1(1 − p̂_1)/n_1 + p̂_2(1 − p̂_2)/n_2)

Correlation test (formula and table)
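The two unknown-variance branches above reduce to two small formulas: the pooled statistic and the unequal-variance degrees of freedom. Both can be sketched directly (helper names and the illustrative numbers in the test are ours):

```python
import math

def pooled_t(x1bar, x2bar, s1sq, s2sq, n1, n2):
    """Pooled-variance t statistic, for the case sigma_1^2 = sigma_2^2."""
    sp2 = ((n1 - 1) * s1sq + (n2 - 1) * s2sq) / (n1 + n2 - 2)
    return (x1bar - x2bar) / math.sqrt(sp2 / n1 + sp2 / n2)

def unequal_var_df(s1sq, s2sq, n1, n2):
    """Degrees of freedom for the sigma_1^2 != sigma_2^2 case."""
    a, b = s1sq / n1, s2sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
```

As a sanity check, with equal variances and equal sample sizes (s_1² = s_2² = 4, n_1 = n_2 = 10) the df formula gives 18 = n_1 + n_2 − 2, so the two branches agree there.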
Lecture 08: ANOVA: to test for equality of means
1. One-factor ANOVA
2. Two-factor ANOVA without interaction
3. Two-factor ANOVA with interaction
Lecture 09
1. Chi-squared test
2. Independence test (easy and common)
It can be proved that
χ²_stat = n (Σ_i Σ_j F_ij² / (R_i C_j) − 1)
where F_ij are the observed cell frequencies and R_i, C_j are the row and column totals.
3. Rank test
Critical value
4. Normality test
Jarque–Bera test (easy and common)
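The shortcut form of χ²_stat equals the usual Σ(O − E)²/E with expected counts E_ij = R_i·C_j/n; a quick numeric check on an illustrative 2×2 table (the table values are ours):

```python
table = [[10, 20], [30, 40]]             # an illustrative 2x2 contingency table
n = sum(sum(row) for row in table)
R = [sum(row) for row in table]          # row totals R_i
C = [sum(col) for col in zip(*table)]    # column totals C_j

# shortcut form from the notes: n * (sum F_ij^2 / (R_i C_j) - 1)
chi2_short = n * (sum(table[i][j] ** 2 / (R[i] * C[j])
                      for i in range(2) for j in range(2)) - 1)
# standard form: sum (O - E)^2 / E with E_ij = R_i * C_j / n
chi2_std = sum((table[i][j] - R[i] * C[j] / n) ** 2 / (R[i] * C[j] / n)
               for i in range(2) for j in range(2))
```

Both forms give the same value (≈ 0.794 for this table), confirming the identity numerically.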