
Actuarial Mathematical Statistics – Lectures 1–9

Lecture 01, 02
1. Statistics
Descriptive statistics: organize, summarize, and present collected data (sample size n, population size N).
• The average number of students in a class at White Oak University is 22.6.
• Last year's total attendance at Long Run High School's football games was 8235.
2. Types of Variable
Qualitative (Categorical)
• Nominal: incomparable categories; cannot be used in calculations; e.g. name, address, career, …
• Ordinal: comparable (can be ranked); e.g. rank, size, …
• Binary: 2 values; e.g. male/female, yes/no, …
3. Level of Measurement
Qualitative (Categorical)
• Nominal: names/labels only
• Ordinal: nominal + rank/order
Inferential statistics: predict, forecast, and verify under uncertainty.
• A recent study showed that eating garlic can lower blood pressure.
• It is predicted that the average number of automobiles each household owns will increase next year.
Quantitative (Scale)
• Discrete: countable; can calculate (+, …); e.g. score, the number of …
• Continuous: uncountable; can calculate (+, −, ×, ÷, …); e.g. height, temperature, …
Quantitative (Scale)
• Interval: ordinal + the interval between 2 consecutive steps is equal; e.g. score, temperature, …
• Ratio: interval + the ratio of 2 values is meaningful; e.g. height, time, …
4. Tabular Methods
Qualitative (Categorical): nominal / ordinal
• Frequency distribution table
• Relative frequency distribution
• Percent frequency distribution (%)
• Cross-tabulation (row and column) …
Quantitative (Scale)
• Frequency distribution table
• Relative / percent frequency distribution (%)
• Cumulative frequency distribution
• Cumulative relative frequency distribution
• Cross-tabulation (row and column) …
5. Graphics
Qualitative (Categorical)
• Pie chart
• Bar/column chart
• Clustered bar/column chart
• Stacked bar/column chart
Quantitative (Scale)
• Bar/column chart
• Clustered bar/column chart
• Stacked bar/column chart
• Group chart
• Histogram / histogram with cumulative frequency
• Dot Plot; Line; …
• Scatter Plot (→ Correlation)
• Bubble chart
6. Data
Population: (x_1, x_2, …, x_N); Sample: (x_1, x_2, …, x_n)
Data forms:
• Listed data: the raw values.
• Frequency table:
  Value: x_1, x_2, …, x_K
  Freq.: f_1, f_2, …, f_K
• Grouped data: classes a_0–a_1, a_1–a_2, …, a_{K−1}–a_K with frequencies f_1, f_2, …, f_K; each class is represented by its midpoint x_i = (a_{i−1} + a_i)/2.

7. Measures of Location
Central tendency
• Arithmetic mean: Mean = (Sum of values) / Size
  ▪ Population: μ = Σ_1^N x_i / N; Sample: x̄ = Σ_1^n x_i / n
  ▪ Same unit as X; "sensitive" to any change in an element's value
• Geometric mean: (x_1 × x_2 × … × x_Size)^(1/Size) (average return/increase … over … years)
• Median:
  ▪ the middle value of ordinal data: the value at position (n + 1) × 0.5
  ▪ Population: m_e; Sample: x̃
  ▪ not affected by extreme values → useful to compare data with outliers
• Mode:
  ▪ the most frequent value; a data set may have 0, 1, or more than 1 mode
Quantile
▪ Sort the data smallest → largest
▪ The quantile of level β is q_β; q_0 = x_min; q_1 = x_max
▪ Position of q_β is (n + 1)β = {int, dec} (integer part and decimal part)
▪ q_β = x_int + dec × (x_{int+1} − x_int)
▪ (For ordinal data, the k-th value is the quantile of level k/(n + 1))
Quartiles
▪ 3 quartiles divide the data into 4 equal parts:
  Q_1 (lower fourth) = q_0.25; Q_2 (median) = q_0.5; Q_3 (upper fourth) = q_0.75
▪ 5 key points: x_min = q_0; Q_1 = q_0.25; Q_2 = median = q_0.5; Q_3 = q_0.75; x_max = q_1
▪ IQR = Q_3 − Q_1
▪ Outlier: ∉ (Q_1 − 1.5 × IQR, Q_3 + 1.5 × IQR)
▪ Extreme outlier: ∉ (Q_1 − 3 × IQR, Q_3 + 3 × IQR)
Quintiles: 4 quintiles divide the data into 5 equal parts
Deciles: 9 deciles divide the data into 10 equal parts
Percentiles: 99 percentiles divide the data into 100 equal parts
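The quantile rule above — position (n + 1)β split into integer and decimal parts, then linear interpolation — can be sketched in Python (a minimal sketch; the function name `quantile` and the boundary handling for q_0 and q_1 are ours):

```python
def quantile(data, beta):
    """q_beta via the (n+1)*beta position rule with linear interpolation."""
    xs = sorted(data)            # sort smallest -> largest
    n = len(xs)
    pos = (n + 1) * beta         # position = {int, dec}
    i = int(pos)                 # integer part
    dec = pos - i                # decimal part
    if i < 1:
        return xs[0]             # q_0 = x_min
    if i >= n:
        return xs[-1]            # q_1 = x_max
    # q_beta = x_int + dec * (x_{int+1} - x_int); notes use 1-based indexing
    return xs[i - 1] + dec * (xs[i] - xs[i - 1])
```

On the population of Example 1, `quantile([2,2,3,3,3,4,4,4,4,5], 0.25)` gives 2.75 and `quantile(..., 0.75)` gives 4, matching the worked values below.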
8. Measures of Variability
Range: R = W = x_max − x_min: width of the interval covering 100% of the values.
Interquartile range: IQR = Q_3 − Q_1: width of the interval covering the middle 50% of the values. Also called the fourth spread f_s.
Variance:
• Population: σ² = Σ_1^N (x_i − μ)² / N = S_XX / N = Σ_1^N x_i² / N − μ²
• Sample: s² = Σ_1^n (x_i − x̄)² / (n − 1) = S_xx / (n − 1) = n/(n − 1) × (Σ_1^n x_i² / n − x̄²) = n·ms / (n − 1), where ms = Σ_1^n x_i² / n − x̄²
▪ Absolute variability; the unit of variance is the squared unit of X
▪ Var(X) > Var(Y) → X is more variable (dispersed, fluctuating) than Y // Y is more stable, concentrated than X
Standard deviation (S.D.):
• Population: σ = √σ²; Sample: s = √s²
▪ Same unit as X; absolute variability
Coefficient of variation:
• Population: CV = σ/μ × 100(%); Sample: CV = s/x̄ × 100(%)
▪ Unit: %
▪ CV is used to compare the variability of variables with different units
▪ Relative variability
9. Measures of Shape
Skewness:
• Population: Skew = (Σ_1^N (x_i − μ)³ / N) / σ³
• Sample: Skew = (Σ_1^n (x_i − x̄)³ / n) / s³
• Skew < 0: Mean < Median; Skew = 0: Mean = Median; Skew > 0: Median < Mean
Kurtosis:
• Population: Kurt = (Σ_1^N (x_i − μ)⁴ / N) / σ⁴
• Sample: Kurt = (Σ_1^n (x_i − x̄)⁴ / n) / s⁴

10. Standardized value: z_i = (x_i − mean) / (standard deviation)
Example 1
Population: (2, 2, 3, 3, 3, 4, 4, 4, 4, 5)
  Frequency table — Value: 2, 3, 4, 5; Freq.: 2, 3, 4, 1
  N = 10; Σ x_i = 34; Σ x_i² = 124
Sample: (2, 3, 2, 4, 5); sorted: 2, 2, 3, 4, 5
  Frequency table — Value: 2, 3, 4, 5; Freq.: 2, 1, 1, 1
  n = 5; Σ x_i = 16; Σ x_i² = 58

Population results:
• Mean: μ = Σ_1^10 x_i / N = 34/10 = 3.4
• Median: m_e = (x_5 + x_6)/2 = (3 + 4)/2 = 3.5
• Mode = 4
• Quartiles:
  (N + 1) × 0.25 = 11 × 0.25 = 2.75 → Q_1 = q_0.25 = x_2 + 0.75(x_3 − x_2) = 2 + 0.75(3 − 2) = 2.75
  Q_2 = q_0.5 = m_e = 3.5
  (N + 1) × 0.75 = 11 × 0.75 = 8.25 → Q_3 = q_0.75 = x_8 + 0.25(x_9 − x_8) = 4 + 0.25(4 − 4) = 4
• Range: R = 5 − 2 = 3
• Interquartile range: IQR = f_s = Q_3 − Q_1 = 4 − 2.75 = 1.25
• Variance: σ² = 124/10 − 3.4² = 0.84
• S.D.: σ = √0.84 = 0.92
• CV = 0.92/3.4 × 100(%) = 27.06%
• Standardized values: (−1.52; −1.52; −0.43; −0.43; −0.43; 0.65; 0.65; 0.65; 0.65; 1.74)
• Skewness: Skew = (−0.72/10) / 0.92³ = −0.092
• Kurtosis: Kurt = (14.832/10) / 0.92⁴ = 2.07

Sample results:
• Mean: x̄ = Σ_1^5 x_i / n = 16/5 = 3.2
• Median: x̃ = x_3 = 3
• Mode = 2
• Quartiles:
  (n + 1) × 0.25 = 6 × 0.25 = 1.5 → Q_1 = q_0.25 = x_1 + 0.5(x_2 − x_1) = 2 + 0.5(2 − 2) = 2
  Q_2 = q_0.5 = x̃ = 3
  (n + 1) × 0.75 = 6 × 0.75 = 4.5 → Q_3 = q_0.75 = x_4 + 0.5(x_5 − x_4) = 4 + 0.5(5 − 4) = 4.5
• Range: R = 5 − 2 = 3
• Interquartile range: IQR = f_s = Q_3 − Q_1 = 4.5 − 2 = 2.5
• Variance: s² = (5/4)(58/5 − 3.2²) = 1.7
• S.D.: s = √1.7 = 1.3
• CV = 1.3/3.2 × 100(%) = 40.6%
• Standardized values: (−0.92; −0.92; −0.15; 0.62; 1.38)
• Skewness: Skew = (2.88/5) / 1.3³ = 0.26
• Kurtosis: Kurt = (15.056/5) / 1.3⁴ = 1.05
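The headline numbers of Example 1 can be checked in a few lines of Python (variable names are ours; the small CV differences against the table come from it rounding σ to 0.92 and s to 1.3 before dividing):

```python
import math

pop = [2, 2, 3, 3, 3, 4, 4, 4, 4, 5]   # the population of Example 1
sam = [2, 3, 2, 4, 5]                  # the sample of Example 1

N, n = len(pop), len(sam)
mu = sum(pop) / N                       # population mean
xbar = sum(sam) / n                     # sample mean
var_pop = sum(x * x for x in pop) / N - mu ** 2                    # sigma^2
var_sam = n / (n - 1) * (sum(x * x for x in sam) / n - xbar ** 2)  # s^2
cv_pop = math.sqrt(var_pop) / mu * 100   # coefficient of variation, %
cv_sam = math.sqrt(var_sam) / xbar * 100
```

This reproduces μ = 3.4, x̄ = 3.2, σ² = 0.84, s² = 1.7; the unrounded CVs are ≈ 26.96% and ≈ 40.74%.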
Example 2 (grouped data)
Population:
  Group: 2–4, 4–6, 6–8, 8–10; Freq.: 2, 5, 7, 1; midpoints x_i: 3, 5, 7, 9
  Equivalent listed data: (3, 3, 5, 5, 5, 5, 5, 7, 7, 7, 7, 7, 7, 7, 9)
Sample:
  Group: 2–4, 4–6, 6–8; Freq.: 1, 3, 2; midpoints x_i: 3, 5, 7
  Equivalent listed data: (3, 5, 5, 5, 7, 7)
Compute (as in Example 1): mean, median, mode, quartiles, range, interquartile range, variance, standard deviation (S.D.), CV, standardized values, skewness, kurtosis.
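As a start on Example 2, the grouped-data mean uses the class midpoints (a_{i−1} + a_i)/2 weighted by frequency (a minimal sketch; the tuple layout is ours):

```python
# (lower, upper, freq) triples for the population and sample of Example 2
groups_pop = [(2, 4, 2), (4, 6, 5), (6, 8, 7), (8, 10, 1)]
groups_sam = [(2, 4, 1), (4, 6, 3), (6, 8, 2)]

def grouped_mean(groups):
    """Mean of grouped data: sum of midpoint * freq over total frequency."""
    total = sum(f for _, _, f in groups)
    return sum((lo + hi) / 2 * f for lo, hi, f in groups) / total

mean_pop = grouped_mean(groups_pop)   # (3*2 + 5*5 + 7*7 + 9*1) / 15 = 89/15
mean_sam = grouped_mean(groups_sam)   # (3*1 + 5*3 + 7*2) / 6 = 32/6
```

This gives a population mean of 89/15 ≈ 5.93 and a sample mean of 32/6 ≈ 5.33.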
11. Measures of Relationship
Covariance:
• Population: Cov(X, Y) = Σ(x_i − μ_X)(y_i − μ_Y) / N = Σ x_i y_i / N − μ_X μ_Y
• Sample: Cov(X, Y) = Σ(x_i − x̄)(y_i − ȳ) / (n − 1) = n/(n − 1) × (Σ x_i y_i / n − x̄·ȳ)
Correlation:
• Population: ρ_XY = Cov(X, Y) / (σ_X σ_Y)
• Sample: r_XY = Cov(X, Y) / (s_X s_Y)
Lecture 03
1. Population ≡ random variable X
Population mean μ = E(X); population variance σ² = V(X)
2. Sample: random & observed
X = (X_1, X_2, …, X_n) is a random sample ⇔ X_1, X_2, …, X_n are independent and identically distributed with X (the X_i are i.i.d.)
Observed sample: (x_1, x_2, …, x_n)
3. A statistic is a function of the random sample: G = G(X_1, X_2, …, X_n)
Observed sample → g = G_stat = G(x_1, x_2, …, x_n): the observed value
Ex: (X_1, X_2, …, X_10) is a random sample; G = (X_1 + X_2 + X_3)/3
→ E(G) = [E(X_1) + E(X_2) + E(X_3)]/3 = μ; V(G) = [V(X_1) + V(X_2) + V(X_3)]/3² = σ²/3
Observed (8; 7; 4; 9; 10; 2; 6; 3; 5; 6) → g = G_stat = (8 + 7 + 4)/3 = 6.3333
4. Common statistics and their sampling distributions

Sample mean
• Statistic: X̄ = Σ_1^n X_i / n; observed (Lec 02): x̄ = Σ_1^n x_i / n (or x̄ = Σ_1^n x_i n_i / n for frequency data)
• Expectation: E(X̄) = μ; Variance: V(X̄) = σ²/n
• Related distributions:
  – Known σ², or n > 30: (X̄ − μ)/(σ/√n) ∼ N(0, 1); if X ∼ N(μ, σ²) then X̄ ∼ N(μ, σ²/n)
  – X ∼ N(μ, σ²), unknown σ²: (X̄ − μ)/(S/√n) ∼ T(n − 1)
  – X ∼ N(μ, σ²), known μ: with S_μ² = Σ_1^n (X_i − μ)²/n (observed Σ(x_i − μ)²/n), E(S_μ²) = σ², V(S_μ²) = 2σ⁴/n, and n·S_μ²/σ² ∼ χ²(n); (X̄ − μ)/(S_μ/√n) ∼ T(n)
• Interval for the sample mean:
  – Two-tailed: μ − z_{α/2}·σ/√n < X̄ < μ + z_{α/2}·σ/√n
  – Right-tailed: μ − z_α·σ/√n < X̄
  – Left-tailed: X̄ < μ + z_α·σ/√n

Sample variance
• Statistics: MS = Σ_1^n X_i²/n − X̄² = mean(X²) − X̄²; S² = n·MS/(n − 1)
  Observed: ms = Σ_1^n x_i²/n − x̄²; s² = n·ms/(n − 1)
• Expectation: E(MS) = (n − 1)σ²/n; E(S²) = σ²
• Variance (X ∼ N): V(MS) = 2(n − 1)σ⁴/n²; V(S²) = 2σ⁴/(n − 1)
• Related distribution (X ∼ N(μ, σ²); unknown μ, σ²): n·MS/σ² = (n − 1)S²/σ² ∼ χ²(n − 1)
• Interval for S²:
  – Two-tailed: σ²/(n − 1) · χ²_{(n−1),1−α/2} < S² < σ²/(n − 1) · χ²_{(n−1),α/2}
  – Right-tailed: σ²/(n − 1) · χ²_{(n−1),1−α} < S²
  – Left-tailed: S² < σ²/(n − 1) · χ²_{(n−1),α}

Sample proportion
• X ∼ B(1, p); statistic: p̂ = X̄
• E(p̂) = p; V(p̂) = p(1 − p)/n
• n ≥ 100: p̂ ∼ N(p, p(1 − p)/n); σ_p̂ = √(p(1 − p)/n)
• Interval for p̂:
  – Two-tailed: p − z_{α/2}·σ_p̂ < p̂ < p + z_{α/2}·σ_p̂
  – Right-tailed: p − z_α·√(p(1 − p)/n) < p̂
  – Left-tailed: p̂ < p + z_α·√(p(1 − p)/n)
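The facts E(S²) = σ² and V(X̄) = σ²/n can be checked by simulation (a sketch under illustrative, assumed parameters μ = 5, σ = 2, n = 10; the seed pins down the run):

```python
import random
import statistics

random.seed(0)
mu, sigma, n, reps = 5.0, 2.0, 10, 20000

xbars, s2s = [], []
for _ in range(reps):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbars.append(statistics.fmean(xs))       # one draw of X-bar
    s2s.append(statistics.variance(xs))      # the (n-1)-divisor statistic S^2

# Over many replications: mean(xbars) ~ mu, variance(xbars) ~ sigma^2/n,
# and mean(s2s) ~ sigma^2 (unbiasedness of S^2).
```

With 20000 replications the three averages land very close to μ = 5, σ²/n = 0.4, and σ² = 4.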
Lecture04
• Estimator (example: 𝑋̅, 𝑆 2 ) is random statistic on random sample, is a random variable
• Estimate (example: π‘₯Μ… , 𝑠 2 ) is observed value of statistic from observed sample
• Criteria for estimator
o πœƒΜ‚ is unbiased estimator of πœƒ ⇔ 𝐸(πœƒΜ‚) = πœƒ ⇔ π‘π‘–π‘Žπ‘  = |𝐸(πœƒΜ‚) − πœƒ| = 0
𝐸(πœƒΜ‚1 ) = 𝐸(πœƒΜ‚2 ) = πœƒ
o {
⇒ πœƒΜ‚1 is more efficient than πœƒΜ‚2
𝑉(πœƒΜ‚1 ) < 𝑉(πœƒΜ‚2 )
𝐸(πœƒΜ‚ ) = πœƒ
o {
⇒ πœƒΜ‚ is efficient estimator
𝑉(πœƒΜ‚) is minimum among every unbiased estimator
MVUE: minimum variance unbiased estimator > BUE: best unbiased estimator
o Consistent estimator
• Find method:
o Percentile matching estimator
Using: when parameter could be calculated from percentile / quantile
From the population distribution, find the quantile formulas that are expression of parameters
Estimate population quantiles by sample quantiles → estimate parameters
𝑄2 , 𝑄1 (π‘œπ‘Ÿ 𝑄3 ), …
o Moment estimator
Using: when parameter could be calculated from moments
Estimate k parameters by first k moments: estimate 𝐸(𝑋) by 𝑋̅, estimate 𝐸(𝑋 2 ) by Μ…Μ…Μ…Μ…
𝑋2, …
Estimate moment → estimate parameter
o Maximum likelihood estimator
Likelihood function
Random variable 𝑋 with parameter πœƒ, random sample 𝑿 = (𝑋1 , 𝑋2 , … , 𝑋𝑛 ), then likelihood
function is:
𝑛
∏ 𝑃(𝑋𝑖 , πœƒ)
∢ π‘‘π‘–π‘ π‘π‘Ÿπ‘’π‘‘π‘’
𝑖=1
𝑛
𝐿(𝑿, πœƒ) =
∏ 𝑓(𝑋𝑖 , πœƒ) ∢ π‘π‘œπ‘›π‘‘π‘–π‘›π‘’π‘œπ‘’π‘ 
{ 𝑖=1
Maximum likelihood estimator (MLE)
MLE of πœƒ is πœƒΜ‚ that maximize Likelihood function or logarithm of Likelihood function
𝐿(𝑿, πœƒ) → π‘šπ‘Žπ‘₯ or ln (𝐿(𝑿, πœƒ) → π‘šπ‘Žπ‘₯
πœ•πΏ(𝑿, πœƒ)
= 0 ⇔ πœƒ = πœƒΜ‚
πœ•πœƒ
𝐿(𝑿, πœƒ) → π‘šπ‘Žπ‘₯ ⇔ πœ• 2 𝐿(𝑿, πœƒ)
|
<0
2
Μ‚
πœƒ=πœƒ
{ πœ•πœƒ
πœ• ln(𝐿(𝑿, πœƒ))
= 0 ⇔ πœƒ = πœƒΜ‚
πœ•πœƒ
ln (𝐿(𝑿, πœƒ) → π‘šπ‘Žπ‘₯ ⇔
πœ• 2 ln(𝐿(𝑿, πœƒ))
|
<0
2
πœ•πœƒ
Μ‚
{
πœƒ=πœƒ
8
•
Fisher information:
1
𝑛
o Bernoulli distribution: 𝑝̂ is MVUE of p ⇒ 𝐼𝑛 (𝑝) = 𝑉(𝑝̂) = 𝑝(1−𝑝)
1
𝑛
o Normality distribution: 𝑋̅ is MVUE of πœ‡ ⇒ 𝐼𝑛 (πœ‡) = 𝑉(𝑋̅) = 𝜎2
o Normality distribution: 𝑆 2 is not MVUE of 𝜎 2 ⇒ 𝐼𝑛 (𝜎 2 ) =? >
9
𝑛−1
2𝜎2
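For the Bernoulli case the log-likelihood is ln L(p) = k·ln p + (n − k)·ln(1 − p) with k = Σx_i, and the maximizing p is p̂ = x̄ = k/n. A numeric grid search confirms this (a sketch; the sample below is illustrative and ours):

```python
import math

xs = [1, 0, 1, 1, 0, 1, 0, 1]   # an illustrative Bernoulli sample (ours)
n, k = len(xs), sum(xs)

def loglik(p):
    # ln L(p) = k ln p + (n - k) ln(1 - p)
    return k * math.log(p) + (n - k) * math.log(1 - p)

# maximize over a fine grid on (0, 1); theory says the max is at p-hat = k/n
grid = [i / 10000 for i in range(1, 10000)]
p_mle = max(grid, key=loglik)
```

Here k/n = 5/8 = 0.625, and the grid maximum lands exactly there.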
Lecture 05
• Confidence interval (C.I.) = interval estimate: P(Lower limit < θ < Upper limit) = 1 − α (confidence level 1 − α)
• Prediction interval (P.I.): interval for a single random observation, with prediction level 1 − α
1. Mean
a. Normal distribution – known σ²
• Two-sided C.I.: X̄ − z_{α/2}·σ/√n < μ < X̄ + z_{α/2}·σ/√n → shorten: X̄ ± ME
  → Margin of error: ME = z_{α/2}·σ/√n
  → Confidence width: w = UL − LL = 2z_{α/2}·σ/√n
b. Normal distribution – unknown σ²
• Two-sided C.I.: X̄ − t_{(n−1),α/2}·S/√n < μ < X̄ + t_{(n−1),α/2}·S/√n → shorten: X̄ ± ME
  → ME = t_{(n−1),α/2}·S/√n; w = 2·ME
• Right-sided C.I.: X̄ − t_{(n−1),α}·S/√n < μ
• Left-sided C.I.: μ < X̄ + t_{(n−1),α}·S/√n
• P.I.: X̄ ± t_{(n−1),α/2}·S·√(1 + 1/n)
2. Variance: normal distribution – unknown μ
• Two-sided C.I.: (n − 1)S²/χ²_{(n−1),α/2} < σ² < (n − 1)S²/χ²_{(n−1),1−α/2}
• Right-sided C.I.: (n − 1)S²/χ²_{(n−1),α} < σ²
• Left-sided C.I.: σ² < (n − 1)S²/χ²_{(n−1),1−α}
3. Proportion
a. n ≥ 100
• Two-sided C.I.: p̂ − z_{α/2}·√(p̂(1 − p̂)/n) < p < p̂ + z_{α/2}·√(p̂(1 − p̂)/n) → shorten: p̂ ± ME
  → ME = z_{α/2}·√(p̂(1 − p̂)/n); w = 2·ME
• Right-sided C.I.: p̂ − z_α·√(p̂(1 − p̂)/n) < p
• Left-sided C.I.: p < p̂ + z_α·√(p̂(1 − p̂)/n)
b. n < 100
• Two-sided C.I.: …
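The known-σ interval X̄ ± z_{α/2}·σ/√n can be computed with the standard library's `statistics.NormalDist` (a minimal sketch; the helper name and the illustrative numbers in the test are ours):

```python
from statistics import NormalDist

def ci_mean_known_sigma(xbar, sigma, n, conf=0.95):
    """Two-sided C.I. xbar +/- z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    me = z * sigma / n ** 0.5                      # margin of error
    return xbar - me, xbar + me
```

For x̄ = 50, σ = 10, n = 25 at 95% confidence this gives roughly (46.08, 53.92), i.e. ME ≈ 1.96 × 2.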
Lecture 06
1. Test for a normal mean – known σ
Hypothesis pairs:
• {H_0: μ = μ_0; H_1: μ > μ_0} or {H_0: μ ≤ μ_0; H_1: μ > μ_0}
• {H_0: μ = μ_0; H_1: μ < μ_0} or {H_0: μ ≥ μ_0; H_1: μ < μ_0}
• {H_0: μ = μ_0; H_1: μ ≠ μ_0}
Statistic:
• X̄ → X̄_stat = x̄
• Z = (X̄ − μ_0)/(σ/√n) → z_stat (H_0 true → Z ∼ N(0, 1))
Right-tailed (H_1: μ > μ_0):
• Reject region: x̄ > μ_0 + z_α·σ/√n ⇔ z_stat > z_α
• P-value: P(Z > z_stat)
• β = P(Error Type 2) at μ = μ_1 > μ_0: with c = μ_0 + z_α·σ/√n,
  β = P[X̄ ≤ c | X̄ ∼ N(μ_1, σ²/n)] = P[Z ≤ z_α + (μ_0 − μ_1)/(σ/√n)]
Left-tailed (H_1: μ < μ_0):
• Reject region: x̄ < μ_0 − z_α·σ/√n ⇔ z_stat < −z_α
• P-value: P(Z < z_stat) = P(Z > −z_stat)
• β at μ = μ_1 < μ_0: with c = μ_0 − z_α·σ/√n,
  β = P[X̄ ≥ c | X̄ ∼ N(μ_1, σ²/n)] = P[Z ≥ −z_α + (μ_0 − μ_1)/(σ/√n)]
Two-tailed (H_1: μ ≠ μ_0):
• Reject region: |x̄ − μ_0| > z_{α/2}·σ/√n ⇔ |z_stat| > z_{α/2}
• P-value: 2P(Z > |z_stat|)
• β at μ = μ_1: with c_1 = μ_0 + z_{α/2}·σ/√n and c_2 = μ_0 − z_{α/2}·σ/√n,
  β = P[c_2 ≤ X̄ ≤ c_1 | μ = μ_1]
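The right-tailed test above — z statistic, P-value, and β at an alternative μ_1 — can be sketched with `statistics.NormalDist` (the helper name and the illustrative numbers in the test are ours):

```python
import math
from statistics import NormalDist

def ztest_right(xbar, mu0, sigma, n, alpha=0.05, mu1=None):
    """Right-tailed test of H0: mu = mu0 vs H1: mu > mu0.

    Returns (z_stat, p_value, beta), where beta = P(Type II error) at mu = mu1
    (None when no mu1 is given).
    """
    nd = NormalDist()
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    p = 1 - nd.cdf(z)                    # P(Z > z_stat)
    beta = None
    if mu1 is not None:
        z_a = nd.inv_cdf(1 - alpha)      # z_alpha
        # beta = P[Z <= z_alpha + (mu0 - mu1) / (sigma / sqrt(n))]
        beta = nd.cdf(z_a + (mu0 - mu1) / (sigma / math.sqrt(n)))
    return z, p, beta
```

For x̄ = 52, μ_0 = 50, σ = 10, n = 100 this gives z_stat = 2, P-value ≈ 0.0228, and at μ_1 = 52 a Type II error probability β ≈ 0.361.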
2. Test for a normal mean – unknown σ
Hypothesis pairs: the same three pairs on μ as above.
Statistic:
• X̄ → X̄_stat = x̄
• T = (X̄ − μ_0)/(S/√n) → T_stat (H_0 true → T ∼ T(n − 1))
Right-tailed: reject if T_stat > t_{(n−1),α}; P-value: P[T(n − 1) > T_stat]
• β = P(ET.2) at μ = μ_1 > μ_0: with c = μ_0 + t_{(n−1),α}·S/√n, β = P[X̄ < c | X̄ ∼ N(μ_1, σ²/n)]
Left-tailed: reject if T_stat < −t_{(n−1),α}; P-value: P[T(n − 1) < T_stat] = P[T(n − 1) > −T_stat]; β = ?
Two-tailed: reject if |T_stat| > t_{(n−1),α/2}; P-value: 2 × P[T(n − 1) > |T_stat|]; β = ?
Note: if n > 30, T(n − 1) ≈ N(0, 1); for n < 30 use Excel or R.
3. Test for a normal variance
Hypothesis pairs:
• {H_0: σ² = σ_0²; H_1: σ² > σ_0²} or {H_0: σ² ≤ σ_0²; H_1: σ² > σ_0²}
• {H_0: σ² = σ_0²; H_1: σ² < σ_0²} or {H_0: σ² ≥ σ_0²; H_1: σ² < σ_0²}
• {H_0: σ² = σ_0²; H_1: σ² ≠ σ_0²}
Statistic: χ² = (n − 1)S²/σ_0² → χ²_stat (H_0 true → χ² ∼ χ²(n − 1))
Right-tailed: reject if χ²_stat > χ²_{(n−1),α}; P-value: P[χ²(n − 1) > χ²_stat] (Excel, R)
Left-tailed: reject if χ²_stat < χ²_{(n−1),1−α}; P-value: P[χ²(n − 1) < χ²_stat]
Two-tailed: reject if χ²_stat > χ²_{(n−1),α/2} or χ²_stat < χ²_{(n−1),1−α/2}
• P-value: if s² > σ_0²: 2 × P[χ²(n − 1) > χ²_stat]; if s² < σ_0²: 2 × P[χ²(n − 1) < χ²_stat]
4. Test for a population proportion, n ≥ 100 (large sample) (slide 157)
Hypothesis pairs:
• {H_0: p = p_0; H_1: p > p_0} or {H_0: p ≤ p_0; H_1: p > p_0}
• {H_0: p = p_0; H_1: p < p_0} or {H_0: p ≥ p_0; H_1: p < p_0}
• {H_0: p = p_0; H_1: p ≠ p_0}
Statistic:
• p̂ → p̂_stat
• Z = (p̂ − p_0)/√(p_0(1 − p_0)/n) → z_stat (H_0 true → Z ∼ N(0, 1))
Right-tailed: reject if p̂ > p_0 + z_α·√(p_0(1 − p_0)/n) ⇔ z_stat > z_α; P-value: P(Z > z_stat)
• β = P(Error Type 2) at p = p_1 > p_0: with c = p_0 + z_α·√(p_0(1 − p_0)/n),
  β = P[p̂ ≤ c | p̂ ∼ N(p_1, p_1(1 − p_1)/n)]
Left-tailed: reject if p̂ < p_0 − z_α·√(p_0(1 − p_0)/n) ⇔ z_stat < −z_α; P-value: P(Z < z_stat) = P(Z > −z_stat)
• β at p = p_1 < p_0: with c = p_0 − z_α·√(p_0(1 − p_0)/n),
  β = P[p̂ ≥ c | p̂ ∼ N(p_1, p_1(1 − p_1)/n)]
Two-tailed: reject if |p̂ − p_0| > z_{α/2}·√(p_0(1 − p_0)/n) ⇔ |z_stat| > z_{α/2}; P-value: 2P(Z > |z_stat|); β = ?

5. Test for a population proportion, small sample
Hypothesis pairs: the same three pairs on p.
Statistic: freq. (the observed number of successes; under H_0, X ∼ B(n, p_0))
Right-tailed: reject H_0 if freq. ≥ crit., where P(X ≥ crit. | p = p_0) < α; P-value: P(X ≥ freq._stat)
Left-tailed: reject H_0 if freq. ≤ crit., where P(X ≤ crit. | p = p_0) < α; P-value: P(X ≤ freq._stat)
Two-tailed: reject H_0 if freq. ≥ crit_1 or freq. ≤ crit_2, where P(X ≥ crit_1 | p = p_0) < α/2 and P(X ≤ crit_2 | p = p_0) < α/2
Binomial B(n, p = p_0) table:
x: 0, 1, …, i, …, n − 1, n
P(X = x): C_n^0 p_0^0 (1 − p_0)^n; C_n^1 p_0^1 (1 − p_0)^(n−1); …; C_n^i p_0^i (1 − p_0)^(n−i); …; C_n^(n−1) p_0^(n−1) (1 − p_0)^1; C_n^n p_0^n (1 − p_0)^0
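The small-sample P-value is just a binomial tail sum, computable directly with `math.comb` (a sketch; the helper name and the illustrative numbers are ours):

```python
from math import comb

def binom_pvalue_right(n, p0, freq):
    """Right-tailed small-sample p-value: P(X >= freq) for X ~ B(n, p0)."""
    return sum(comb(n, x) * p0 ** x * (1 - p0) ** (n - x)
               for x in range(freq, n + 1))
```

For example, observing 8 successes in n = 10 trials under H_0: p = 0.5 gives P(X ≥ 8) = (45 + 10 + 1)/1024 ≈ 0.0547, so H_0 is not rejected at α = 0.05.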
Lecture 07
Inference for 2 means
X_1 ∼ N(μ_1, σ_1²), X_2 ∼ N(μ_2, σ_2²); H_0: μ_1 = μ_2

Paired sample? If yes → work with d = X_1 − X_2 and test μ_d; sample: n, d̄, s_d
• Statistic: T_stat = (d̄ − 0)/(s_d/√n)
• {H_0: μ_d = 0; H_1: μ_d > 0}: reject if T_stat > t_{(n−1),α}; P-value: P[T(n − 1) > T_stat]
• {H_0: μ_d = 0; H_1: μ_d < 0}: reject if T_stat < −t_{(n−1),α}; P-value: P[T(n − 1) < T_stat]
• {H_0: μ_d = 0; H_1: μ_d ≠ 0}: reject if |T_stat| > t_{(n−1),α/2}; P-value: 2 × P[T(n − 1) > |T_stat|]
If H_0 is rejected → C.I. for μ_1 − μ_2: d̄ ± t_{(n−1),α/2}·s_d/√n

Not paired: known σ_1², σ_2²?
If known:
• Statistic: Z = (X̄_1 − X̄_2)/√(σ_1²/n_1 + σ_2²/n_2) ∼ N(0, 1)
• H_1: μ_1 > μ_2: reject if z_stat > z_α; P-value: P(Z > z_stat)
• H_1: μ_1 < μ_2: reject if z_stat < −z_α; P-value: P(Z < z_stat)
• H_1: μ_1 ≠ μ_2: reject if |z_stat| > z_{α/2}; P-value: 2 × P(Z > |z_stat|)
If H_0 is rejected → C.I. for μ_1 − μ_2: …

If σ_1², σ_2² unknown, first test σ_1² = σ_2² (inference for 2 variances):
• Statistic: F_stat = S_1²/S_2²
• {H_0: σ_1² = σ_2²; H_1: σ_1² ≠ σ_2²}: reject if F_stat > f_{(n_1−1,n_2−1),α/2} or F_stat < f_{(n_1−1,n_2−1),1−α/2}
• H_1: σ_1² > σ_2²: reject if F_stat > f_{(n_1−1,n_2−1),α}
• H_1: σ_1² < σ_2²: reject if F_stat < f_{(n_1−1,n_2−1),1−α}
If H_0 is rejected → C.I. for σ_1²/σ_2²: …

If σ_1² = σ_2² (pooled t-test):
• Statistic: T = (X̄_1 − X̄_2)/√(S_p²/n_1 + S_p²/n_2) ∼ T(n_1 + n_2 − 2),
  where S_p² = [(n_1 − 1)S_1² + (n_2 − 1)S_2²]/(n_1 + n_2 − 2)
• H_1: μ_1 > μ_2: reject if T_stat > t_{(n_1+n_2−2),α}; P-value: P(T > T_stat)
• H_1: μ_1 < μ_2: reject if T_stat < −t_{(n_1+n_2−2),α}; P-value: P(T < T_stat)
• H_1: μ_1 ≠ μ_2: reject if |T_stat| > t_{(n_1+n_2−2),α/2}; P-value: 2 × P(T > |T_stat|)

If σ_1² ≠ σ_2²:
• Statistic: T = (X̄_1 − X̄_2)/√(S_1²/n_1 + S_2²/n_2) ∼ T(df),
  where df = (S_1²/n_1 + S_2²/n_2)² / [(S_1²/n_1)²/(n_1 − 1) + (S_2²/n_2)²/(n_2 − 1)]
• H_1: μ_1 > μ_2: reject if T_stat > t_{(df),α}; P-value: P(T > T_stat)
• H_1: μ_1 < μ_2: reject if T_stat < −t_{(df),α}; P-value: P(T < T_stat)
• H_1: μ_1 ≠ μ_2: reject if |T_stat| > t_{(df),α/2}; P-value: 2 × P(T > |T_stat|)

Inference for 2 proportions
• Statistic: Z = (p̂_1 − p̂_2)/√(p̄(1 − p̄)(1/n_1 + 1/n_2)) ∼ N(0, 1), where p̄ = (n_1·p̂_1 + n_2·p̂_2)/(n_1 + n_2)
• H_1: p_1 > p_2: reject if z_stat > z_α; P-value: P(Z > z_stat)
• H_1: p_1 < p_2: reject if z_stat < −z_α; P-value: P(Z < z_stat)
• H_1: p_1 ≠ p_2: reject if |z_stat| > z_{α/2}; P-value: 2 × P(Z > |z_stat|)
• C.I.: p_1 − p_2 ∈ (p̂_1 − p̂_2) ± z_{α/2}·√(p̂_1(1 − p̂_1)/n_1 + p̂_2(1 − p̂_2)/n_2)

Correlation test (formula and table)
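The two unknown-variance branches above reduce to two small formulas: the pooled statistic and the unequal-variance degrees of freedom. Both can be sketched directly (helper names and the illustrative numbers in the test are ours):

```python
import math

def pooled_t(x1bar, x2bar, s1sq, s2sq, n1, n2):
    """Pooled-variance t statistic, for the case sigma_1^2 = sigma_2^2."""
    sp2 = ((n1 - 1) * s1sq + (n2 - 1) * s2sq) / (n1 + n2 - 2)
    return (x1bar - x2bar) / math.sqrt(sp2 / n1 + sp2 / n2)

def unequal_var_df(s1sq, s2sq, n1, n2):
    """Degrees of freedom for the sigma_1^2 != sigma_2^2 case."""
    a, b = s1sq / n1, s2sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
```

As a sanity check, with equal variances and equal sample sizes (s_1² = s_2² = 4, n_1 = n_2 = 10) the df formula gives 18 = n_1 + n_2 − 2, so the two branches agree there.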
Lecture 08: ANOVA: to test for equality of means
1. One-factor ANOVA
2. Two-factor ANOVA without interaction
3. Two-factor ANOVA with interaction
Lecture 09
1. Chi-squared test
2. Independence test (easy and common)
It can be proved that
χ²_stat = n (Σ_i Σ_j F_ij² / (R_i C_j) − 1)
where F_ij are the observed cell frequencies and R_i, C_j are the row and column totals.
3. Rank test
Critical value
4. Normality test
Jarque–Bera test (easy and common)
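The shortcut form of χ²_stat equals the usual Σ(O − E)²/E with expected counts E_ij = R_i·C_j/n; a quick numeric check on an illustrative 2×2 table (the table values are ours):

```python
table = [[10, 20], [30, 40]]             # an illustrative 2x2 contingency table
n = sum(sum(row) for row in table)
R = [sum(row) for row in table]          # row totals R_i
C = [sum(col) for col in zip(*table)]    # column totals C_j

# shortcut form from the notes: n * (sum F_ij^2 / (R_i C_j) - 1)
chi2_short = n * (sum(table[i][j] ** 2 / (R[i] * C[j])
                      for i in range(2) for j in range(2)) - 1)
# standard form: sum (O - E)^2 / E with E_ij = R_i * C_j / n
chi2_std = sum((table[i][j] - R[i] * C[j] / n) ** 2 / (R[i] * C[j] / n)
               for i in range(2) for j in range(2))
```

Both forms give the same value (≈ 0.794 for this table), confirming the identity numerically.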