Harold's Statistics PDFs Cheat Sheet

advertisement
Harold’s Statistics
Probability Density Functions
“Cheat Sheet”
28 December 2015
PDF Selection Tree
Copyright © 2015 by Harold Toomey, WyzAnt Tutor
1
Discrete Probability Density Functions
Probability Density
Function (PDF)
Mean
Standard Deviation
Uniform Discrete Distribution
𝑃(𝑋 = π‘₯) =
1
𝑏−π‘Ž+1
Conditions
Variables
TI-84
Example
Online PDF Calculator
πœ‡=
π‘Ž+𝑏
2
(𝑏 − π‘Ž)2
σ=√
12
ο‚· All outcomes are consecutive.
ο‚· All outcomes are equally likely.
ο‚· Not common in nature.
a = minimum
b = maximum
NA
NA
http://www.danielsoper.com/statcalc3/calc.aspx?id=102
Copyright © 2015 by Harold Toomey, WyzAnt Tutor
2
Probability Density
Function (PDF)
Mean
Standard Deviation
Binomial Distribution
𝐡(π‘˜; 𝑛, 𝑝) =
𝒏
𝑷(𝑿 = π’Œ) = ( ) π’‘π’Œ (𝟏 − 𝒑)𝒏−π’Œ
π’Œ
where
𝑛
𝑛!
( ) = π‘›πΆπ‘˜ =
(𝑛 − π‘˜)! π‘˜!
π‘˜
1 (π‘˜−𝑛𝑝)2
1
1
−
𝑃(𝑋 = π‘˜) ≈
βˆ™
βˆ™ 𝑒 2 π‘›π‘π‘ž
π‘›π‘π‘ž
√2πœ‹
√
Conditions
Variables
TI-84
Example
Online PDF Calculator
𝝁𝒙 = 𝒏𝒑
𝝁𝒑̂ = 𝒑
𝛔𝒙 = √𝒏𝒑(𝟏 − 𝒑) = √π‘›π‘π‘ž
𝑛𝑝 ≥ 10 𝐚𝐧𝐝 π‘›π‘ž ≥ 10
𝛔𝒑̂ = √
𝒑(𝟏 − 𝒑)
π‘π‘ž
=√
𝒏
𝑛
Use for large 𝑛 (> 15) to approximate binomial distribution.
ο‚· n is fixed.
ο‚· The probabilities of success (𝑝) and failure (π‘ž) are constant.
ο‚· Each trial is independent.
n = fixed number of trials
p = probability that the designated event occurs on a given trial
(Symmetric if p = 0.5)
X = Total number of times the event occurs (0 ≤ 𝑋 ≤ 𝑛)
For one x value:
[2nd] [DISTR] A:binompdf(n,p,x)
For a range of x values [j,k]:
[2nd] [DISTR] A:binompdf( [ENTER] n, p, [↓] [↓] [ENTER] [STO>] [2nd]
[3] (=L3) [ENTER]
[2nd] [LIST] [→→ MATH] 5:sum(L3,j+1,k+1)
Larry’s batting average is .260. If he’s at bat four times, what is the
probability that he gets exactly two hits?
Solution:
n = 4, p = 0.26, x = 2
binompdf(4,.26,2) = 0.2221 = 22.2%
http://stattrek.com/online-calculator/binomial.aspx
Copyright © 2015 by Harold Toomey, WyzAnt Tutor
3
Probability Density
Function (PDF)
Mean
Standard Deviation
Geometric Distribution
𝑃(𝑋 ≤ π‘₯) = π‘ž π‘₯−1 𝑝 = (1 − 𝑝)π‘₯−1 𝑝
𝑃(𝑋 > π‘₯) = π‘ž π‘₯ = (1 − 𝑝)π‘₯
πœ‡ = 𝐸(𝑋) =
𝟏
𝒑
𝝈=
𝟏−𝒑
√π‘ž
=√ 𝟐
𝑝
𝒑
ο‚· A series of independent trials with the same probability of a given
event.
ο‚· Probability that it takes a specific amount of trials to get a success.
Conditions
Variables
TI-84
Can answer two questions.
a) Probability of getting 1st success on the π‘›π‘‘β„Ž trial
b) Probability of getting success on ≤ 𝑛 trials
Since we only count trials until the event occurs the first time, there is
no need to count the 𝑛𝐢π‘₯ arrangements, as in the binomial
distribution.
p = probability that the event occurs on a given trial
X = # of trials until the event occurs the 1st time
[2nd] [DISTR] E:geometricpdf(p, x)
Suppose that a car with a bad starter can be started 90% of the time
by turning on the ignition. What is the probability that it will take
three tries to get the car started?
Example
Online PDF Calculator
Solution:
p = 0.90, X = 3
geometpdf(0.9, 3) = 0.009 = 0.9%
http://www.calcul.com/show/calculator/geometric-distribution
Copyright © 2015 by Harold Toomey, WyzAnt Tutor
4
Probability Density
Function (PDF)
Mean
Standard Deviation
Poisson Distribution
𝑃(𝑋 = π‘₯) =
πœ†π‘₯ 𝑒 −πœ†
, π‘₯ = 0,1,2,3,4, …
π‘₯!
Conditions
Variables
TI-84
πœ‡ = 𝐸(𝑋) = πœ†
𝜎 = √πœ†
ο‚· Events occur independently, at some average rate per interval of
time/space.
πœ† = average rate
X = total number of times the event occurs
There is no upper limit on X.
[2nd] [DISTR] C:poissonpdf(πœ†, 𝑋)
Suppose that a household receives, on the average, 9.5 telemarketing
calls per week. We want to find the probability that the household
receives 6 calls this week.
Example
Online PDF Calculator
Bernoulli
tnomial
Hypergeometric
Negative Binomial
Solution:
πœ† = 9.5, X = 6
poissonpdf(9.5, 6) = 0.0764 = 7.64%
http://stattrek.com/online-calculator/poisson.aspx
See
http://www4.ncsu.edu/~swu6/documents/A-probability-andstatistics-cheatsheet.pdf
Copyright © 2015 by Harold Toomey, WyzAnt Tutor
5
Continuous Probability Density Functions
Probability Density
Function (PDF)
Mean
Standard Deviation
πœ‡=πœ‡
𝜎=𝜎
πœ‡=0
𝜎=1
Normal Distribution
(Gaussian) / Bell Curve
𝒩(π‘₯; πœ‡, 𝜎 2 ) =
𝛷(π‘₯) =
1
𝜎√2πœ‹
𝑒
1 π‘₯−πœ‡ 2
− (
)
2 𝜎
Special Case: Standard Normal
𝒩(π‘₯; 0, 1)
Conditions
Variables
TI-84
ο‚· Symmetric, unbounded, bell-shaped.
ο‚· No data is perfectly normal - instead, a distribution is approx.
normal.
πœ‡ = mean
𝜎 = standard deviation
x = observed value
Have boundaries, need area:
z-scores: [2nd] [DISTR] 2:normalcdf(left-boundary, right boundary)
x-scores: [2nd] [DISTR] 2:normalcdf(left-boundary, right boundary, πœ‡,
𝜎)
Have area, need boundary:
z-scores: [2nd] [DISTR] 3:invNorm(area to left)
x-scores: [2nd] [DISTR] 3:invNorm(area to left, πœ‡, 𝜎)
Suppose the mean score on the math SAT is 500 and the standard
deviation is 100. What proportion of test takers earn a score between
650 and 700?
Example
Online PDF Calculator
Solution:
left-boundary = 650, right boundary =700, μ = 500, σ = 100
normalcdf(650, 700, 500, 100) = 0.0441 = ~4.4%
http://davidmlane.com/normal.html
Copyright © 2015 by Harold Toomey, WyzAnt Tutor
6
Standard Normal Distribution Table: Positive Values (Right Tail) Only
Z
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
1.1
1.2
1.3
1.4
1.5
1.6
1.7
1.8
1.9
2.0
2.1
2.2
2.3
2.4
2.5
2.6
2.7
2.8
2.9
3.0
3.1
3.2
3.3
3.4
3.5
3.6
3.7
3.8
3.9
4.0
4.1
+0.00
0.50000
0.53980
0.57930
0.61791
0.65542
0.69146
0.72575
0.75804
0.78814
0.81594
0.84134
0.86433
0.88493
0.90320
0.91924
0.93319
0.94520
0.95543
0.96407
0.97128
0.97725
0.98214
0.98610
0.98928
0.99180
0.99379
0.99534
0.99653
0.99744
0.99813
0.99865
0.99903
0.99931
0.99952
0.99966
0.99977
0.99984
0.99989
0.99993
0.99995
0.99997
0.99998
+0.01
0.50399
0.54380
0.58317
0.62172
0.65910
0.69497
0.72907
0.76115
0.79103
0.81859
0.84375
0.86650
0.88686
0.90490
0.92073
0.93448
0.94630
0.95637
0.96485
0.97193
0.97778
0.98257
0.98645
0.98956
0.99202
0.99396
0.99547
0.99664
0.99752
0.99819
0.99869
0.99906
0.99934
0.99953
0.99968
0.99978
0.99985
0.99990
0.99993
0.99995
0.99997
0.99998
+0.02
0.50798
0.54776
0.58706
0.62552
0.66276
0.69847
0.73237
0.76424
0.79389
0.82121
0.84614
0.86864
0.88877
0.90658
0.92220
0.93574
0.94738
0.95728
0.96562
0.97257
0.97831
0.98300
0.98679
0.98983
0.99224
0.99413
0.99560
0.99674
0.99760
0.99825
0.99874
0.99910
0.99936
0.99955
0.99969
0.99978
0.99985
0.99990
0.99993
0.99996
0.99997
0.99998
+0.03
0.51197
0.55172
0.59095
0.62930
0.66640
0.70194
0.73565
0.76730
0.79673
0.82381
0.84849
0.87076
0.89065
0.90824
0.92364
0.93699
0.94845
0.95818
0.96638
0.97320
0.97882
0.98341
0.98713
0.99010
0.99245
0.99430
0.99573
0.99683
0.99767
0.99831
0.99878
0.99913
0.99938
0.99957
0.99970
0.99979
0.99986
0.99990
0.99994
0.99996
0.99997
0.99998
+0.04
0.51595
0.55567
0.59483
0.63307
0.67003
0.70540
0.73891
0.77035
0.79955
0.82639
0.85083
0.87286
0.89251
0.90988
0.92507
0.93822
0.94950
0.95907
0.96712
0.97381
0.97932
0.98382
0.98745
0.99036
0.99266
0.99446
0.99585
0.99693
0.99774
0.99836
0.99882
0.99916
0.99940
0.99958
0.99971
0.99980
0.99986
0.99991
0.99994
0.99996
0.99997
0.99998
Copyright © 2015 by Harold Toomey, WyzAnt Tutor
+0.05
0.51994
0.55966
0.59871
0.63683
0.67364
0.70884
0.74215
0.77337
0.80234
0.82894
0.85314
0.87493
0.89435
0.91149
0.92647
0.93943
0.95053
0.95994
0.96784
0.97441
0.97982
0.98422
0.98778
0.99061
0.99286
0.99461
0.99598
0.99702
0.99781
0.99841
0.99886
0.99918
0.99942
0.99960
0.99972
0.99981
0.99987
0.99991
0.99994
0.99996
0.99997
0.99998
+0.06
0.52392
0.56360
0.60257
0.64058
0.67724
0.71226
0.74537
0.77637
0.80511
0.83147
0.85543
0.87698
0.89617
0.91308
0.92785
0.94062
0.95154
0.96080
0.96856
0.97500
0.98030
0.98461
0.98809
0.99086
0.99305
0.99477
0.99609
0.99711
0.99788
0.99846
0.99889
0.99921
0.99944
0.99961
0.99973
0.99981
0.99987
0.99992
0.99994
0.99996
0.99998
0.99998
+0.07
0.52790
0.56749
0.60642
0.64431
0.68082
0.71566
0.74857
0.77935
0.80785
0.83398
0.85769
0.87900
0.89796
0.91466
0.92922
0.94179
0.95254
0.96164
0.96926
0.97558
0.98077
0.98500
0.98840
0.99111
0.99324
0.99492
0.99621
0.99720
0.99795
0.99851
0.99893
0.99924
0.99946
0.99962
0.99974
0.99982
0.99988
0.99992
0.99995
0.99996
0.99998
0.99998
+0.08
0.53188
0.57142
0.61026
0.64803
0.68439
0.71904
0.75175
0.78230
0.81057
0.83646
0.85993
0.88100
0.89973
0.91621
0.93056
0.94295
0.95352
0.96246
0.96995
0.97615
0.98124
0.98537
0.98870
0.99134
0.99343
0.99506
0.99632
0.99728
0.99801
0.99856
0.99896
0.99926
0.99948
0.99964
0.99975
0.99983
0.99988
0.99992
0.99995
0.99997
0.99998
0.99999
+0.09
0.53586
0.57535
0.61409
0.65173
0.68793
0.72240
0.75490
0.78524
0.81327
0.83891
0.86214
0.88298
0.90147
0.91774
0.93189
0.94408
0.95449
0.96327
0.97062
0.97670
0.98169
0.98574
0.98899
0.99158
0.99361
0.99520
0.99643
0.99736
0.99807
0.99861
0.99900
0.99929
0.99950
0.99965
0.99976
0.99983
0.99989
0.99992
0.99995
0.99997
0.99998
0.99999
7
Probability Density
Function (PDF)
Mean
Standard Deviation
Student’s t Distribution
This distribution was first studied by William Gosset, who published
under the pseudonym Student.
𝜈 = df = degrees of freedom = 𝑛 − 1
Degrees of Freedom
A positive whole number that indicates the lack of restrictions in our
calculations. The number of values in a calculation that we can vary.
e.g. 𝜈 = 𝑑𝑓 = 1 means 1 equation 2 unknowns
lim 𝑑𝑝𝑑𝑓(π‘₯, 𝜈) = π‘›π‘œπ‘Ÿπ‘šπ‘Žπ‘™π‘π‘‘π‘“(π‘₯)
𝜈→∞
−(𝜈+1)
𝜈+1
2
2
𝛀(
)
π‘₯
2
𝑃(𝜈) =
+
(1
)
𝜈
𝜈
√πœˆπœ‹ 𝛀 (2)
Where the Gamma function
∞
𝛀(𝑠) = ∫ 𝑑 𝑠−1 𝑒 −𝑑 𝑑𝑑
πœ‡ = 𝐸(π‘₯) = 0 (always)
𝜎=0
0
𝛀(𝑛) = (𝑛 − 1)!
1
𝛀 ( ) = √πœ‹
2
Conditions
Variables
TI-84
Example
Online PDF Calculator
ο‚· Used for inference about means.
ο‚· Are typically used with small sample sizes or when the population
standard deviation isn’t known.
ο‚· Similar in shape to normal.
x = observed value
𝜈 = df = degrees of freedom = 𝑛 − 1
[2nd] [DISTR] 5:tpdf(x, 𝜈)
tpdf() is only used to graph the function.
http://stattrek.com/online-calculator/t-distribution.aspx
Copyright © 2015 by Harold Toomey, WyzAnt Tutor
8
Probability Density
Function (PDF)
Mean
Standard Deviation
Chi-Square Distribution
πœ’ 2 (π‘₯, π‘˜) =
1
π‘˜
22
π‘˜
𝛀( )
2
π‘˜
π‘₯ 2 −1 𝑒
Conditions
Variables
TI-84
Example
Online PDF Calculator
Uniform
Log-Normal
Multivariate Normal
F
Exponential
Gamma
Inverse Gamma
Dirichlet
Beta
Weibull
Pareto
−π‘₯
2
Skewed-right (above) have fewer values to the right, and median <
mean.
πœ‡ = 𝐸(𝑋) = π‘˜
𝜎 2 = √2π‘˜
π‘˜+1
( 2 )
π‘˜+1 2
πœ‡ = √2𝛀
2𝛀
(
)
π‘˜
2
𝛀 (2 )
𝜎2 = π‘˜ −
π‘˜ 2
𝛀 (2 )
Mode = √π‘˜ − 1
ο‚· Used for inference about categorical distributions.
ο‚· Used when we want to test the independence, homogeneity, and
"goodness of fit” to a distribution.
ο‚· Used for counted data.
x = observed value
𝜈 = df = degrees of freedom = 𝑛 − 1
[2nd] [DISTR] 7: πœ’ 2 pdf(x, 𝜈)
πœ’ 2 pdf() is only used to graph the function.
http://calculator.tutorvista.com/chi-square-calculator.html
See
http://www4.ncsu.edu/~swu6/documents/A-probability-andstatistics-cheatsheet.pdf
Copyright © 2015 by Harold Toomey, WyzAnt Tutor
9
Continuous Probability Distribution Functions
Cumulative Distribution
Function (CDF)
Mean
Standard Deviation
π‘₯
𝑃(𝑋 ≤ π‘₯) = ∫ 𝑓(π‘₯) 𝑑π‘₯
∞
If 𝑓(π‘₯) = 𝛷(π‘₯) (the Normal PDF), then no exact solution is known.
Use z-tables or web calculator (http://davidmlane.com/normal.html).
−∞
∫ 𝑓(π‘₯) 𝑑π‘₯ = 1
−∞
Integral of PDF = CDF (Distribution)
The area under the curve is always equal to exactly 1 (100%
probability).
π‘₯
𝐹(π‘₯) = ∫ 𝑓(π‘₯) 𝑑π‘₯
−∞
Derivative of CDF = PDF (Density)
𝑓(π‘₯) =
𝑏
Expected Value
𝑑𝐹(π‘₯)
𝑑π‘₯
Use the density function 𝑓(π‘₯), not
the distribution function 𝐹(π‘₯), to
calculate 𝐸(𝑋), π‘‰π‘Žπ‘Ÿ(𝑋) and 𝜎(𝑋).
𝐸(𝑋) = ∫ π‘₯ 𝑓(π‘₯) 𝑑π‘₯
π‘Ž
𝑏
Needed to calculate Variance
𝐸(𝑋 2 ) = ∫ π‘₯ 2 𝑓(π‘₯) 𝑑π‘₯
π‘Ž
Variance
Standard Deviation
Copyright © 2015 by Harold Toomey, WyzAnt Tutor
π‘‰π‘Žπ‘Ÿ(𝑋) = 𝐸(𝑋 2 ) − 𝐸(𝑋)2
𝜎(𝑋) = √π‘‰π‘Žπ‘Ÿ(𝑋)
10
Discrete Distributions
http://www4.ncsu.edu/~swu6/documents/A-probability-and-statistics-cheatsheet.pdf
Copyright © 2015 by Harold Toomey, WyzAnt Tutor
11
Continuous Distributions
Copyright © 2015 by Harold Toomey, WyzAnt Tutor
12
Download
Study collections