
ST2334 Summary Notes

Chapter 3: Joint Distributions
Properties
Range Space of (X,Y)
Discrete two-dimensional RV: the number of possible values of (X(s), Y(s)) is finite or countable.
Continuous two-dimensional RV: the possible values of (X(s), Y(s)) can assume any value in some region of the Euclidean space R2. E.g. time.
Marginal Probability Distribution
Discrete RV: fX(x) = Σy f(x, y); fY(y) = Σx f(x, y)
Joint Probability Function
Discrete RV
fX,Y(x, y) = P(X = x, Y = y)
Continuous RV
From f(x, y):
To get fX(x), integrate y away: fX(x) = ∫ f(x, y) dy
To get fY(y), integrate x away: fY(y) = ∫ f(x, y) dx
Properties
Conditional Probability Function
Continuous RV
Independent Random Variable
X and Y are independent if and only if fX,Y(x, y) = fX(x) fY(y) for all (x, y) with fX(x) > 0 and fY(y) > 0.
Conclusion: if RX,Y is not a product space, then X and Y are not independent!
Expectation: for any two-variable function g(x, y)
Discrete RV: E[g(X, Y)] = Σx Σy g(x, y) fX,Y(x, y)
Examples
DISCRETE: Find E(Y-X) & Cov(X,Y)
Method 1:
Continuous RV: E[g(X, Y)] = ∫∫ g(x, y) fX,Y(x, y) dx dy
Remember: for E(XY) we must multiply by xy, e.g. ∫∫ xy f(x, y) dx dy;
for E(X) we multiply by x!
Covariance
General: Cov(X, Y) = E[(X − μX)(Y − μY)]
Method 2:
E(Y −X) = E(Y)−E(X) = 1.5−0.5 = 1
E(Y) = 0·(1/8)+1·(3/8)+2·(3/8)+3·(1/8) = 1.5
E(X) = 0·(1/2)+1·(1/2) = 0.5
Cov: E(XY) = (0)(0)(1/8)+(0)(1)(1/4)+(0)(2)(1/8) +...+(1)(3)(1/8) = 1.
cov(X,Y) = E(XY)−E(X)E(Y) = 1−(0.5)(1.5) = 0.25
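The E(XY) − E(X)E(Y) recipe above can be sketched as a small helper that walks a joint probability table. The table used here is a hypothetical 2×2 example, not the one from the worked problem above.

```python
# Covariance from a joint probability table (a sketch; the joint
# table below is hypothetical, not the one from the example above).
def cov_from_joint(joint):
    """joint: dict mapping (x, y) -> P(X = x, Y = y)."""
    ex = sum(x * p for (x, y), p in joint.items())    # E(X)
    ey = sum(y * p for (x, y), p in joint.items())    # E(Y)
    exy = sum(x * y * p for (x, y), p in joint.items())  # E(XY)
    return exy - ex * ey  # Cov(X, Y) = E(XY) - E(X)E(Y)

joint = {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.4}
print(cov_from_joint(joint))  # 0.4 - (0.5)(0.7) = 0.05
```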
CONTINUOUS: Find Cov(X, Y) given the pf:
Cov(X, Y) = E(XY) − E(X)E(Y)
Discrete RV
Continuous RV
Properties
(1) cov(X,Y)=E(XY)−E(X)E(Y).
(2) If X and Y are independent, then cov(X,Y) = 0. However, cov(X,Y) = 0
does not imply that X and Y are independent.
(3) cov(aX +b,cY +d) = ac·cov(X,Y).
(4) V(aX +bY) = a2V(X)+b2V(Y)+2ab·cov(X,Y).
V(X + Y) = V(X) + V(Y) + 2Cov(X, Y); if X, Y are independent, Cov = 0, so V(X ± Y) = V(X) + V(Y)!
Chapter 4: Special Probability Distribution
Discrete Uniform Distribution
If RV X assumes the values x1,x2,...,xk with equal probability, then X follows a
discrete uniform distribution.
p.f. f(x) = 1/k, for x = x1, x2, ..., xk; 0 otherwise
Mean: μ = (1/k) Σ xi
Variance: σ2 = (1/k) Σ xi2 − μ2
II) Negative Binomial Distribution
X = number of trials until the kth success occurs. X ∼ NB(k, p), where p is the probability of success.
p.f. f(x) = (x−1 choose k−1) p^k (1−p)^(x−k), for x = k, k+1, k+2, ...
Mean & Variance
E(X) = k/p
V(X) = (1−p)k/p2
Can also use V(X) = E(X2) − (E(X))2
Bernoulli Trial
random experiment with only two possible outcomes; success or failure
p.f.
Mean & Variance
When n = 1, the pf for binom RV X:
fX(x)=px(1−p)1−x, for x = 0,1.
Bernoulli Process consists of a sequence of repeatedly performed
independent and identical Bernoulli trials
Distributions derived from the Bernoulli trial & process: (1) Binomial Distribution (2) Negative Binomial Distribution (and its special case, the Geometric Distribution) (3) Poisson Distribution
I) Binomial Distribution
Binomial random variable counts the number of successes in n trials in a
Bernoulli Process. X~B(n, p) where p is probability of success.
The probability of getting exactly x successes is:
f(x) = (n choose x) p^x (1−p)^(n−x), for x = 0, 1, ..., n
Geometric Distribution: Special Case of NB
X = number of i.i.d. Bernoulli(p) trials until the first success occurs; then X follows a geometric distribution, denoted by X ∼ G(p)
p.f. f(x) = (1−p)^(x−1) p, for x = 1, 2, 3, ...
Mean & Variance
E(X) = 1/p
V(X) = (1-p)/p2
III) Poisson Distribution
X denotes the number of events occurring in a fixed period of time or fixed
region. X ∼ Poisson(λ) where λ is the expected number of occurrences
during the given period/region.
p.f. f(x) = e^(−λ) λ^x / x!, for x = 0, 1, 2, ...
Mean & Variance
E(X) = λ
V(X) = λ
Poisson Process: continuous-time process; the expected number of occurrences in an interval of length T is αT, and the count in that interval follows a Poisson(αT) distribution.
Poisson Approx of Binomial Distribution
Let X ∼ B(n, p). Suppose that n → ∞ and p → 0 in such a way that λ = np remains constant. Then approximately, X ∼ Poisson(np).
The approximation is good when n ≥ 20 and p ≤ 0.05, or if n ≥ 100 and np ≤ 10.
E(X) = np | V(X) = np(1-p)
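A quick numeric sketch of the approximation above, comparing the exact binomial pmf with the Poisson pmf for illustration values n = 100, p = 0.02 (so λ = np = 2):

```python
import math

# Compare the exact Binomial(n, p) pmf with its Poisson(np)
# approximation; n = 100, p = 0.02 are illustration values.
def binom_pmf(x, n, p):
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def pois_pmf(x, lam):
    return math.exp(-lam) * lam**x / math.factorial(x)

n, p = 100, 0.02
for x in range(5):
    print(x, round(binom_pmf(x, n, p), 4), round(pois_pmf(x, n * p), 4))
```

The two columns agree to about two decimal places, as the n ≥ 20, p ≤ 0.05 rule of thumb suggests.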
Continuous Distribution
Continuous Uniform Distribution X ∼ U(a,b)
Random variable X is said to follow a uniform distribution over the interval
(a, b) if its probability density function is given by:
p.f. f(x) = 1/(b−a), for a ≤ x ≤ b; 0 otherwise
Mean & Variance
E(X) = (a+b) / 2
V(X) = (b-a)2/12
“No-Memory” of Exp
Suppose that X has an exponential distribution with parameter λ > 0. Then
for any two positive numbers s and t, we have
P(X > s + t | X > s) = P(X > t)
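The no-memory property can be checked numerically with the survival function P(X > x) = e^(−λx); the values λ = 0.5, s = 9, t = 1 below are arbitrary illustrations:

```python
import math

# Numeric check of the memoryless property for X ~ Exp(lam).
# lam, s, t are arbitrary illustration values.
lam, s, t = 0.5, 9.0, 1.0
surv = lambda x: math.exp(-lam * x)   # P(X > x)
lhs = surv(s + t) / surv(s)           # P(X > s+t | X > s)
rhs = surv(t)                         # P(X > t)
print(lhs, rhs)  # both equal exp(-0.5)
```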
Normal Distribution X ∼ N(μ,σ2)
RV X is said to follow a normal distribution with parameters μ and σ2 if its
probability density function is given by
p.f. f(x) = (1/(σ√(2π))) e^(−(x−μ)2/(2σ2)), for −∞ < x < ∞
Mean & Variance
E(X) = μ
V(X) = σ2
Quantile
The αth (upper) quantile (0 ≤ α ≤ 1) of the RV X is the number xα that satisfies P(X ≥ xα) = α.
Exponential Distribution X~Exp(λ)
RV X is said to follow an exponential distribution with parameter λ > 0 if its
p.d.f. is given by
p.f. f(x) = λe^(−λx), for x > 0; 0 otherwise
Mean & Variance
E(X) = 1/ λ
V(X) = 1/ λ2
Alternative p.f. (parametrized by the mean μ = 1/λ): f(x) = (1/μ) e^(−x/μ), for x > 0
Aka if E(T) = 5, then λ = 1/5
Mean & Variance
E(X) = μ
V(X) = μ2
We denote by zα the αth (upper) quantile
(or 100α percentage point) of Z ∼ N(0,1)
P(Z ≥ zα ) = α .
zα is the value on the x-axis to the right of which the area under the curve equals α
e.g. z0.05 = 1.645, z0.01 = 2.326
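The quantiles z0.05 and z0.01 quoted above can be recovered from scratch by bisecting on the N(0,1) CDF, Φ(z) = 0.5(1 + erf(z/√2)); a minimal sketch:

```python
import math

# Recover the upper quantile z_alpha of N(0,1) by bisection on the CDF.
def phi(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def z_upper(alpha, lo=-10.0, hi=10.0):
    # Find z with P(Z >= z) = alpha, i.e. phi(z) = 1 - alpha.
    for _ in range(100):
        mid = (lo + hi) / 2
        if phi(mid) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(z_upper(0.05), 3), round(z_upper(0.01), 3))  # 1.645 2.326
```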
Normal Approx To Binomial Distribution
X ∼ B(n, p), so that E(X) = np and V(X) = np(1−p). Then as n → ∞, with p held constant, we can use the normal distribution to approximate the binomial distribution.
Rule of thumb, use normal when: np > 5 and n(1−p) > 5. Apply a continuity correction: P(a ≤ X ≤ b) ≈ P(a − 0.5 < Y < b + 0.5), where Y is the approximating normal RV.
Recall: when n → ∞, p → 0, and np remains a constant, we can use Poisson
distribution to approximate the binomial distribution.
Chapter 5: Sampling & Sampling Distribution
Population: The totality of all possible outcomes or
observations of a survey or experiment
- Finite population: finite number of elements
- Infinite population: consists of an infinitely (countably or uncountably) large number of elements, e.g. the results of all possible rolls of a pair of dice; the depths at all conceivable positions of a lake
Sample Solving
Other Distributions (χ2, t, and F)
Examples of distributions that are derived from
random samples from a normal distribution.
Simple Random Sample
A simple random sample of n members is a sample chosen such that every subset of n observations of the population has the same probability of being selected.
Sample Mean: X̄ = (1/n) Σ Xi
Sample Variance: S2 = (1/(n−1)) Σ (Xi − X̄)2
χ2 Distribution
If Z is a standard normal variable, a random variable with the same distribution as Z2 is called a χ2 random variable with one degree of freedom.
We denote a χ2 random variable with n deg of freedom by χ2(n).
Y ∼ χ2(n): P(Y > χ2(n; α)) = α.
If S2 is the variance of a random sample of size n taken from a normal population having the variance σ2, then the random variable (n−1)S2/σ2 has a χ2 distribution with n − 1 deg of freedom.
Mean & Variance of X̄: E(X̄) = μ, V(X̄) = σ2/n
Standard Error: the SD of the sample mean, denoted σX̄ = σ/√n, instead of σ which is for the population
Central Limit Theorem: for large n, X̄ is approximately N(μ, σ2/n), whatever the population distribution
E(Y) = deg of freedom, Var(Y) = 2 × deg of freedom; for large n, χ2(n) is approximately N(n, 2n)
Example: 6 random samples are drawn from a N(μ, 4) population. Define the sample variance S2. Find c such that P(S2 > c) = 0.05.
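A worked sketch of this example, using the standard upper quantile χ2(5; 0.05) ≈ 11.07 (from a χ2 table, or qchisq(0.05, 5, lower.tail = FALSE) in R):

```latex
\frac{(n-1)S^2}{\sigma^2} = \frac{5S^2}{4} \sim \chi^2(5)
\;\Rightarrow\;
P(S^2 > c) = P\!\left(\chi^2(5) > \tfrac{5c}{4}\right) = 0.05
\;\Rightarrow\;
\tfrac{5c}{4} = \chi^2(5;\,0.05) \approx 11.07,
\quad c \approx 8.86.
```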
T Distribution
Suppose Z ∼ N(0,1) and U ∼ χ2(n). If Z and U are independent, then T = Z / √(U/n) follows the t-distribution with n deg of freedom.
The t-distribution approaches N(0,1) as the parameter n → ∞. When n ≥ 30, we can replace it by N(0,1).
If T ∼ t(n), then E(T) = 0 and Var(T) = n/(n−2) for n > 2.
T ∼ t(n): P(T > t(n; α)) = α.
F Distribution
Suppose U ∼ χ2(m) and V ∼ χ2(n) are independent. Then the random variable F = (U/m) / (V/n) follows an F-distribution with (m, n) deg of freedom.
For X ∼ F(m, n), E(X) = n/(n−2) for n > 2.
P(F > F(m, n; α)) = α.
Example: Find the probability that a random sample of 25 observations, from a normal population with variance σ2 = 6, will have a variance S2 (a) greater than 9.1. Here (n−1)S2/σ2 = 24S2/6 ∼ χ2(24), i.e. n − 1 degrees of freedom.
Example: If S12 and S22 represent the variances of independent random samples of size n1 = 8 and n2 = 12, taken from normal populations with equal variances, then S12/S22 ∼ F(7, 11); pf(4.89, 7, 11) = 0.9900295.
Chapter 6: Estimation
Two types of statistical inference methods
1. Estimation of population parameters (Chap 6)
2. Testing hypotheses about the parameter values (Chap 7)
Unbiased Estimator
An estimator θ̂ is unbiased for θ when E(θ̂) = θ, where θ can be p, μ, or σ2.
S2 is an unbiased estimator of σ2 since E(S2) = σ2.
Summary of Test Statistic, Max Error Of Estimate & Sample Size
We say that the interval X̄ ± E has probability (1 − α) of containing μ.
Comparing Two Populations
Independent, Known & Unequal Var
- Pop var known & NOT equal
- Two populations are normally
distributed or both n >= 30
Independent, Small & Equal Var
- Pop var unknown & the same
- Both n < 30 & norm distri.
Confidence Interval
This “fairly certain” can be quantified by the degree of confidence
also known as confidence level (1 − α ), in the sense that : P(a < μ <
b) = 1 − α .
(a, b) is called the (1 − α ) confidence interval.
where t has n1 + n2 − 2 deg of freedom
Independent, Large WIth
Unknown Variances
- Pop var unknown & NOT Equal
- Both n >= 30
Independent, Large & Equal Var
- Pop var unknown & the same
- Both n >= 30 & norm distri.
Paired Data
Rejection Region
H1 : μ ≠ μ0
z < −zα/2 or z > zα/2
t < −tn−1,α/2 or t > tn−1,α/2
Rejection Region Using p Value
• If p-value < α, reject H0; else
• If p-value ≥ α, do not reject H0.
Where d = difference
Chapter 7: Hypothesis Testing
Both the null & alternative hypotheses are statements about a population; in this chapter they concern the mean of a population.
H1 : μ < μ0
z < −zα.
t < −tn−1,α .
H1 : μ > μ0
z > z α.
t > tn−1,α
Tests Comparing Mean: Independent Samples
H0 : μ1 − μ2 = δ0
Known Population Var
- Normal distribution OR n1 & n2 >= 30
Test Statistic
Unknown Population Var
- n1 & n2 >= 30
Test Statistic
Type I vs Type II Error
The rejection of H0 when H0 is true is called a Type I error
Not rejecting H0 when H0 is false is called a Type II error
-
The probability of making a Type I error is called the level of
significance, denoted by α
We define 1 − β = P(Reject H0 | H0 is false) to be the power of the test, where β = P(Type II error).
Test Statistic
Known Variance
- Pop variance is known
- N sufficiently large, >= 30
Unknown Variance
- Pop variance unknown
- Distribution normal
Rejection Region
Unknown But Equal
Pop variances unknown but equal; normal dist; n1 & n2 < 30
Pooled variance: Sp2 = ((n1 − 1)S12 + (n2 − 1)S22) / (n1 + n2 − 2)
Test against t(n1 + n2 − 2)
Paired Data
Di = Xi −Yi
For the null hypothesis H0 : μD = μD0
Test Statistic
If n < 30 & pop is normally
distributed: T ∼tn−1.
If n >= 30, T ∼N(0,1)
Pop mean diff is usually 0!
Using R Functions
Revision Questions
Joint Distributions
Tutorial 5
Q6
Tutorial 6
Q1
Suppose that X and Y are RV having the joint probability function.
a) Find E(Y|X=2)
Thus E(Y|X = 2) = 1(1/4)+3(2/4)+5(1/4) = 3.
Alternatively, since X and Y are independent, E (Y |X = 2) = E (Y ) = 1(0.25) +
3(0.5) + 5(0.25) = 3.
b) Find E(XY)
E(XY)=(2)(1)(0.10)+(2)(3)(0.10)+(2)(5)(0.10)
+(4)(1)(0.10)+(4)(3)(0.10)+(4)(5)(0.10)= 9.6.
Q2


Are X & Y independent? Find fX(x) by integrating over y, fY(y) by integrating over x, then check whether fX(x) fY(y) = f(x, y)!
Find Cov(X,Y) & V(X+Y)
A fast food restaurant operates a drive-up facility and a walk-up window. On
a randomly selected day, let X = proportion of time that the drive-up facility
is in use (at least one customer is being served or waiting to be served) and Y
= the proportion of the time that the walk-up window is in use. Suppose that
the joint probability density function of (X,Y) is given by
(iii) Given that the drive-up facility is busy 80% of the time, what is the
probability that the walk-in facility is busy at most half the time?
(iv) Given that the drive-up facility is busy 80% of the time, what is the
expected proportion of time that the walk-in facility is busy?
Sampling Probability Distribution
Tutorial 7
1. A box contains 2 red marbles and 98 blue ones. Draws are made at
random with replacement. In n draws from the box, there is better than a
50% chance for a red marble to appear at least once. What is the smallest
possible value for n?
Q7: A company rents time on a computer for periods of t hours, for which it
receives $600 an hour. The number of times the computer breaks down
during t hours is a random variable having the Poisson distribution with λ =
0.8t, and if the computer breaks down x times during t hours, it costs 50x2
dollars to fix it. How should the company select t in order to maximize its
expected profit?
Q1 solution: P(X = 0) = (1 − 0.02)^n ≤ 0.5
n ≥ log(0.5)/log(0.98) = 34.31
n = 35
Using trial and error: pbinom(0, 35, 0.02, lower.tail=FALSE) = 0.5069254. n =
35
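The same trial-and-error search can be sketched without R, looping until P(at least one red) = 1 − 0.98^n exceeds 0.5:

```python
import math

# Smallest n with P(at least one red in n draws) > 0.5,
# i.e. 1 - 0.98^n > 0.5.
n = 1
while 1 - 0.98**n <= 0.5:
    n += 1
print(n, round(math.log(0.5) / math.log(0.98), 2))  # 35 34.31
```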
Q4
Three people toss a fair coin and the odd man pays for coffee. If the coins all
turn up the same, they are tossed again. Find the probability that fewer than
4 tosses are needed.
P(no odd man in a round) = P(HHH) + P(TTT) = 2 × (1/2)^3 = 1/4, so P(odd man) = 3/4; the number of rounds needed is X ∼ G(3/4)
P(X < 4) = 3/4 + (1/4)(3/4) + (1/4)^2(3/4) = 63/64
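The geometric sum above can be verified exactly with fractions:

```python
from fractions import Fraction

# Each round: P(all three coins match) = 1/4, P(odd man) = 3/4,
# so the number of rounds needed is geometric with p = 3/4.
q = Fraction(1, 4)
p = 1 - q
p_less_than_4 = sum(q**k * p for k in range(3))  # X = 1, 2, or 3
print(p_less_than_4)  # 63/64
```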
6. A notice is sent to all owners of a certain type of automobile, asking them
to bring their cars to a dealer to check for the presence of a particular type
of defect. Suppose that only 0.05% of the cars have the defect. Consider a
random sample of 10,000 cars.
(b) What is the (approximate) probability that at least 10 sampled cars have
the defect?
Use the Poisson approximation since n is large and p is small: λ = np = 10000 × 0.0005 = 5
P(X ≥ 10) = ppois(9, 5, lower.tail = F) ≈ 0.0318
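The ppois tail above is just one minus a sum of ten Poisson pmf terms, which can be sketched directly:

```python
import math

# P(X >= 10) for X ~ Poisson(5), i.e. ppois(9, 5, lower.tail = F).
lam = 5
tail = 1 - sum(math.exp(-lam) * lam**k / math.factorial(k)
               for k in range(10))
print(round(tail, 4))  # 0.0318
```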
Tutorial 8
3. The time (in hours) required to repair a machine is an exponentially
distributed random variable with parameter λ = 1/2. What is the conditional
probability that a repair takes at least 10 hours, given that its duration
exceeds 9 hours?
X~Exp(1/2)
P(X >= 10 | X > 9) = P(X > 1) by the memoryless property of the exponential distribution. pexp(1, 1/2, lower.tail = F) = 0.6065307
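The pexp value is just the exponential survival function e^(−λt) at t = 1:

```python
import math

# X ~ Exp(1/2): P(X >= 10 | X > 9) = P(X > 1) = exp(-1/2).
lam = 0.5
p = math.exp(-lam * 1)  # survival function at t = 1
print(round(p, 7))  # 0.6065307
```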
Q8: A coin is tossed 400 times, Use the normal approximation to find the
probability of obtaining between 185 and 210 heads inclusive.
Y = number of head in 400 tosses of a coin
𝑌~𝐵𝑖𝑛𝑜𝑚𝑖𝑎𝑙(𝑛=400, 𝑝=0.5)
𝐸(𝑌) = 𝑛𝑝 = 400(0.5) = 200. 𝑉(𝑌) = 𝑛𝑝(1 − 𝑝) = 400(0.5)(0.5) = 100
Y~N(200, 10^2)
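The normal-approximation probability (with continuity correction for "between 185 and 210 inclusive") can be sketched with the CDF Φ(z) = 0.5(1 + erf(z/√2)):

```python
import math

# P(185 <= Y <= 210) ≈ P(184.5 < N(200, 10^2) < 210.5)
# using the continuity correction.
def phi(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sd = 200, 10
p = phi((210.5 - mu) / sd) - phi((184.5 - mu) / sd)
print(round(p, 4))  # 0.7926
```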
Estimation
Tutorial 9
Q6: Let X be a binomial random variable with parameters n and p.
Hypothesis Testing
Q4: A normal population with unknown variance has a mean of 20. Is one
likely to obtain a random sample of size 9 from this population with a
standard deviation being 4.1 and a mean being larger than or equal to 24? If
not, what conclusion would you draw?
(a) E(U) = E(X)/n = np/n = p. Since E(U) = p, U is an unbiased estimator of p.
Since E(V) ≠ p, V is a biased estimator of p!
So we conclude that μ > 20, since this probability is very small, showing that it is very unlikely to get a sample mean of 24 or more if the population mean were really 20.
Tutorial 10
Suppose we wish to test the hypothesis H0 : μ = 2 vs H1 : μ ̸= 2
and found a two-sided p-value of 0.03. Separately, a 95% confidence
interval for μ is computed to be (1.5, 4.0). Are these two results
compatible? Why or why not?
SOLUTION
No; a p-value of 0.03 suggests that we will reject the null hypothesis at 0.05
level. On the other hand, if the 95% CI contains the null value of 2, then we
should not reject the null hypothesis at 0.05 level. So these two statements
are not compatible.
Other Questions
Population of fish; Mean 54, SD 4.5mm
A random sample of four fish is chosen from the population. Find the
probability that all four fish are between 51 and 60 mm long.
Continuing from the previous part, find the probability that the mean length
of the four fish in the sample is between 51 and 60 mm long.
Must divide by √n here! For the sample mean, SD(X̄) = σ/√n = 4.5/√4 = 2.25.
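Both parts can be sketched with the normal CDF: one fish uses σ = 4.5, while the sample mean of n = 4 fish uses σ/√n = 2.25:

```python
import math

# Fish lengths ~ N(54, 4.5^2).
# (a) P(all four fish between 51 and 60) = P(51 < X < 60)^4.
# (b) P(51 < X-bar < 60), where SD(X-bar) = 4.5 / sqrt(4) = 2.25.
def phi(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sd, n = 54, 4.5, 4
p_one = phi((60 - mu) / sd) - phi((51 - mu) / sd)
p_all_four = p_one ** 4
sd_mean = sd / math.sqrt(n)
p_mean = phi((60 - mu) / sd_mean) - phi((51 - mu) / sd_mean)
print(round(p_all_four, 4), round(p_mean, 4))  # 0.1855 0.905
```

Note how much larger (b) is: the sample mean is far less variable than a single fish.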