Statistical Interval for a Single Sample

Test of Hypotheses: Two Sample.
Outline:
- Inference on the difference in means of two normal distributions, variances known
- Inference on the difference in means of two normal distributions, variances unknown
- Paired t-test
- Inference on the variances of two normal distributions
- Inference on two population proportions
Hypothesis testing



Engineers and scientists are often interested in comparing two
different conditions to determine whether either condition
produces a significant effect on the response.
Condition => Treatment
Cause and effect relationship: the difference in treatments
resulted in the difference in response.
Case I



Inference on the difference in means of two normal
distributions, variance known
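Hypothesis
$H_0: \mu_1 - \mu_2 = \Delta_0$
$H_1: \mu_1 - \mu_2 \neq \Delta_0$
Test Statistic
$Z_0 = \dfrac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
We should reject H0 if
$z_0 > z_{\alpha/2}$ or $z_0 < -z_{\alpha/2}$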

Ex. A product developer is interested in reducing the drying time of a primer paint. Two
formulations (old, new) of the paint are tested. The standard deviation of drying time is 8 min
for both formulations. Ten specimens are painted with formulation 1, and another 10 specimens
are painted with formulation 2; the 20 specimens are painted in random order. The two sample
average drying times are $\bar{x}_1 = 121$ min and $\bar{x}_2 = 112$ min.
What conclusion can be drawn about the effectiveness of the new ingredient, using $\alpha = 0.05$?
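1. Parameter of interest is the difference in mean drying time, $\mu_1 - \mu_2$, with $\Delta_0 = 0$.
2. $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 > \mu_2$
3. $\alpha = 0.05$
4. Test statistic: $Z_0 = \dfrac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
5. Reject H0 if $z_0 > z_{\alpha} = z_{0.05} = 1.645$ (the alternative is one-sided, so $z_{\alpha}$ is used rather than $z_{\alpha/2}$).
6. Calculate $z_0$: with $\Delta_0 = 0$, $\sigma_1 = \sigma_2 = 8$, $n_1 = n_2 = 10$, $\bar{x}_1 = 121$, $\bar{x}_2 = 112$,
   $z_0 = \dfrac{121 - 112}{\sqrt{8^2/10 + 8^2/10}} = 2.52$
7. Conclusion: since $z_0 = 2.52 > z_{0.05} = 1.645$, we reject H0 at the 0.05 significance level. Adding the new ingredient
to the paint significantly reduces the drying time.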
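
The arithmetic in the paint example can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `two_sample_z` is an illustrative choice, not something from the slides.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2, delta0=0.0):
    """z statistic for H0: mu1 - mu2 = delta0 when both sigmas are known."""
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return (xbar1 - xbar2 - delta0) / standard_error


# Values taken from the paint drying-time example.
z0 = two_sample_z(xbar1=121, xbar2=112, sigma1=8, sigma2=8, n1=10, n2=10)

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical value z_alpha
p_value = 1 - NormalDist().cdf(z0)        # one-sided P-value for H1: mu1 > mu2

print(f"z0 = {z0:.2f}, z_crit = {z_crit:.3f}, P-value = {p_value:.4f}")
print("reject H0" if z0 > z_crit else "fail to reject H0")
```

For a two-sided alternative, the critical value would instead be `NormalDist().inv_cdf(1 - alpha / 2)` and the P-value would be doubled.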
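Sample size

Using Operating Characteristic Curve (OC Curve)
$d = \dfrac{|\mu_1 - \mu_2 - \Delta_0|}{\sqrt{\sigma_1^2 + \sigma_2^2}} = \dfrac{|\Delta - \Delta_0|}{\sqrt{\sigma_1^2 + \sigma_2^2}}$, with $n = n_1 = n_2$
If $n_1 \neq n_2$, the OC curves are entered with the equivalent sample size $n$ below. The same formula can be used to calculate $n_1$ when $n_2$ is fixed:
$n = \dfrac{\sigma_1^2 + \sigma_2^2}{\sigma_1^2/n_1 + \sigma_2^2/n_2}$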
Sample Size formulas

Confidence Interval
The error in estimating $\mu_1 - \mu_2$ by $\bar{x}_1 - \bar{x}_2$ will be less than E at 100(1 - α)% confidence if the
sample size from each population is
$n = \left(\dfrac{z_{\alpha/2}}{E}\right)^2 (\sigma_1^2 + \sigma_2^2)$
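
As a rough illustration of the sample-size calculation, the sketch below assumes the standard known-variance result $n = (z_{\alpha/2}/E)^2(\sigma_1^2 + \sigma_2^2)$, rounded up to the next integer. The function name and the input values (σ1 = 1, σ2 = 1.5, E = 0.5, 90% confidence) are hypothetical and chosen only for demonstration.

```python
from math import ceil
from statistics import NormalDist


def ci_sample_size(sigma1, sigma2, error, confidence):
    """Per-population n so that the CI half-width is at most `error`.

    Assumes known standard deviations and equal sample sizes n1 = n2 = n.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    return ceil((z / error) ** 2 * (sigma1**2 + sigma2**2))


# Hypothetical inputs, for illustration only.
n_required = ci_sample_size(sigma1=1.0, sigma2=1.5, error=0.5, confidence=0.90)
print(n_required)
```

Rounding up is the conservative choice: rounding down would give an error bound slightly larger than E.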
Ex. Tensile strength tests were performed on two different grades of
aluminum spars. From past experience with the spar manufacturing process
and testing procedure, the standard deviations of the tensile strengths are
assumed to be known. The data obtained are as follows:
$n_1 = 10$, $\bar{x}_1 = 87.6$, $\sigma_1 = 1$; $n_2 = 12$, $\bar{x}_2 = 74.5$, $\sigma_2 = 1.5$

Find a 90% confidence interval on the difference in mean strength µ1-µ2
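
One way to work this example is sketched below. It assumes the usual known-variance interval $\bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2}\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$; the function name `z_interval` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar1, xbar2, sigma1, sigma2, n1, n2, confidence):
    """Two-sided CI on mu1 - mu2 when both standard deviations are known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    half_width = z * sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    diff = xbar1 - xbar2
    return diff - half_width, diff + half_width


# Values from the aluminum spar example.
lower, upper = z_interval(87.6, 74.5, 1.0, 1.5, 10, 12, confidence=0.90)
print(f"{lower:.2f} <= mu1 - mu2 <= {upper:.2f}")
```

With these inputs the interval is roughly 12.22 to 13.98. It does not contain zero, which suggests that the mean strength of grade 1 exceeds that of grade 2.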
Case I.I

The Case I procedure can also be used when the population
distributions are not known exactly (they may not be normal),
provided both sample sizes are large ($n_1, n_2 \geq 40$).
Case II.I



Inference on the difference in means of two normal
distributions, variance unknown
Case 1: $\sigma_1^2 = \sigma_2^2 = \sigma^2$
Hypothesis
$H_0: \mu_1 - \mu_2 = \Delta_0$
$H_1: \mu_1 - \mu_2 \neq \Delta_0$
Test Statistic
$T_0 = \dfrac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$
Pooled Estimator of variance
$S_p^2 = \dfrac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$
We should reject H0 if
$t_0 > t_{\alpha/2,\,n_1+n_2-2}$ or $t_0 < -t_{\alpha/2,\,n_1+n_2-2}$
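
The pooled test can be sketched from summary statistics as follows. The summary values are hypothetical (they are not from the slides), the function name is illustrative, and the critical value $t_{0.025,18} = 2.101$ is read from a standard t table because the Python standard library has no t distribution.

```python
from math import sqrt


def pooled_t(xbar1, s1, n1, xbar2, s2, n2, delta0=0.0):
    """Pooled two-sample t statistic, assuming equal population variances."""
    df = n1 + n2 - 2
    sp_squared = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df  # pooled variance
    t0 = (xbar1 - xbar2 - delta0) / (sqrt(sp_squared) * sqrt(1 / n1 + 1 / n2))
    return t0, sp_squared, df


# Hypothetical summary statistics, for illustration only.
t0, sp2, df = pooled_t(xbar1=10.0, s1=2.0, n1=10, xbar2=8.0, s2=2.0, n2=10)

t_crit = 2.101  # t_{0.025,18} from a t table
print(f"t0 = {t0:.3f}, Sp^2 = {sp2:.2f}, df = {df}")
print("reject H0" if abs(t0) > t_crit else "fail to reject H0")
```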

Ex. Two catalysts are being analyzed to determine how they affect the
mean yield of a chemical process. Specifically, catalyst 1 is currently in use,
but catalyst 2 is acceptable. Since catalyst 2 is cheaper, it should be
adopted, provided it does not change the process yield. A test is run in the
pilot plant and results in the data shown in Table. Is there any difference
between the mean yields? Use $\alpha = 0.05$, and assume equal variances.
1. Parameter of interest: µ1 and µ2, the mean process yield using catalyst 1 (C1) and catalyst 2 (C2)
2. H0: µ1 - µ2 = 0 (equivalently µ1 = µ2), H1: µ1 ≠ µ2
3. $\alpha = 0.05$
4. Test statistic is
   $T_0 = \dfrac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$, $S_p^2 = \dfrac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$
5. Reject H0 if $t_0 > t_{0.025,14} = 2.145$ or $t_0 < -t_{0.025,14} = -2.145$
6. Calculate $t_0$
7. Conclusion: H0 cannot be rejected. At the 0.05 level of significance, we do not have strong
evidence to conclude that C2 results in a mean yield that differs from that of C1.
Case II.II



Inference on the difference in means of two normal
distributions, variance unknown
Case 2: $\sigma_1^2 \neq \sigma_2^2$
Hypothesis
$H_0: \mu_1 - \mu_2 = \Delta_0$
$H_1: \mu_1 - \mu_2 \neq \Delta_0$
Test Statistic
$T_0^* = \dfrac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$
We should reject H0 if $t_0^* > t_{\alpha/2,\nu}$ or $t_0^* < -t_{\alpha/2,\nu}$
Degree of freedom
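
The degrees-of-freedom formula is not reproduced in this text. The sketch below assumes the usual Welch-Satterthwaite approximation, $\nu = \dfrac{(s_1^2/n_1 + s_2^2/n_2)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$, which may differ from the variant on the original slide. The function name and the input values are hypothetical.

```python
from math import floor, sqrt


def welch_t(xbar1, s1, n1, xbar2, s2, n2, delta0=0.0):
    """Unequal-variance t statistic and approximate degrees of freedom.

    The df uses the Welch-Satterthwaite approximation (an assumption here).
    """
    v1 = s1**2 / n1  # estimated variance of the first sample mean
    v2 = s2**2 / n2  # estimated variance of the second sample mean
    t0_star = (xbar1 - xbar2 - delta0) / sqrt(v1 + v2)
    nu = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t0_star, nu


# Hypothetical summary statistics, for illustration only.
t0_star, nu = welch_t(xbar1=10.0, s1=1.0, n1=10, xbar2=8.0, s2=3.0, n2=10)
nu_table = floor(nu)  # round down before looking up the t table
print(f"t0* = {t0_star:.3f}, nu = {nu:.2f} (use {nu_table} in the table)")
```

Rounding ν down is the conservative convention, since fewer degrees of freedom give a larger critical value.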
Ex. Arsenic concentration in public drinking water supplies is a potential health risk.
An article in the Arizona Republic reported drinking water arsenic concentration in
parts per billion (ppb) for 10 metropolitan Phoenix communities and 10 communities
in rural Arizona.
$\sigma_1^2 \neq \sigma_2^2$
1. Parameter of interest: µ1 and µ2, the mean arsenic concentrations of the two
regions
2. H0: µ1 - µ2 = 0 (equivalently µ1 = µ2), H1: µ1 ≠ µ2
3. $\alpha = 0.05$
4. Test statistic: $T_0^* = \dfrac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$
5. We should reject H0 if $t_0^* > t_{0.025,\nu}$ or $t_0^* < -t_{0.025,\nu}$
6. Compute $t_0^*$
7. Conclusion: since $t_0^* < -t_{0.025,13}$, we reject H0. There is evidence to conclude that the
mean arsenic concentration in the drinking water in rural Arizona differs
from the mean arsenic concentration in metropolitan Phoenix. Furthermore,
the mean arsenic concentration is higher in rural Arizona. The P-value is approximately
P = 0.016.
Case II

Sample size can be approximated by OC curves

Only for the case that $\sigma_1 = \sigma_2 = \sigma$

Where $d = \dfrac{|\Delta - \Delta_0|}{2\sigma}$ and $n^* = 2n - 1$
Case III: Paired t- test



A special case of the two-sample t-test. This test is used when the
observations on the two populations of interest are collected in pairs.
Each pair of observations is taken under homogeneous conditions.
Ex. We are interested in comparing two different types of tips for a
hardness-testing machine.
[Figure: Tip 1 and Tip 2 applied to sheet metal, comparing the depth of the depression caused by the tips; paired t-test versus 2-sample test]
Paired t-test
Hypothesis
$H_0: \mu_1 - \mu_2 = \Delta_0$, equivalently $H_0: \mu_D = \Delta_0$
$H_1: \mu_1 - \mu_2 \neq \Delta_0$, equivalently $H_1: \mu_D \neq \Delta_0$
Test Statistic
$T_0 = \dfrac{\bar{D} - \Delta_0}{S_D/\sqrt{n}}$
We should reject H0 if
$t_0 > t_{\alpha/2,\,n-1}$ or $t_0 < -t_{\alpha/2,\,n-1}$

Ex. An article in the Journal of Strain Analysis compares several methods for
predicting the shear strength of steel plate girders. Data for two of these
methods, the Karlsruhe and Lehigh procedures, when applied to nine specific
girders are shown in table. We wish to determine whether there is any
difference between the two methods.
1. Parameter of interest: the difference in mean shear strength between the
two methods, µD = µ1 - µ2
2. H0: µD = 0, H1: µD ≠ 0
3. $\alpha = 0.05$
4. Test statistic: $T_0 = \dfrac{\bar{D} - \Delta_0}{S_D/\sqrt{n}}$
5. We should reject H0 if $t_0 > t_{0.025,8} = 2.306$ or $t_0 < -t_{0.025,8} = -2.306$
6. Calculate $t_0$: $t_0 = \dfrac{0.2739 - 0}{0.1351/\sqrt{9}} = 6.08$
7. Conclusion: since $t_0 = 6.08 > 2.306$, we conclude that the strength prediction methods yield
different results. Specifically, the data indicate that the Karlsruhe method
produces, on the average, higher strength predictions than does the
Lehigh method. The P-value for $t_0 = 6.08$ is P = 0.0003.
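
The computation in this example can be reproduced from the summary values quoted above ($\bar{d} = 0.2739$, $s_D = 0.1351$, $n = 9$). This is a minimal sketch; the function name is illustrative, and the confidence interval at the end assumes the usual form $\bar{d} \pm t_{\alpha/2,n-1}\, s_D/\sqrt{n}$, which is not spelled out in this text.

```python
from math import sqrt


def paired_t(d_bar, s_d, n, delta0=0.0):
    """Paired t statistic computed from the mean and sd of the differences."""
    return (d_bar - delta0) / (s_d / sqrt(n))


# Summary values from the girder example.
d_bar, s_d, n = 0.2739, 0.1351, 9
t0 = paired_t(d_bar, s_d, n)

t_crit = 2.306  # t_{0.025,8}, as quoted in the example
print(f"t0 = {t0:.2f}")
print("reject H0" if abs(t0) > t_crit else "fail to reject H0")

# 95% confidence interval on mu_D (assumed standard form).
half_width = t_crit * s_d / sqrt(n)
ci = (d_bar - half_width, d_bar + half_width)
print(f"{ci[0]:.4f} <= mu_D <= {ci[1]:.4f}")
```

The interval lies entirely above zero, which agrees with the test's conclusion that the Karlsruhe predictions are higher on average.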
Confidence Interval
Inference on the variances of two normal
distributions

Hypothesis
$H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 \neq \sigma_2^2$
Test Statistic
$F_0 = \dfrac{S_1^2}{S_2^2}$
We should reject H0 if
$f_0 > f_{\alpha/2,\,n_1-1,\,n_2-1}$ or $f_0 < f_{1-\alpha/2,\,n_1-1,\,n_2-1}$
where $f_{1-\alpha,u,v} = \dfrac{1}{f_{\alpha,v,u}}$
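
A small sketch of the F-test follows, using the etch-uniformity observations listed in the homework at the end of these slides. The critical value $f_{0.025,5,5} = 7.15$ is taken from a standard F table (the Python standard library has no F distribution), and the lower critical value comes from the reciprocal identity above. The function name is illustrative.

```python
from statistics import variance

# Etch-uniformity observations for the two C2F6 flow rates (homework data).
flow_125 = [2.7, 4.6, 2.6, 3.0, 3.2, 3.8]
flow_200 = [4.6, 3.4, 2.9, 3.5, 4.1, 5.1]


def f_statistic(sample1, sample2):
    """Ratio of sample variances, F0 = S1^2 / S2^2."""
    return variance(sample1) / variance(sample2)


f0 = f_statistic(flow_125, flow_200)

f_upper = 7.15         # f_{0.025,5,5} from an F table
f_lower = 1 / f_upper  # f_{0.975,5,5} = 1 / f_{0.025,5,5}; u = v = 5 here
reject = f0 > f_upper or f0 < f_lower

print(f"f0 = {f0:.3f}, acceptance region = ({f_lower:.3f}, {f_upper:.2f})")
print("reject H0" if reject else "fail to reject H0")
```

When the two samples have different sizes, the degrees of freedom must be swapped in the reciprocal identity, so the lower critical value needs its own table lookup.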

Sample size: can be approximated by OC curve
Only for the case that $n_1 = n_2 = n$
Where $\lambda = \dfrac{\sigma_1}{\sigma_2}$

Confidence Interval on the ratio of two variances
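Test on two population proportions

Hypothesis
$H_0: p_1 = p_2$
$H_1: p_1 \neq p_2$
Test Statistic
$Z_0 = \dfrac{\hat{P}_1 - \hat{P}_2}{\sqrt{\hat{P}(1 - \hat{P})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$, $\hat{P} = \dfrac{X_1 + X_2}{n_1 + n_2}$
We should reject H0 if
$z_0 > z_{\alpha/2}$ or $z_0 < -z_{\alpha/2}$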
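
The two-proportion test can be sketched as below. The counts (27 of 100 versus 19 of 100) are hypothetical and serve only to exercise the formula; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2 using the pooled proportion estimate."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / standard_error


# Hypothetical counts, for illustration only.
z0 = two_proportion_z(x1=27, n1=100, x2=19, n2=100)

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)      # z_{alpha/2}
p_value = 2 * (1 - NormalDist().cdf(abs(z0)))     # two-sided P-value

print(f"z0 = {z0:.3f}, z_crit = {z_crit:.3f}, P-value = {p_value:.3f}")
print("reject H0" if abs(z0) > z_crit else "fail to reject H0")
```

The normal approximation behind this statistic is generally considered adequate only when both samples are large enough for the expected counts of successes and failures to be reasonably big.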

Type II error

Sample size
For a one-sided test, replace $\alpha/2$ by $\alpha$

Confidence Interval on the difference in population proportions
Homework

Use the Minitab program to find the conclusions of these problems.
1. An article in Solid State Technology describes an experiment to determine the effect of the
C2F6 flow rate on the uniformity of the etch on a silicon wafer used in integrated circuit
manufacturing. Data for two flow rates are as follows:

C2F6 flow rate    Observation
                  1     2     3     4     5     6
125               2.7   4.6   2.6   3.0   3.2   3.8
200               4.6   3.4   2.9   3.5   4.1   5.1

a) Does the C2F6 flow rate affect average etch uniformity? Use $\alpha = 0.05$.
b) What is the P-value for the test in a)?
c) Does the C2F6 flow rate affect the variability in etch uniformity? Use $\alpha = 0.05$.
d) Draw box plots to assist in the interpretation of the data from this etch uniformity experiment.
2. A computer scientist is investigating the usefulness of two different design languages in
improving programming tasks. Twelve expert programmers, familiar with both languages, are
asked to code a standard function in both languages, and the time (in minutes) is recorded. The
data follow:
a) Is the assumption that the difference in coding time is normally distributed reasonable?
b) Find the P-value for the test in a).
c) Find a 95% confidence interval on the difference in mean coding times. Is there any indication
that one design language is preferable?