
L13 Intro to hypothesis testing

Hypothesis testing
• Understanding hypothesis testing
• Constructing null and alternative hypotheses
• Type I and Type II errors
• Power of a test
• Population mean: σ known
• Population mean: σ unknown

Testing of Hypothesis
• We often end up taking decisions based on samples:
• the decision may be correct or it may be incorrect.
• Hypothesis: a statement about the value of a population parameter.
• Testing of hypothesis: verifying whether a statement about the value of a population parameter should be rejected or not.
• The statement is verified using the information available from random samples.
• Either the statement is rejected, or it cannot be rejected (“accepted”), based on the information available from samples.
• Two types of statement:
• null hypothesis and
• alternative hypothesis
Examples: Testing of Hypothesis
• Which college is the best for engineering studies?
• Which brand of television is more reliable: Sony, Samsung, LG, ...?
• Diagnosis of a disease by an X-ray machine
• Testing the effectiveness of a new medicine available in the market
• Has the average saving of people in a city changed between 1990 and 2019?
• A recruiter wants to recruit a few students from IIT-K. How?
• Has more advertising of a new magazine changed its sales?
• The CBI is trying hard to arrest the culprit. Whom to arrest?

Testing of Hypothesis
• Null hypothesis, denoted by H0, is a tentative/conventional assumption about a population parameter. The equality part always appears in H0.
• Alternative hypothesis, denoted by Ha or H1, is the opposite of what is stated in the null hypothesis.
• The alternative hypothesis is what the test is attempting to establish.
• If the information available from sample data contradicts the null hypothesis, we reject it; otherwise, we say “we fail to reject” the null hypothesis (that is the same as not accepting the alternative hypothesis).
Examples: Testing of Hypothesis
A criminal trial:
In a trial a jury must decide between two hypotheses.
The null hypothesis:
H0: The defendant is innocent
The alternative hypothesis or research hypothesis:
H1: The defendant is guilty
The jury does not know which hypothesis is true. They must make a decision on the basis of the evidence presented.

Examples: Testing of Hypothesis
In the language of statistics, convicting the defendant is called:
• rejecting the null hypothesis in favor of the alternative hypothesis.
• This implies that the jury is saying that there is enough evidence to conclude that the defendant is guilty (i.e., there is enough evidence to support the alternative hypothesis).
If the jury acquits, it is stating that
• there is not enough evidence to support the alternative hypothesis.
• Notice that the jury is not saying that the defendant is innocent, only that there is not enough evidence to support the alternative hypothesis. That is why we do not say that we accept the null hypothesis, although most people in industry will say “We accept the null hypothesis”.
Types of Errors
Reality  | Decision: H0 accepted           | Decision: H0 rejected
H0 true  | No error                        | Type I error (probability = α)
H0 false | Type II error (probability = β) | No error
• H0: the null hypothesis and H1: the alternative hypothesis
• Type-I error: rejecting the null hypothesis when it is actually true; its probability is α.
• Type-II error: failing to reject the null hypothesis when it is actually false; its probability is β.
• Power of a test (1 − β): probability of rejecting the null hypothesis when it is false.

Types of Errors
[Figure: sampling distribution with the critical value separating the “Accept H0” and “Reject H0” regions]
For a fixed sample size, reducing both type-I and type-II error probabilities together is not possible. Need a trade-off!
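To make the trade-off concrete, here is a minimal Python sketch for an upper-tailed z-test. The values μ0 = 500, σ = 96 and n = 50 are borrowed from Ex.1 in these slides; the "true" mean of 535 used to compute β is purely an assumption for illustration, and `error_rates` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def error_rates(mu0, mu_true, sigma, n, alpha):
    """Return (critical sample mean, beta, power) for an upper-tailed z-test."""
    se = sigma / sqrt(n)                                  # standard error of the sample mean
    x_crit = mu0 + NormalDist().inv_cdf(1 - alpha) * se   # reject H0 when x-bar > x_crit
    beta = NormalDist(mu_true, se).cdf(x_crit)            # P(fail to reject H0 | mu = mu_true)
    return x_crit, beta, 1 - beta

# mu_true = 535 is an assumed alternative, not a value claimed by the slides
for alpha in (0.10, 0.05, 0.01):
    x_crit, beta, power = error_rates(500, 535, 96, 50, alpha)
    print(f"alpha={alpha:.2f}  critical mean={x_crit:.1f}  beta={beta:.3f}  power={power:.3f}")
```

With the sample size held fixed, lowering α moves the critical value further from μ0, so β grows and the power 1 − β shrinks.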
Constructing null and alternative hypotheses
• One-tailed and two-tailed test:
One-tailed (lower-tail, or left-tailed): H0: μ ≥ μ0, H1: μ < μ0
One-tailed (upper-tail, or right-tailed): H0: μ ≤ μ0, H1: μ > μ0
Two-tailed: H0: μ = μ0, H1: μ ≠ μ0
• Probability of Type I error is α = P(H0 is rejected | H0 is true). This is also called the level of significance.
• Probability of Type II error is β = P(H0 is accepted | H0 is false).

Constructing null and alternative hypotheses
Ex.1. From long experience of the Coca-Cola company, it is known that yield is normally distributed with a mean of 500 units and a standard deviation of 96 units. For a modified process, the mean yield is 535 units for a sample of size 50. At 5% significance level, does the modified process increase the yield?
Sol.
Here H0: μ ≤ 500, H1: μ > 500 ⇒ one-tailed test; test for μ and σ known.
We formulate $Z_{\text{calculated}} = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{535 - 500}{96/\sqrt{50}} \approx 2.57$
At 95% confidence level, Z0.95 = 1.64 (from single-tailed test of Z-table).
As Z calculated > Z0.95 ⇒ reject the null hypothesis (i.e., enough evidence to accept the alternative hypothesis).
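The arithmetic of Ex.1 can be checked with a short Python sketch. It assumes the upper-tailed setup H0: μ ≤ 500 against H1: μ > 500 and uses the standard library's `NormalDist` in place of a Z-table; the unrounded statistic is about 2.578, which the slides report as 2.57.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, x_bar, alpha = 500, 96, 50, 535, 0.05   # values from Ex.1

z_calc = (x_bar - mu0) / (sigma / sqrt(n))   # test statistic
z_crit = NormalDist().inv_cdf(1 - alpha)     # upper-tail critical value, about 1.645

reject_h0 = z_calc > z_crit                  # upper-tailed rejection rule
print(f"z = {z_calc:.3f}, critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```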
Constructing null and alternative hypotheses
Ex.1. From long experience of the Coca-Cola company, it is known that yield is normally distributed with a mean of 500 units and a standard deviation of 96 units. For a modified process, the mean yield is 535 units for a sample of size 50. At 5% significance level, does the modified process increase the yield?
Sol. Here H0: μ = 500 ← this specifies a single value for the parameter μ.
Actually, we shall assume H0: μ ≤ 500.
H1: μ > 500 ← this is what we want to test.
⇒ one-tailed test; test for μ and σ known.
[Figure: Is the calculated value greater than the critical value?]

One-tailed or two-tailed?
HW.1. The mean weekly sales of a magazine were 146 units. After an advertisement campaign, the mean weekly sales in 22 stores for a typical week increased to 154 with a standard deviation of 17 units. Was the advertisement successful at 5% significance level? It is given that the weekly sales of the magazine follow a normal distribution.
One-tailed or two-tailed?
Ex.2. A department store manager determines that a new billing system will be cost-effective only if the mean monthly account is more than $170. A random sample of 400 monthly accounts is drawn, for which the sample mean is $178. It is known that the accounts are approximately normally distributed with s.d. of $65. At α = 5%, can we conclude that the new system will be cost-effective?
Sol.
The system is cost-effective if the mean account balance for all customers (population) is greater than $170, that is, if μ > $170.
Our null hypothesis is thus H0: μ ≤ 170.
H1: μ > 170 ← this is what we want to test.
⇒ one-tailed test; test for μ and σ known.

One-tailed or two-tailed?
Ex.3. A drug is given to 10 patients, and the increments in their blood pressure were recorded as 3, 6, −2, 4, −4, 1, −6, 0, 0, 2. Is it reasonable to believe that the drug has no effect on change of the mean blood pressure? Test at 5% significance level, assuming that the population is normal with variance 1.
Sol. Formulate the hypothesis: H0: μ = 0, H1: μ ≠ 0
⇒ Two-tailed test; test for μ and σ known.
[Figure: two-tailed test with the acceptance region between the critical values. Does the calculated value fall in the rejection region of H0 (that is, beyond the critical values)?]
One-tailed or two-tailed?
Ex.4. The mean weekly sales of a magazine were 146 units. After an advertisement campaign, the mean weekly sales in 22 stores for a typical week increased to 154 with a standard deviation of 17 units. Was the advertisement successful at 5% significance level? It is given that the weekly sales of the magazine follow a normal distribution.
Sol. Formulate the hypothesis: H0: μ ≤ 146, H1: μ > 146
⇒ One-tailed test; test for μ and σ unknown.
Ex.5. A state highway patrol periodically samples vehicle speeds at various locations on a particular highway. The sample of vehicle speeds is used to test the hypothesis H0: μ ≤ 65. A sample of 64 vehicles shows a mean speed of 66.2 kmph with a s.d. of 4.2 kmph. Use α = 0.05 to test the hypothesis.
Sol. Formulate the hypothesis: H0: μ ≤ 65, H1: μ > 65 ⇒ One-tailed test; test for μ, σ unknown.

Steps of Hypothesis Testing
Step 1. Develop the null and alternative hypotheses.
Step 2. Specify the level of significance α.
Step 3. Collect the sample data and compute the test statistic.
Step 4. Based on α, identify the critical values.
Step 5. Reject H0 if the calculated test statistic value falls in the rejection region (i.e., p-value < α); otherwise, fail to reject H0.
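The five steps can be sketched as one small Python function for the case of a mean with known σ. The function name `z_test` and its `tail` argument are illustrative choices, not part of the slides; the two calls reuse the numbers of Ex.2 and Ex.3.

```python
from math import sqrt
from statistics import NormalDist

def z_test(x_bar, mu0, sigma, n, alpha=0.05, tail="two"):
    """Critical-value z-test for a mean with known sigma.

    tail: "lower" (H1: mu < mu0), "upper" (H1: mu > mu0) or "two" (H1: mu != mu0).
    Returns (z, critical value, reject H0?).
    """
    z = (x_bar - mu0) / (sigma / sqrt(n))        # Step 3: test statistic
    std_normal = NormalDist()
    if tail == "upper":                          # Step 4: critical value from alpha
        z_crit = std_normal.inv_cdf(1 - alpha)
        reject = z > z_crit
    elif tail == "lower":
        z_crit = std_normal.inv_cdf(alpha)
        reject = z < z_crit
    elif tail == "two":
        z_crit = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) > z_crit
    else:
        raise ValueError("tail must be 'lower', 'upper' or 'two'")
    return z, z_crit, reject                     # Step 5: decision

# Ex.2: billing system, H1: mu > 170
print(z_test(178, 170, 65, 400, tail="upper"))
# Ex.3: blood pressure, H1: mu != 0, sample mean 0.4, sigma = 1
print(z_test(0.4, 0, 1, 10, tail="two"))
```

Steps 1 and 2 correspond to choosing `tail`, `mu0` and `alpha` before looking at the data.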
Test Statistic
1. Test statistic for population mean:
(a) when population variance is known: $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
(b) when population variance is unknown: $t = \dfrac{\bar{x} - \mu_0}{S/\sqrt{n}}$
2. Test statistic for population variance: $\chi^2_{n-1} = \dfrac{(n-1)S^2}{\sigma^2}$

Lower-tailed test for population mean (σ known)
[Figure]

Upper-tailed test for population mean (σ known)
[Figure]

Two-tailed test for population mean (σ known)
[Figure: acceptance region between the two critical values]
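As a sketch, the three test statistics from the "Test Statistic" slide can be written as small Python functions. The function names are hypothetical, and the data are the blood-pressure increments of Ex.3 with the minus signs restored so that the sample mean equals the 0.4 used in the slides.

```python
from math import sqrt
from statistics import mean, variance

def z_stat(x_bar, mu0, sigma, n):
    """z = (x_bar - mu0) / (sigma / sqrt(n)), population variance known."""
    return (x_bar - mu0) / (sigma / sqrt(n))

def t_stat(x_bar, mu0, s, n):
    """t = (x_bar - mu0) / (S / sqrt(n)), population variance unknown (n - 1 dof)."""
    return (x_bar - mu0) / (s / sqrt(n))

def chi2_stat(s_sq, sigma_sq, n):
    """chi-square = (n - 1) * S^2 / sigma^2, with n - 1 dof."""
    return (n - 1) * s_sq / sigma_sq

data = [3, 6, -2, 4, -4, 1, -6, 0, 0, 2]   # Ex.3 increments (signs reconstructed)
n = len(data)
x_bar = mean(data)
s_sq = variance(data)                      # sample variance S^2, divisor n - 1

z = z_stat(x_bar, 0, 1, n)                 # sigma = 1, as assumed in Ex.3
t = t_stat(x_bar, 0, sqrt(s_sq), n)        # S estimated from the sample instead
chi2 = chi2_stat(s_sq, 1, n)               # against a hypothesised variance of 1
print(f"x_bar={x_bar}, S^2={s_sq:.3f}, z={z:.3f}, t={t:.3f}, chi2={chi2:.1f}")
```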
Examples: Hypothesis Testing
Ex.1. From long experience of the Coca-Cola company, it is known that yield is normally distributed with a mean of 500 units and a standard deviation of 96 units. For a modified process, the mean yield is 535 units for a sample of size 50. At 5% significance level, does the modified process increase the yield?
Sol.
Step 1: Here H0: μ ≤ 500, H1: μ > 500 ⇒ one-tailed test; test for μ and σ known.
Step 2: From sample data, we formulate $z_{\text{calculated}} = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{535 - 500}{96/\sqrt{50}} \approx 2.57$
Step 3: At 95% confidence level, z0.05 = 1.64 (from single-tailed test of Z-table).
Step 4: As z calculated > z0.05 ⇒ reject the null hypothesis (i.e., enough evidence to accept the alternative hypothesis).

Examples: Hypothesis Testing
Ex.2. A department store manager determines that a new billing system will be cost-effective only if the mean monthly account is more than $170. A random sample of 400 monthly accounts is drawn, for which the sample mean is $178. It is known that the accounts are approximately normally distributed with s.d. of $65. At α = 5%, can we conclude that the new system will be cost-effective?
Sol.
Step 1: Here H0: μ ≤ 170, H1: μ > 170 ⇒ one-tailed test; test for μ and σ known.
Step 2: From sample data, we formulate $z_{\text{calculated}} = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{178 - 170}{65/\sqrt{400}} \approx 2.46$
Step 3: At 95% confidence level, z0.05 = 1.64 (from single-tailed test of Z-table).
Step 4: As z calculated > z0.05 ⇒ reject the null hypothesis (i.e., accept H1).
Examples: Hypothesis Testing
Ex.3. A drug is given to 10 patients, and the increments in their blood pressure were recorded as 3, 6, −2, 4, −4, 1, −6, 0, 0, 2. Is it reasonable to believe that the drug has no effect on change of the mean blood pressure? Test at 95% confidence level, assuming that the population is normal with variance 1.
Sol.
Step 1: Formulate the hypothesis: H0: μ = 0, H1: μ ≠ 0 ⇒ Two-tailed test for μ, and σ is known.
Step 2: From sample data, we formulate $z_{\text{calculated}} = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{0.4 - 0}{1/\sqrt{10}} \approx 1.265$
Step 3: At 95% confidence level, from the two-tailed test of the Z-table we find ±z(α/2): z0.025 = 1.96, −z0.025 = −1.96.
Step 4: As z calculated does not fall in the rejection region, we fail to reject H0.
⇒ We can believe that the drug has no effect on change of the mean blood pressure.

Examples: Hypothesis Testing
Ex.4. The mean weekly sales of a magazine were 146 units. After an advertisement campaign, the mean weekly sales in 22 stores for a typical week increased to 154 with a standard deviation of 17 units. Was the advertisement successful at 5% significance level? It is given that the weekly sales of the magazine follow a normal distribution.
Sol.
Step 1: Formulate the hypothesis: H0: μ ≤ 146, H1: μ > 146 ⇒ One-tailed test; test for μ and σ unknown.
Step 2: From sample data, we formulate $t_{\text{calculated}} = \dfrac{\bar{x} - \mu_0}{S/\sqrt{n}} = \dfrac{154 - 146}{17/\sqrt{22}} \approx 2.207$
Step 3: For α = 0.05 and 21 dof, t21,0.05 = 1.72 (from one-tailed test of t-table).
Step 4: As t calculated > t21,0.05 ⇒ reject the null hypothesis (i.e., accept H1).
⇒ We can conclude that the advertisement was successful.
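A minimal Python check of the Ex.4 t-test follows. It uses the standard deviation of 17 given in the problem statement, which gives a statistic near 2.21, and takes the critical value 1.72 from the t-table as the slides do, because Python's standard library has no t-distribution quantile function.

```python
from math import sqrt

mu0, x_bar, s, n = 146, 154, 17, 22      # values from the Ex.4 problem statement
t_crit = 1.72                            # t-table value for 21 dof at alpha = 0.05 (one tail)

t_calc = (x_bar - mu0) / (s / sqrt(n))   # test statistic with n - 1 = 21 dof
reject_h0 = t_calc > t_crit              # upper-tailed rejection rule
print(f"t = {t_calc:.3f}, critical value = {t_crit}, reject H0: {reject_h0}")
```

The decision is the same for any calculated value above 1.72, so the conclusion that the advertisement was successful does not hinge on the last decimal of the statistic.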
p-Value Approach
Step 1. Develop the null and alternative hypotheses.
Step 2. Specify the level of significance α.
Step 3. Collect the sample data and compute the test statistic.
Step 4. Use the value of the test statistic to compute the p-value.
Step 5. Reject H0 if the calculated test statistic value falls in the rejection region (i.e., p-value < α); otherwise, fail to reject H0.

Upper-tailed test for population mean (σ known)
[Figure]
Sometimes, though we have taken a decision about which hypothesis to accept, we still want to support it by seeing whether the test statistic is deep inside the critical region or on its border. This can be decided using the tail probability, or p-value.
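The idea of being "deep inside the critical region or on its border" can be illustrated with a short sketch for an upper-tailed z-test. The two z values below are hypothetical and chosen only to contrast a borderline rejection with a clear one; `upper_tail_p_value` is an illustrative helper name.

```python
from statistics import NormalDist

def upper_tail_p_value(z):
    """P(Z >= z) for a standard normal Z: the p-value of an upper-tailed z-test."""
    return 1 - NormalDist().cdf(z)

alpha = 0.05
for z in (1.70, 3.00):                    # hypothetical test statistics
    p = upper_tail_p_value(z)
    print(f"z = {z:.2f}: p-value = {p:.4f}, reject H0: {p < alpha}")
```

Both statistics lead to rejection at α = 0.05, but the first p-value sits just under α while the second is far below it.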
Examples: Hypothesis Testing (using p-value)
Ex. 5. From long experience of the Coca-Cola company, it is known that yield is normally distributed with a mean of 500 units and a standard deviation of 96 units. For a modified process, the mean yield is 535 units for a sample of size 50. At 5% significance level, does the modified process increase the yield?
Sol.
Step 1: Here H0: μ ≤ 500, H1: μ > 500 ⇒ one-tailed test; test for μ and σ known.
Step 2: From sample data, we formulate $z_{\text{calculated}} = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{535 - 500}{96/\sqrt{50}} \approx 2.57$
Step 3: For z = 2.57, cumulative probability = 0.9949 (from table)
⇒ p-value = 1 − 0.9949 = 0.0051
Step 4: As p-value < α ⇒ reject the null hypothesis.

Examples: Hypothesis Testing (using p-value)
Ex. 6. A drug is given to 10 patients, and the increments in their blood pressure were recorded as 3, 6, −2, 4, −4, 1, −6, 0, 0, 2. Is it reasonable to believe that the drug has no effect on change of the mean blood pressure? Test at 95% confidence level, assuming that the population is normal with variance 1.
Sol.
Step 1: Formulate the hypothesis: H0: μ = 0, H1: μ ≠ 0 ⇒ Two-tailed test for μ, and σ is known.
Step 2: From sample data, we formulate $z_{\text{calculated}} = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{0.4 - 0}{1/\sqrt{10}} \approx 1.26$
Step 3: For z = 1.26, cumulative probability = 0.8962 (from table)
⇒ p-value = 2(1 − 0.8962) = 0.2076
Step 4: As p-value > α, we fail to reject H0.
⇒ We can believe that the drug has no effect on change of the mean blood pressure.
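The two p-values can be reproduced without a table, as in the sketch below. It uses the unrounded test statistics, so the results differ slightly from the table-based 0.0051 and 0.2076, which come from rounding z to two decimals before the lookup.

```python
from math import sqrt
from statistics import NormalDist

phi = NormalDist().cdf                   # standard normal cumulative probability
alpha = 0.05

# Ex. 5: upper-tailed test, H1: mu > 500
z5 = (535 - 500) / (96 / sqrt(50))
p5 = 1 - phi(z5)

# Ex. 6: two-tailed test, H1: mu != 0
z6 = (0.4 - 0) / (1 / sqrt(10))
p6 = 2 * (1 - phi(abs(z6)))

print(f"Ex. 5: z = {z5:.3f}, p-value = {p5:.4f}, reject H0: {p5 < alpha}")
print(f"Ex. 6: z = {z6:.3f}, p-value = {p6:.4f}, reject H0: {p6 < alpha}")
```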
Examples: Hypothesis Testing
HW. 1. A state highway patrol periodically samples vehicle speeds at various locations on a particular highway. The sample of vehicle speeds is used to test the hypothesis H0: μ ≤ 65. A sample of 64 vehicles shows a mean speed of 66.2 kmph with a s.d. of 4.2 kmph. Use α = 0.05 to test the hypothesis.
HW. 2. A company claims that their phone bills are such that customers won't see a difference in their phone bills between them and their competitors. To verify, a random sample of 125 customers is selected, and the sample mean and sample s.d. are calculated as $17.09 and $3.87, respectively. Use α = 0.05 to test the hypothesis.
HW. 3. In 64 randomly selected hours of production, the sample mean and sample s.d. of the number of acceptable pieces produced by an automatic machine are 1038 and 146, respectively. At α = 0.05, does this enable us to reject H0: μ = 1000 against H1: μ > 1000?

[Figure: Examples: Normal Distribution]

Examples: Hypothesis Testing
HW. 4. A drug is given to 10 patients, and the increments in their blood pressure were recorded as 3, 6, −2, 4, −4, 1, −6, 0, 0, 2. Is it reasonable to believe that the drug has no effect on change of the mean blood pressure? Test at α = 0.05.
HW. 5. Five measurements of tar content of a certain kind of cigarette yielded 14.5, 14.2, 14.4, 14.3, 14.6 mg per cigarette. Show that the difference between the mean of this sample and the average tar μ = 14 claimed by the manufacturer is significant at level of significance 0.05. Assume normality.
Hint. H0: μ = 14, H1: μ ≠ 14

Examples: Hypothesis Testing
HW 6. To test the null hypothesis that the population mean is 4 (H0: μ = 4) against the alternative hypothesis μ = 5, a test is designed based on a random sample of size 49. It is decided that the null hypothesis will be rejected if the observed sample mean x̄ > 4.3. If the population variance is 9, find (a) the distribution of X̄, assuming H0 true, (b) the distribution of X̄, assuming H1 true, and (c) type-I and type-II errors.
Hint. Using CLT, we can assume that X̄ ~ N(μ, σ²/n). Here σ² = 9, n = 49.
(a) If H0 true, μ = 4. Thus, X̄ ~ N(4, 9/49).
(c) Type-I error α = P(reject H0 | H0 true) = P(X̄ > 4.3 when X̄ ~ N(4, 9/49))
$= P\left(\dfrac{\bar{X} - 4}{3/7} > \dfrac{4.3 - 4}{3/7}\right) = P(Z > 0.7) = 1 - P(Z \le 0.7) = 1 - 0.7580 = 0.2420$
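The hint for HW 6 can be checked numerically with the sketch below. The type-I error follows the hint directly; the type-II error is computed on the assumption that the same CLT argument gives X̄ ~ N(5, 9/49) when H1 is true, which the hint itself does not spell out.

```python
from math import sqrt
from statistics import NormalDist

sigma_sq, n, cutoff = 9, 49, 4.3
se = sqrt(sigma_sq / n)                  # standard deviation of X-bar, equal to 3/7

under_h0 = NormalDist(4, se)             # X-bar ~ N(4, 9/49) when H0 is true
under_h1 = NormalDist(5, se)             # assumed: X-bar ~ N(5, 9/49) when H1 is true

alpha = 1 - under_h0.cdf(cutoff)         # P(X-bar > 4.3 | mu = 4), type-I error
beta = under_h1.cdf(cutoff)              # P(X-bar <= 4.3 | mu = 5), type-II error
print(f"alpha = {alpha:.4f}, beta = {beta:.4f}, power = {1 - beta:.4f}")
```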