Prof. KG Satheesh Kumar
Asian School of Business
• Under the Indian legal system, an accused is presumed innocent until proved guilty "beyond a reasonable doubt".
• This presumption is called the null hypothesis. We may write: H0: Accused is innocent
• The court holds the null hypothesis as true until it is proved, based on evidence and beyond reasonable doubt, to be false.
• If H0 is proved to be false, it is rejected and an alternative hypothesis, H1, is accepted.
• We may write: H1: Accused is not innocent, hence guilty.
• If H0 cannot be proved false beyond reasonable doubt, then it cannot be rejected and is hence accepted.
Accused: Innocent | Verdict: Acquittal → Correct decision
Accused: Innocent | Verdict: Conviction → Type I Error
Accused: Guilty | Verdict: Acquittal → Type II Error
Accused: Guilty | Verdict: Conviction → Correct decision
• A company claims a 2-litre fill volume; a consumer advocate wants to test the claim:
H0: Mean Volume >= 2 lit
H1: Mean Volume < 2 lit
• Consumers are happy, but the company suspects that there is overfilling:
H0: Mean Volume <= 2 lit
H1: Mean Volume > 2 lit
• The plant engineer wants to take corrective action if the average volume is either more or less than 2 litres:
H0: Mean Volume = 2 lit
H1: Mean Volume ≠ 2 lit
• Random variable and its probability distribution / probability density function
• The Normal Distribution
• Sampling and Sampling Distribution
• Estimation
• A thesis is something that has been proven to be true
• A hypothesis is something that has not yet been proven to be true
• Hypothesis testing is the process of determining, through statistical methods, whether or not a given hypothesis may be accepted as true
• Hypothesis testing is an important part of statistical inference – making decisions about the population based on sample evidence
Setting up and testing hypotheses is an essential part of statistical inference.
In order to formulate such a test, usually some theory has been put forward, either because it is believed to be true or because it is to be used as a basis for argument, but has not been proved.
E.g.: Claiming that a new drug is better than the current drug for treatment of the same symptoms.
A Good Hypothesis should
Be written as a simple, clear and precise statement
Be testable with a straightforward experiment
Predict the anticipated results in clear form
State relationship between variables
Be consistent with observation and known facts
The question of interest is simplified into two competing claims / hypotheses between which we have a choice: the null hypothesis, denoted H0, against the alternative hypothesis, denoted H1.
These two competing hypotheses are not, however, treated on an equal basis: special consideration is given to the null hypothesis. In fact, only the null hypothesis is tested, to decide whether or not to reject it.
• A hypothesis is a testable assertion about the population (the value of a parameter).
• The null hypothesis (H0) is an assertion held true unless we have sufficient statistical evidence to conclude otherwise.
• The alternative hypothesis (H1, Ha or Hα) is the negation of the null hypothesis.
• The two are mutually exclusive: one and only one of them can be true.
• The null hypothesis is often a claim made by someone, and the alternative hypothesis is the suspicion about that claim.
• There may be no claim; then what we wish to demonstrate is the alternative hypothesis, and its negation is the null hypothesis.
• H1 describes the situation we believe to be true, and H0 describes the contrary situation about the population.
• The null hypothesis is the one which, when true, does not call for corrective action. If the alternative hypothesis is true, some corrective action would be necessary.
• If the obtained sample statistic is very unlikely when H0 is true, what we reject is H0.
• Note: The equality sign always appears in H0.
• Ex 1: A pharmaceutical company claims that four out of five doctors prescribe the pain medicine it produces. Set up H0 and H1 to test this claim. (Answer)
• Ex 2: A medicine is effective only if the concentration of a certain chemical is at least 200 ppm. At the same time, the medicine would produce an undesirable side effect if the concentration of the same chemical exceeds 200 ppm. Set up H0 and H1. (Answer)
• Ex 3: A maker of golf balls claims that the variance of the weights of the company's golf balls is controlled to within 0.0028 oz². Set up hypotheses to test this claim. (Answer)
• Ex 4: The average cost of a traditional open-heart surgery is claimed to be $49,160. If you suspect that the claim exaggerates the cost, how would you set up the hypotheses? (Answer)
• Ex 5: A vendor claims that he can fulfill an order in at most six working days. You suspect that the average is greater than six working days and want to test the hypothesis. How will you set up the hypotheses? (Answer)
• Ex 6: At least 20% of the visitors to a particular store are said to end up placing an order. How will you set up hypotheses to test the claim? (Answer)
• Ex 7: Web surfers will lose interest if downloading takes more than 12 seconds. If you wish to test the effectiveness of a newly designed web page in regard to download time, how will you set up the null and alternative hypotheses? (Answer)
• Parametric tests of hypotheses about population parameters:
– Mean (μ), proportion (p) and variance (σ²), using the z, t and chi-square distributions
– Test of the difference between two population means using the t and z distributions
• paired observations; independent observations
– Test of the difference between two population proportions using the z distribution
– Test of equality of two population variances using the F-distribution
– Analysis of variance for comparing several population means
• Parametric tests are more powerful than non-parametric tests because the data are derived from interval and ratio measurements.
• Non-parametric tests are used to test hypotheses with nominal and ordinal data:
– The Sign Test, the Runs Test, the Wald-Wolfowitz Test, the Mann-Whitney U Test, the Kruskal-Wallis Test, the Chi-Square Test for Goodness of Fit
• An important assumption for parametric tests is that the population is approximately normal (or the sample size is large). No such assumption is required for non-parametric tests, which are hence also called distribution-free tests.
• Set up the null and alternative hypotheses.
• Decide on the significance level, α (standard values: 10%, 5%, 1%).
• Using a random sample, obtain the sample statistic and then calculate the test statistic.
• Find the table value of the test statistic corresponding to the required α value.
• Compare the calculated and table values of the test statistic and interpret, as in the sketch below.
• Note: Only the null hypothesis is actually tested.
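A minimal Python sketch of these steps for a right-tailed test of a mean, using SciPy; the hypothesized mean, α and sample summary below are purely illustrative assumptions, not figures from these notes.

```python
# Sketch of the five steps for a right-tailed test of a mean
# (illustrative numbers only; a real test would use its own data).
import math
from scipy import stats

# 1. Hypotheses: H0: mu <= 100  vs  H1: mu > 100
mu0 = 100

# 2. Significance level
alpha = 0.05

# 3. Sample statistic -> test statistic (t, since only the sample SD is known)
n, xbar, s = 40, 103.2, 9.5          # hypothetical sample summary
se = s / math.sqrt(n)                # standard error of the mean
t_calc = (xbar - mu0) / se

# 4. Table (critical) value for the chosen alpha
t_table = stats.t.ppf(1 - alpha, df=n - 1)

# 5. Compare and interpret
print(f"t_calc = {t_calc:.3f}, t_table = {t_table:.3f}")
print("Reject H0" if t_calc > t_table else "Do not reject H0")
```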
• Note: Only the null hypothesis is actually tested
• Four outcomes are possible
– Ho is true and is not rejected (Not an error)
– Ho is true, but is rejected (Type I error)
– Ho is false, but not rejected (Type II error)
– Ho is false and is rejected (Not an error)
• Type I error is when we reject a true null hypothesis
• Type II error is when we do not reject a false null hypothesis
• Left-tailed test: when H0 makes a ">=" claim, rejection occurs when the statistic is far below the hypothesized value, i.e. on the left tail.
• Right-tailed test: H0 makes a "<=" claim and rejection occurs on the right tail.
• Two-tailed test: H0 makes an "=" claim and rejection occurs on both tails.
• Rejection and non-rejection regions are marked in the distribution of the sample statistic and the test statistic for interpreting the test results.
• The p-value
– is the probability of getting sample evidence at least as unfavourable to H0 as the observed sample statistic, when the null hypothesis is actually true.
– is a "credibility rating" for H0.
– is the observed level of significance: the smallest α (probability of Type I error) at which H0 would be rejected.
– is an approximate answer to the question, "given the sample evidence, what is the probability that H0 is true?"
• The significance level, α, is the maximum "set" probability of a Type I error. Accordingly, α decides the policy to reject or not reject H0.
• Policy: If the p-value is less than α, reject H0.
• If the p-value is not less than α, we do not reject H0. This does not mean that H0 is true, only that we do not have sufficient evidence to reject H0. (See the sketch below.)
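A short sketch of this decision rule for a right-tailed t-test, assuming the calculated statistic and degrees of freedom are already known (the numbers reuse Ex 8 from later in these notes).

```python
# Sketch of the p-value decision rule for a right-tailed t-test;
# t_calc and df are taken as given (values reused from Ex 8 below).
from scipy import stats

t_calc, df, alpha = 1.936, 59, 0.05
p_value = stats.t.sf(t_calc, df)      # P(T >= t_calc) when H0 is true

if p_value < alpha:
    print(f"p = {p_value:.4f} < alpha = {alpha}: reject H0")
else:
    print(f"p = {p_value:.4f} >= alpha = {alpha}: do not reject H0")
```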
• The selected value of α indirectly decides the probability of making a Type II error. We use the symbol β for this probability.
• The fraction 1 – α is called the confidence level. If α = 5%, the confidence level is 95%, which means we want to be at least 95% confident that H0 is false before we reject it.
• The optimal α is a compromise between Type I and Type II errors, the cost of each type of error, and producer's risk versus consumer's risk.
• The Type II error probability, β, is difficult to estimate; it depends on α, the sample size, and the actual population parameter.
• Power of a Test
– The complement of the Type II error probability, 1 – β, is called the power of the test. It is the probability that a false null hypothesis will be detected by the test. (A sketch of a power calculation follows.)
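A sketch of how β and the power could be computed for a right-tailed z-test of a mean, under an assumed true mean μ1; all numbers here are hypothetical.

```python
# Sketch: power of a right-tailed z-test of the mean under an assumed
# true mean mu1 (all numbers are hypothetical).
import math
from scipy import stats

mu0, mu1 = 100, 104        # hypothesized and assumed true means
sigma, n, alpha = 10, 50, 0.05

se = sigma / math.sqrt(n)
xbar_crit = mu0 + stats.norm.ppf(1 - alpha) * se   # rejection region starts here
beta = stats.norm.cdf((xbar_crit - mu1) / se)      # P(do not reject | mu = mu1)
power = 1 - beta
print(f"beta = {beta:.3f}, power = {power:.3f}")
```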
• Test Statistic
– A random variable, calculated from the sample evidence, and having a well-known probability distribution.
– Most commonly used are Z, t, χ² and F. The distributions of these random variables are well known, and tables are available.
– See the tables of these distributions.
• The test statistic for a mean is z or t (see next slide):
Test statistic = (Sample mean – hypothesized population mean) / SE, where SE is the standard error.
• The test statistic for a proportion (assuming a large sample) is Z:
Z = (sample proportion – p) / SE, where p is the hypothesized population proportion, SE = √(pq/n) and q = 1 – p.
• The test statistic for a variance is
χ² = (n-1)S²/σ², where S² is the sample variance and σ² is the hypothesized population variance.
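The sketch below computes these three test statistics from summary figures; the mean and proportion numbers reuse Ex 8 and Ex 12 from later slides, while the variance figures are hypothetical assumptions.

```python
# Sketch of the three test statistics listed above.
import math

# Mean: (x-bar - mu0) / (s / sqrt(n))  -- figures from Ex 8
xbar, mu0, s, n = 250, 247, 12, 60
t_stat = (xbar - mu0) / (s / math.sqrt(n))

# Proportion (large sample): Z = (p_s - p) / sqrt(p*q/n)  -- figures from Ex 12
p_s, p0, n_p = 0.13, 0.16, 300
z_stat = (p_s - p0) / math.sqrt(p0 * (1 - p0) / n_p)

# Variance: chi-square = (n - 1) * S^2 / sigma0^2  -- hypothetical figures
s2, sigma0_sq, n_v = 0.0031, 0.0028, 25
chi2_stat = (n_v - 1) * s2 / sigma0_sq

print(round(t_stat, 3), round(z_stat, 3), round(chi2_stat, 3))
```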
• When the null hypothesis is about the population mean, the test statistic is:
– Z, if the population standard deviation, σ, is known
– t, if only the sample standard deviation, S, is known
• When Z is used, Z = (sample mean - μ)/(σ/√n)
• When t is used, t = (sample mean - μ)/(S/√n), with n-1 degrees of freedom
• It is necessary that either the population is normal or the sample size is large enough. (A sketch of the z / t choice follows.)
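A small sketch of the z / t choice, computing both statistics from the same illustrative summary figures; treating the population SD as known in the z case is an assumption made only for the demonstration.

```python
# Sketch: Z when the population SD (sigma) is known, t when only the
# sample SD (S) is available; numbers are illustrative.
import math
from scipy import stats

xbar, mu0, n = 250, 247, 60

# Case 1: sigma known -> Z
sigma = 12
z = (xbar - mu0) / (sigma / math.sqrt(n))
p_z = stats.norm.sf(z)                     # right-tailed p-value

# Case 2: only S known -> t with n - 1 degrees of freedom
S = 12
t = (xbar - mu0) / (S / math.sqrt(n))
p_t = stats.t.sf(t, df=n - 1)

print(f"z = {z:.3f} (p = {p_z:.4f}),  t = {t:.3f} (p = {p_t:.4f})")
```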
• Ex 8: A certain medicine is supposed to contain an average of 247 ppm of a chemical. If the concentration exceeds 247 ppm, the drug may cause undesirable side effects. A random sample of 60 portions is tested; the sample mean is found to be 250 ppm and the sample standard deviation 12 ppm. Perform a statistical hypothesis test at 1% and 5% significance. (Answer)
• Ex 9: In the above example, assume that there are no side effects, but we are told that the drug may be ineffective if the concentration is below 247 ppm. The sample evidence is the same as before. Formulate and test the hypothesis. (Answer)
• Ex 10: In the above example, assume that side effects and effectiveness are both to be considered. The sample evidence is the same. Formulate and test the hypothesis. (Answer)
• Ex 11: Certain eggs are stated to have reduced cholesterol content, with an average of only 2.5% cholesterol. A concerned health group wants to test whether the claim is true. A random sample of 100 eggs reveals a sample average content of 3.0% cholesterol with a standard deviation of 2.8%. Does the health group have sufficient evidence to reject the claim? (Answer)
• Ex 12: A survey of medical schools indicates that 16% of the faculty positions are vacant. A placement agency conducts a survey to test this claim, using a random sample of 300 faculty positions, and finds that 39 out of the 300 are vacant. Test the claim at the 5% level of significance. (Answer)
• A random variable is associated with a random experiment, such as drawing a random sample from the population; the variable may be a mean, a proportion or a variance.
• A random variable is an uncertain quantity whose value depends on chance.
• A random variable (denoted by X) takes a range of discrete values with some discrete probability distribution, P(X), or continuous values with some probability density, f(X).
• P(X) or f(X), as the case may be, can be used to find the probability that the random variable takes specific values or a range of values.
• If a random variable X is affected by many independent causes, none of which is overwhelmingly large, the probability distribution of X closely follows the normal distribution. X is then called a normal variate and we write X ~ N(μ, σ²), where μ is the mean and σ² is the variance.
• A normal pdf is completely defined by its mean, μ, and variance, σ². The square root of the variance is called the standard deviation, σ.
• If several independent random variables are normally distributed, their sum will also be normally distributed, with mean equal to the sum of the individual means and variance equal to the sum of the individual variances.
The area under any pdf between two given values of X is the probability that X falls between these two values.
• The standard normal variate (SNV), Z, is the normal random variable with mean 0 and standard deviation 1.
• Tables are available for standard normal probabilities.
• X and Z are connected by: Z = (X - μ)/σ and X = μ + σZ.
• The area under the X curve between X1 and X2 is equal to the area under the Z curve between Z1 and Z2.
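A quick numerical check of the X-to-Z conversion, using a hypothetical normal variate (mean 50, SD 5); the two areas come out identical, as the last bullet states.

```python
# Sketch: converting X ~ N(mu, sigma^2) to Z and back, and checking that the
# area between x1 and x2 equals the area between the corresponding z-values.
from scipy import stats

mu, sigma = 50, 5          # hypothetical normal variate
x1, x2 = 45, 60

z1, z2 = (x1 - mu) / sigma, (x2 - mu) / sigma
area_x = stats.norm.cdf(x2, loc=mu, scale=sigma) - stats.norm.cdf(x1, loc=mu, scale=sigma)
area_z = stats.norm.cdf(z2) - stats.norm.cdf(z1)
print(z1, z2, round(area_x, 4), round(area_z, 4))   # the two areas match
```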
Standard Normal Probabilities (table of the z distribution)
The z-value is read from the left and top margins; the probability P(0 <= Z <= z) is in the body of the table.

z    0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
0.0  0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1  0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2  0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3  0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4  0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5  0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6  0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7  0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8  0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9  0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0  0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1  0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2  0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3  0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4  0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5  0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6  0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7  0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8  0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9  0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0  0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1  0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2  0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3  0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4  0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5  0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6  0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7  0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8  0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9  0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0  0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
3.1  0.4990 0.4991 0.4991 0.4991 0.4992 0.4992 0.4992 0.4992 0.4993 0.4993
3.2  0.4993 0.4993 0.4994 0.4994 0.4994 0.4994 0.4994 0.4995 0.4995 0.4995
3.3  0.4995 0.4995 0.4995 0.4996 0.4996 0.4996 0.4996 0.4996 0.4996 0.4997
3.4  0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4998
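Since the body of the table gives P(0 ≤ Z ≤ z), any entry can be reproduced from the standard normal CDF; a brief sketch:

```python
# Sketch: table entries above are P(0 <= Z <= z) = norm.cdf(z) - 0.5.
from scipy import stats

for z in (0.05, 1.0, 1.645, 1.96, 2.5):
    print(f"z = {z:<5}  P(0 <= Z <= z) = {stats.norm.cdf(z) - 0.5:.4f}")
```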
• The sampling distribution of x̄ is the probability distribution of all possible values of x̄ for a given sample size n taken from the population.
• According to the Central Limit Theorem (CLT), for a large enough sample size n, the sampling distribution is approximately normal with mean μ and standard deviation σ/√n. This standard deviation is called the standard error.
• The CLT holds for non-normal populations also, and states: for large enough n, x̄ ~ N(μ, σ²/n). (A simulation sketch follows.)
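A simulation sketch of the CLT for a deliberately non-normal (exponential) population; the population mean, sample size and number of repetitions are arbitrary choices for the demonstration.

```python
# Sketch: the CLT in action for a clearly non-normal (exponential) population.
import numpy as np

rng = np.random.default_rng(0)
mu, n, reps = 2.0, 50, 10_000            # population mean, sample size, repetitions

samples = rng.exponential(scale=mu, size=(reps, n))
xbars = samples.mean(axis=1)             # one sample mean per repetition

print("mean of x-bar:", xbars.mean())          # close to mu
print("SD of x-bar  :", xbars.std(ddof=1))     # close to sigma/sqrt(n) = 2/sqrt(50)
```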
• The value of an estimator (see next slide), obtained from a sample, can be used to estimate the value of the population parameter. Such an estimate is called a point estimate.
• This is a 50:50 estimate, in the sense that the actual parameter value is equally likely to be on either side of the point estimate.
• A more useful estimate is the interval estimate, where an interval is specified along with a measure of confidence (90%, 95%, 99%, etc.).
• The interval estimate, with its associated measure of confidence, is called a confidence interval.
• A confidence interval is a range of numbers believed to include the unknown population parameter, with a certain level of confidence. (See the sketch below.)
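A minimal sketch of a 95% confidence interval for a mean, built from hypothetical summary statistics using the t distribution.

```python
# Sketch: a 95% confidence interval for the mean (hypothetical sample summary).
import math
from scipy import stats

n, xbar, s = 60, 250, 12
conf = 0.95

se = s / math.sqrt(n)
t_crit = stats.t.ppf((1 + conf) / 2, df=n - 1)
lower, upper = xbar - t_crit * se, xbar + t_crit * se
print(f"{conf:.0%} CI: ({lower:.2f}, {upper:.2f})")
```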
• Population parameters (μ, σ², p) and sample statistics (x̄, s², p_s)
• An estimator of a population parameter is a sample statistic used to estimate the parameter.
• The statistic x̄ is an estimator of the parameter μ.
• The statistic s² is an estimator of the parameter σ².
• The statistic p_s is an estimator of the parameter p.
The claim is the null hypothesis and its negation is the alternative hypothesis.
If p denotes the proportion of doctors prescribing the medicine, we set the hypotheses as:
H0: p >= 0.8
H1: p < 0.8
The null hypothesis is the one which calls for no corrective action, and the alternative hypothesis is the one that calls for corrective action.
If μ denotes the concentration of the chemical, we set up the hypotheses as:
H0: μ = 200 ppm
H1: μ ≠ 200 ppm
The claim is the null hypothesis. Using σ² to denote the variance, the hypotheses can be set up as:
H0: σ² <= 0.0028 oz²
H1: σ² > 0.0028 oz²
The claim is the null hypothesis and your suspicion (belief) is the alternative hypothesis.
If μ denotes the average cost, the hypotheses are:
H0: μ >= $49,160
H1: μ < $49,160
The claim is the null hypothesis and your suspicion is the alternative hypothesis.
If μ denotes the average number of days to fulfill an order, the hypotheses are:
H0: μ <= 6
H1: μ > 6
The claim becomes the null hypothesis.
Let p denote the proportion of visitors placing an order. Then the hypotheses will be set up as:
Ho: p >= 0.20
H1: p < 0.20
Corrective action is needed if the average downloading time exceeds 12 seconds, so this forms H1.
If μ denotes the average download time, then:
H0: μ <= 12 s
H1: μ > 12 s
Let μ denote the average ppm of the chemical. The hypotheses are:
H0: μ <= 247
H1: μ > 247
Sample statistic x̄ = 250; sample SD s = 12; sample size n = 60 (large sample); standard error SE = 12/√60 = 1.55. Right-tailed test.
Since we know only the sample SD, the test statistic follows the t-distribution with 59 degrees of freedom.
Test statistic: t = (250 - 247)/1.55 = 1.936
From the table of the t-distribution, the one-tailed t-values for 59 df are: t(5%) = 1.671 and t(1%) = 2.390.
Comparing the calculated and table values of the test statistic, we reject the null hypothesis at the 5% level of significance (95% confidence), but do not reject the null hypothesis at the 1% level of significance (99% confidence). A numerical check follows.
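A numerical check of Ex 8, recomputing the test statistic, critical values and p-value from the summary statistics (raw data are not available, so the calculation is done by hand rather than with SciPy's one-sample test function).

```python
# Sketch: Ex 8 redone numerically from summary statistics.
import math
from scipy import stats

mu0, xbar, s, n = 247, 250, 12, 60
se = s / math.sqrt(n)                 # ~1.55
t_calc = (xbar - mu0) / se            # ~1.936
p_value = stats.t.sf(t_calc, df=n - 1)

for alpha in (0.05, 0.01):
    t_crit = stats.t.ppf(1 - alpha, df=n - 1)
    verdict = "reject H0" if t_calc > t_crit else "do not reject H0"
    print(f"alpha={alpha}: t_calc={t_calc:.3f}, t_crit={t_crit:.3f}, p={p_value:.4f} -> {verdict}")
```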
Let μ denote the average ppm of the chemical. The hypotheses are:
H0: μ >= 247
H1: μ < 247
The sample statistic x̄ = 250 does not go against the null hypothesis, and hence there is no ground to reject it.
Let μ denote the average ppm of the chemical. The hypotheses are:
H0: μ = 247
H1: μ ≠ 247
Test statistic: t = (250 - 247)/1.55 = 1.936
From the table of the t-distribution, the two-tailed t-values for 59 df are: t(5%) = 2.000 and t(1%) = 2.660.
Comparing the calculated and table values of the test statistic, we do not reject the null hypothesis either at the 5% level of significance (95% confidence) or at the 1% level of significance (99% confidence). A numerical check follows.
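A numerical check of Ex 10, the two-tailed version, from the same summary statistics.

```python
# Sketch: the two-tailed test of Ex 10 from summary statistics.
import math
from scipy import stats

mu0, xbar, s, n = 247, 250, 12, 60
t_calc = (xbar - mu0) / (s / math.sqrt(n))
p_two_tailed = 2 * stats.t.sf(abs(t_calc), df=n - 1)

for alpha in (0.05, 0.01):
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)   # two-tailed critical value
    verdict = "reject H0" if abs(t_calc) > t_crit else "do not reject H0"
    print(f"alpha={alpha}: |t|={abs(t_calc):.3f}, t_crit={t_crit:.3f}, p={p_two_tailed:.4f} -> {verdict}")
```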
Let μ denote the average cholesterol content (%). The hypotheses are:
H0: μ <= 2.5
H1: μ > 2.5
Sample statistic = 3.0; sample SD = 2.8; sample size = 100 (large sample); standard error SE = 2.8/√100 = 0.28. Right-tailed test.
Test statistic: t = (3.0 - 2.5)/0.28 = 1.786
From the table of the t-distribution, the one-tailed t-values for 99 df are: t(5%) = 1.660 and t(1%) = 2.364.
Comparing the calculated and table values of the test statistic, we reject the null hypothesis at the 5% level of significance (95% confidence), but do not reject the null hypothesis at the 1% level of significance (99% confidence).
Let p denote the proportion of vacant positions. Then the hypotheses are:
H0: p >= 0.16
H1: p < 0.16
Left-tailed test.
Sample statistic = 39/300 = 0.13; SE = √(0.16 × 0.84/300) = 0.0212
Calculated test statistic: Z = (0.13 - 0.16)/0.0212 = -1.415
From the table of the Z-distribution, the one-tailed Z(5%) value is -1.645 (negative because the test is left-tailed).
Comparing the calculated and table values of the test statistic, we do not reject the null hypothesis that 16% of faculty positions are vacant. A numerical check follows.
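A numerical check of Ex 12 as a left-tailed z-test for a proportion, using the hypothesized proportion in the standard error, as in the working above.

```python
# Sketch: Ex 12 as a left-tailed z-test for a proportion.
import math
from scipy import stats

p0, x, n = 0.16, 39, 300
p_s = x / n                                   # 0.13
se = math.sqrt(p0 * (1 - p0) / n)             # ~0.0212
z_calc = (p_s - p0) / se                      # ~-1.415

alpha = 0.05
z_crit = stats.norm.ppf(alpha)                # ~-1.645 (left tail)
p_value = stats.norm.cdf(z_calc)
verdict = "reject H0" if z_calc < z_crit else "do not reject H0"
print(f"z={z_calc:.3f}, z_crit={z_crit:.3f}, p={p_value:.4f} -> {verdict}")
```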