INTO University of East Anglia
Quantitative Methods for Economics
Graduate Diploma Coursework Assignment

Student Name: Vasechko Dmitriy
Student Number: 004137
Group: GDB

SECTION A

Q1. The following table contains grouped data on the weekly wages of 75 workers in a particular industry.

Wages                Frequency
£250.00-£259.99        8
£260.00-£269.99       10
£270.00-£279.99       16
£280.00-£289.99       15
£290.00-£299.99       10
£300.00-£319.99       10
£320.00-£379.99        6
Total                 75

(a) Construct a histogram for the frequency distribution. Be careful to take account of the varying interval width.

[Figure: histogram of the wage frequencies; horizontal axis marked at 250.00, 260.00, 270.00, 280.00, 290.00, 300.00, 320.00 and 379.99]

(b) Derive the cumulative frequency distribution.

Wage frequency
Wages             Frequency   Percent   Valid Percent   Cumulative Percent
250.00-259.99        8         10.7        10.7              10.7
260.00-269.99       10         13.3        13.3              24.0
270.00-279.99       16         21.3        21.3              45.3
280.00-289.99       15         20.0        20.0              65.3
290.00-299.99       10         13.3        13.3              78.7
300.00-319.99       10         13.3        13.3              92.0
320.00-379.99        6          8.0         8.0             100.0
Total               75        100.0       100.0

(c) Construct an ogive.

[Figure: ogive; horizontal axis labelled with the wage classes 250.00-259.99 to 320.00-379.99]

(d) Use the ogive to estimate the median wage. Place an interpretation on this value.

The median is the middle item when the data are placed in ascending (or descending) order, so it sits at position (n+1)/2 = 38. The cumulative frequency is 34 at the end of the 270.00-279.99 class and 49 at the end of the 280.00-289.99 class, so the median lies in the 280.00-289.99 class. Reading the ogive at half the total frequency (37.5) gives approximately 280 + 10 × (37.5 − 34)/15 ≈ £282.33. Interpretation: about half of the workers earn less than roughly £282 per week and half earn more.

(e) Use the frequency distribution to estimate the mean wage. What does a comparison of mean and median tell us about the nature of the distribution?

Class             Midpoint   Frequency   Frequency × Midpoint
250.00-259.99       255          8            2040
260.00-269.99       265         10            2650
270.00-279.99       275         16            4400
280.00-289.99       285         15            4275
290.00-299.99       295         10            2950
300.00-319.99       310         10            3100
320.00-379.99       350          6            2100
Total                           75           21515

Estimate of mean = 21515/75 = 286.87

Compare the mean and median: Mean = 286.87, Median ≈ 282.33, so Mean > Median. The difference is small (about £4.50), so the distribution is only slightly positively (right) skewed: the few wages in the wide upper classes pull the mean above the median. A typical wage therefore lies between about £282 and £287.

Q2. To obtain a driving licence, you need to pass a multiple choice test on the highway code. Let's assume that the test contains 12 questions. For each question you must choose between three possible answers of which only one is correct. You pass the test with nine or more correct answers. If you don't know an answer, you choose randomly from the three possibilities. Joe has turned up to take the test. He is completely ignorant of the highway code, and guesses all 12 answers. Let X be the number of questions he answers correctly.

(a) Find the probability distribution of X.

This is a binomial distribution, so the probabilities can be calculated with the formula:

$$P_X(x) = {}^{n}C_x \, P(\text{correct})^{x} \, P(\text{incorrect})^{\,n-x}$$

with n = 12, P(correct) = 1/3 and P(incorrect) = 2/3. The values below were computed with the rounded probabilities 0.33 and 0.67.

x     Px(x)
0     0.008183
1     0.048364
2     0.131015
3     0.215099
4     0.238374
5     0.187853
6     0.107945
7     0.045572
8     0.014029
9     0.003071
10    0.000454
11    0.000041
12    0.000002

(b) What is the probability of Joe, in his complete ignorance, passing the test? Do you think the pass-mark is high enough? Explain.

Joe will pass the test only if he gets 9 or more answers right:

P(pass) = Px(9) + Px(10) + Px(11) + Px(12) = 0.003568 ≈ 0.36%

The probability of Joe, in his complete ignorance, passing the test is about 0.36%. I think the pass mark is too low, because even somebody without any knowledge has a small chance of passing the test and could then cause a traffic accident.

(c) What is the probability of Joe answering twelve out of twelve questions correctly?

From part (a), Px(12) = 0.000002. The probability of Joe answering twelve out of twelve questions correctly is 0.000002, which is about 0.0002%.

(d) Should Joe feel really unlucky if he gets no answers right? Explain.
From part (a), Px(0) = 0.008183 ≈ 0.82%. No, I think he should not feel really unlucky. The chance of getting no answers right is about 1 in 100, which is small but far larger than the chance of getting 12 out of 12 right.

Q3. From a travel brochure, you record the weekly rental charges on holiday villas in Spain and Portugal. You obtain this information on 8 Spanish properties and 18 Portuguese properties. The rental charges are, when converted into pounds (£):

Spain: 180, 200, 210, 230, 240, 270, 310, 310
Portugal: 130, 150, 160, 170, 180, 190, 200, 200, 200, 210, 210, 220, 230, 240, 240, 260, 260, 280

You are interested in the population mean rental charge for each country.

(a) For each country find a 95% confidence interval for the population mean rental charge.

Let's use SPSS to calculate the mean, standard deviation and variance of the samples.

Statistics
                    Spain      Portugal
N  Valid              8          18
   Missing           10           0
Mean                243.75      207.22
Std. Deviation       48.972      40.410
Variance           2.398E3     1.633E3

The 95% confidence interval for the population mean is given by:

$$\bar X \pm t_{n-1,\,0.025} \frac{S}{\sqrt{n}}$$

Both samples are small (fewer than 25 observations) and the population variances are unknown, so we use the t-distribution table to find $t_{n-1}$: Spain ($t_7$ = 2.37); Portugal ($t_{17}$ = 2.11).

The 95% confidence interval for the population mean of Spain is:
243.75 − 2.37 × (48.972/2.828) < μ < 243.75 + 2.37 × (48.972/2.828)
202.71 < μ < 284.79

The 95% confidence interval for the population mean of Portugal is:
207.22 − 2.11 × (40.410/4.24) < μ < 207.22 + 2.11 × (40.410/4.24)
187.11 < μ < 227.33

(b) Which of the two confidence intervals is narrower, and why?

The Portugal confidence interval is narrower. The Portuguese sample has a lower variance than the Spanish sample, which tells us that its data are closer to the mean. It is also larger (18 against 8 observations), which reduces the standard error $S/\sqrt{n}$ and gives a smaller critical value (2.11 against 2.37).

(c) Assume that the two populations have the same variance. Estimate this common variance using the formula for the "pooled sample variance". Deduce the "pooled standard deviation".
The formula is:

$$S_p = \sqrt{\frac{(n-1)s_x^2 + (m-1)s_y^2}{n+m-2}} = \sqrt{\frac{(8-1)\,2398 + (18-1)\,1633}{8+18-2}} = \sqrt{1856.125} = 43.08$$

The pooled sample variance is 1856.125 and the "pooled standard deviation" is 43.08.

(d) Conduct a test of whether there is a difference between the population means of the two countries (hint: 2-tailed test). Quote the p-value.

Let's set up 2 hypotheses:

H0: μ1 = μ2
H1: μ1 ≠ μ2

where μ1 is the mean of Spain and μ2 is the mean of Portugal. The significance level is 5% and it is a 2-tailed test. The test statistic has a t distribution with 8 + 18 − 2 = 24 degrees of freedom, so the critical value is $t_{24,\,0.025}$ = 2.06.

Let's calculate the test statistic using the formula:

$$t = \frac{\bar X_1 - \bar X_2}{S_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} = \frac{243.75 - 207.22}{43.08\sqrt{\frac{1}{8}+\frac{1}{18}}} = 1.996$$

We get 1.996 < 2.06, so the statistic falls just inside the no-rejection region and we do not reject H0 at the 5% level. From Table 2, 1.71 < 1.996 < 2.06, so the two-tailed p-value lies between 0.05 and 0.10. The evidence of a difference between the population means of the two countries is suggestive but not significant at 5%.

(e) Again assuming equal variances, conduct a test of whether prices are higher, on average, in Spain than in Portugal. Quote the p-value. What is the relationship between the p-value of this test and the p-value in part (d)?

Let's set up 2 hypotheses:

H0: μ1 ≤ μ2
H1: μ1 > μ2

where μ1 is the mean of Spain and μ2 is the mean of Portugal. The significance level is 5% and it is a 1-tailed test, so the critical value is $t_{24,\,0.05}$ = 1.71.

The test statistic is the same as in part (d):

$$t = \frac{\bar X_1 - \bar X_2}{S_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} = \frac{243.75 - 207.22}{43.08\sqrt{\frac{1}{8}+\frac{1}{18}}} = 1.996$$

We get 1.996 > 1.71, so the statistic falls into the rejection region and we reject H0. That means that prices are higher, on average, in Spain than in Portugal. The one-tailed p-value lies between 0.025 and 0.05. It is half of the two-tailed p-value in part (d), because the same test statistic is used and only the upper tail is counted.

SECTION B

Q4. You are interested in house prices in East Anglia. In particular, you are interested to know if there is a significant difference in house prices between Cambridge and Norwich. You collect a random sample of 10 terraced houses in Cambridge, and a random sample of 15 terraced houses in Norwich, and record their prices.
Descriptive statistics for the two samples are contained in the following table. Prices are measured in thousands of pounds.

                              Cambridge   Norwich
sample size                      10          15
sample mean                     238.16      211.34
sample standard deviation        14.76       18.62

(a) We are going to assume that the population standard deviation of prices is the same in the two cities. Combine the two sample standard deviations to obtain a "pooled" sample standard deviation, Sp.

The formula is:

$$S_p = \sqrt{\frac{(n-1)s_x^2 + (m-1)s_y^2}{n+m-2}} = \sqrt{\frac{(10-1)\,14.76^2 + (15-1)\,18.62^2}{10+15-2}} = \sqrt{296.286} = 17.21$$

The "pooled standard deviation" is 17.21.

(b) Conduct a two-sample t-test in order to see if there is a significant difference in prices between the two cities. Quote the p-value.

Let's set up 2 hypotheses:

H0: μ1 = μ2
H1: μ1 ≠ μ2

where μ1 is the mean of Cambridge and μ2 is the mean of Norwich. The significance level is 5% and it is a 2-tailed test. The test statistic has a t distribution with 10 + 15 − 2 = 23 degrees of freedom, so the critical value is $t_{23,\,0.025}$ = 2.07.

Let's calculate the test statistic using the formula:

$$t = \frac{\bar X_1 - \bar X_2}{S_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} = \frac{238.16 - 211.34}{17.21\sqrt{\frac{1}{10}+\frac{1}{15}}} = 3.817$$

We get 3.817 > 2.07, so the statistic falls into the rejection region and we reject H0. That means that there is a significant difference in prices between the two cities. Since 3.817 also exceeds $t_{23,\,0.005}$ = 2.81, the two-tailed p-value is below 0.01.

(c) Suppose you have an a priori belief that house prices are higher in Cambridge than in Norwich. Conduct a test in order to see if your belief is correct. Quote the p-value.

Let's set up 2 hypotheses:

H0: μ1 ≤ μ2
H1: μ1 > μ2

where μ1 is the mean of Cambridge and μ2 is the mean of Norwich. The significance level is 5% and it is a 1-tailed test, so the critical value is $t_{23,\,0.05}$ = 1.71.

The test statistic is the same as in part (b):

$$t = \frac{238.16 - 211.34}{17.21\sqrt{\frac{1}{10}+\frac{1}{15}}} = 3.817$$

We get 3.817 > 1.71, so the statistic falls into the rejection region and we reject H0. That means that prices are higher, on average, in Cambridge than in Norwich. The one-tailed p-value is below 0.005.

Q5. Explain the meaning of:

i.
'Probability' and 'conditional probability'

Probability is a measure, between 0 and 1, of the chance that an event occurs in a given trial. Conditional probability is the probability of some event A, given the occurrence of some other event B. Conditional probability is written P(A|B).

ii. The 'probability space' and 'mutually exclusive events'.

A probability space is a measure space in which the measure of the whole space is equal to 1. Mutually exclusive events are events which cannot occur at the same time.

iii. 'Relative frequency' and 'cumulative frequency'.

Relative frequency is the number of times an event happened divided by the total number of trials (attempts). Cumulative frequency is the frequency of a random variable below a particular level. It tells how often the value of the random variable is less than or equal to a particular reference value.

iv. The 'sample mean' and the 'mean of the population'.

The sample mean is the average of the sample (a small group taken from the population). The mean of the population is the average over every member of the population of interest.

v. A 'random variable' and a 'confidence interval'. The 'variance', 'standard deviation' and 'expected value of an estimator'.

A random variable is a variable whose value depends on the outcome of a random experiment and which follows a probability distribution. A confidence interval is a range of values, calculated from a sample, that contains the population parameter with a stated level of confidence (for example 95%).

Variance is a measure of statistical dispersion. The formula is:

$$s^2 = \frac{\sum x^2 - \left(\sum x\right)^2 / n}{n-1}$$

Standard deviation is the square root of the variance and is a measure of statistical dispersion as well. It tells us how tightly the data are spread around the mean (average) of a set of data.

The expected value of an estimator may be interpreted as the long-run average value of the estimator over repeated samples.

vi. The 'Binomial' and the 'Poisson' distributions.

The binomial distribution applies when there are n identical trials. Each trial has only two possible outcomes: success or failure (0 or 1, True or False, Head or Tail).
The probability of success is the same in each trial and the result of each trial is independent of the other trials. The Poisson distribution is the 'rare event' distribution: it is used to count occurrences when the probability of occurrence in any one trial is low.

vii. 'Chebyshev's inequality' and the 'Central Limit Theorem'.

Chebyshev's inequality states that in any data sample or probability distribution, nearly all the values are close to the mean value: no more than 1/k² of the values can lie more than k standard deviations away from the mean. The Central Limit Theorem states that the mean of a large random sample is approximately normally distributed, whatever the shape of the population distribution.

viii. 'Biasness', 'efficiency' and 'skewness and kurtosis'.

Bias means that the expected value of a statistic is not equal to the true value of the parameter. Efficiency compares unbiased statistics: the more efficient estimator is the one with the smaller variance. Skewness is a measure of the asymmetry of the probability distribution of a random variable. It can be positive (the right tail is longer) or negative (the left tail is longer). Kurtosis describes how peaked or flat a distribution is compared to the normal distribution.

ix. 'Point estimation' and 'Interval estimation'.

Point estimation gives a single value as the estimate. Interval estimation gives a range of values and so expresses the degree of uncertainty.

Q6. Explain the following (Use short notes, equations, diagrams and/or calculations):

i. Three basic rules of probability theory
1. P(∅) = 0. The smallest possible probability is 0 (∅ is the null set).
2. P(S) = 1. The largest possible probability is 1 (S is the sample space).
3. 0 ≤ P(E) ≤ 1. Probabilities exist on the interval from 0 to 1.

ii. Bayes' theorem.
Bayes' theorem relates the conditional and marginal probabilities of events A and B, where B has a non-vanishing probability. The formula is:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(B \mid A)\,P(A)}{P(B)}$$

iii. The characteristics of the standard normal distribution
1. Symmetry about its mean μ.
2. Unimodal.
3. Extends from x = −∞ to +∞.
4. Bell-shaped curve.
5. The mode and median both equal the mean μ.

iv. Four characteristics of time series type data.
.......
v. A and B are events such that Pr(A) = 0.4 and Pr(B) = 0.75, find:

a) Pr(B) if A and B are mutually exclusive.
Pr(B) = 0.75? This does not seem possible: for mutually exclusive events Pr(A) + Pr(B) should not be more than 1, but here it is 1.15.

b) Pr(B) if A and B are independent.
Pr(B) = 0.75

c) The probability that A and B occur simultaneously.
Assuming independence, P(A and B) = P(A) × P(B) = 0.4 × 0.75 = 0.3

Q7. (a) Suppose that Y ~ N(6, 2) and $\bar Y$ is the mean of a simple random sample of size n. Find:

(i) Pr[Y > 8];
First let's find z:
$$z = \frac{8-6}{\sqrt{2}} = 1.41$$
Then, from the standard normal distribution table, P(z > 1.41) = 0.079

(ii) Pr[$\bar Y$ > 8] when n = 1;
First let's find z:
$$z = \frac{8-6}{\sqrt{2/1}} = 1.41$$
Then, from the standard normal distribution table, P(z > 1.41) = 0.079

(iii) Pr[$\bar Y$ > 8] when n = 2;
First let's find z:
$$z = \frac{8-6}{\sqrt{2/2}} = 2$$
Then, from the standard normal distribution table, P(z > 2) = 0.023

(iv) Pr[$\bar Y$ > 8] when n = 5;
First let's find z:
$$z = \frac{8-6}{\sqrt{2/5}} = 3.16$$
Then, from the standard normal distribution table, P(z > 3.16) = 0.0008

(v) Sketch on the same axes the sampling distribution of $\bar Y$ for n = 1, 2, 5

(b) In a population 60% of all adults own a car. If a simple random sample of 100 adults is taken, what is the probability that at least 70% of the sample will be car owners?

Population proportion p = 0.6
Sample proportion = 0.7
n = 100
Variance of the sample proportion = p(1 − p)/n = 0.0024

First let's find z:
$$z = \frac{0.7-0.6}{\sqrt{0.0024}} = 2.04$$
Then, from the standard normal distribution table, P(z > 2.04) = 0.021

The probability that at least 70% of the sample will be car owners is 2.1%.

(c) When set correctly a machine produces hamburgers of mean weight 100g each and standard deviation of 5g. The weight of hamburgers is normally distributed. These hamburgers are sold in packets of four. With Y the weight of an individual hamburger in grams, Y ~ N(100, 25).

i. What is the sampling distribution of the total weight of hamburgers in a packet?

Total weight = 4 × 100 = 400. Total variance = 4 × 25 = 100.
So the mean of the total weight of hamburgers in a packet is 400g and the variance is 100g². The total weight of a packet in grams is therefore distributed as N(400, 100).

ii. Explain what results you are using and what assumptions you make.

There are four hamburgers in a packet. If the weight of one hamburger is normally distributed, the total weight of a packet of hamburgers is normally distributed as well. The mean of the total weight of hamburgers in a packet is four times the mean of an individual hamburger: 4 × 100 = 400g. Assuming the four weights are independent, the variances add up, so the variance of the total weight of hamburgers in a packet is 4 × 25 = 100g².

(d) A customer claims that packages of hamburgers are underweight. A trading standards officer (government person responsible for maintaining quality) is sent to investigate. He selects one packet of four and finds that the weight of hamburgers in it is 390g.

i. What is the probability of a packet weighing as little as 390g if the machine is set correctly?

First let's find z:
$$z = \frac{390-400}{\sqrt{100}} = -1$$
Then, from the standard normal distribution table, P(z < −1) = P(z > 1) = 0.1587

The probability of a packet weighing as little as 390g if the machine is set correctly is 15.87%.

ii. Would you describe the findings as clear 'evidence' that the machine is set to deliver underweight hamburgers?

No, we cannot conclude that the machine is set to deliver underweight hamburgers. A probability of 15.87% is not small, and the finding rests on only one packet, which is not enough for clear 'evidence': the bigger the sample, the clearer the evidence.

Q8. (a) A simple random sample of 15 people attending a certain school is found to have an average IQ score of 107.3, while a random sample of 12 pupils at another school has an average IQ score of 104.1. Obtain a 95% confidence interval for the difference between the IQ scores of pupils at the two schools when:

i. The true variances of the IQ scores for children at the two schools are 39 and 58 respectively.

Normal distributions:
$$(\bar X_1 - \bar X_2) \sim N\!\left(\mu_1 - \mu_2,\; \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$$

Using the formula:
$$\bar X_1 - \bar X_2 \pm 1.96\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = 3.2 \pm 5.34 = [-2.14;\ 8.54]$$

The 95% confidence interval for the difference between the IQ scores of pupils in the two schools is [−2.14, 8.54].

ii. State your assumptions.

The IQ scores in each school are normally distributed with known variances, and the two samples are independent simple random samples.

iii. The true variances of the IQ scores for children are unknown, but the sample variances are 32.5 and 56.5 respectively.

The formula is:
$$S_p = \sqrt{\frac{(n-1)s_x^2 + (m-1)s_y^2}{n+m-2}} = \sqrt{\frac{(15-1)\,32.5 + (12-1)\,56.5}{15+12-2}} = \sqrt{43.06} = 6.56$$

The "pooled standard deviation" is 6.56. From the t-distribution table, $t_{25}$ = 2.06. Then:
$$\bar X_1 - \bar X_2 \pm 2.06\, S_p \sqrt{\frac{1}{n} + \frac{1}{m}} = 3.2 \pm 2.06 \times 6.56 \times 0.3873 = [-2.03;\ 8.43]$$

The confidence interval for the difference between the IQ scores of pupils in the two schools is [−2.03, 8.43].

iv. State your assumptions.

In addition to normality and independent samples, we assume that $\sigma_1^2 = \sigma_2^2$.

v. Do you consider that there is strong evidence for a claim that pupils attending the first school have a higher mean IQ score than those attending the second school?

No. Both confidence intervals contain zero, and the samples are not big enough to give clear evidence that pupils attending the first school have a higher mean IQ score than those attending the second school.

(b) In an opinion poll based on 100 interviews, 34 people say that they are not satisfied with the level of support from local government. Find a 99% confidence interval for the true proportion of people who are not satisfied with their local government services.

Sample proportion p = 34/100 = 0.34
For 99% confidence the two-tailed critical value is z = 2.576.

Interval estimate:
$$p \pm 2.576\sqrt{\frac{p(1-p)}{n}} = 0.34 \pm 2.576 \times 0.0474 = [0.218;\ 0.462]$$

(c) Weekly wages in a particular industry are known to be normally distributed, but the variance of this distribution is unknown. An accountant claims that mean weekly income in this industry is £72.40.
A random sample of 15 workers yields a mean income of £73.20 with a sample standard deviation of £2.10. Do a test to find out if the accountant's information is out of date, using a level of significance of 5%.

H0: μ = £72.40
H1: μ ≠ £72.40

The significance level is 5% and the test is 2-tailed. With n − 1 = 14 degrees of freedom, the critical value from the t-table is $t_{14,\,0.025}$ = 2.15.

Calculate the test statistic with the formula:
$$t = \frac{\bar X - \mu}{\sqrt{s^2/n}} = \frac{73.20 - 72.40}{\sqrt{2.10^2/15}} = 1.475$$

We get 1.475 < 2.15, so the statistic falls into the no-rejection region and we do not reject H0. There is no significant evidence that the accountant's figure of £72.40 is out of date.

Q9. (a) You are a member of a scientific team of advisors considering whether the recent outbreaks of bird flu in Norfolk (a bird illness that can be transferred to humans) and elsewhere have any health consequences for the general population. You need to do hypothesis testing.

i. What would be your null hypothesis?

The H0 hypothesis is that the recent outbreaks of bird flu have some health consequences for the general population.

ii. Explain what Type I and Type II errors are in this context.

A Type I error is to reject H0 when the recent outbreaks of bird flu do have health consequences for the general population. A Type II error is to accept H0 when the outbreaks of bird flu do not have any health consequences for the general population.

iii. Outline the costs involved in making Type I and Type II errors.

The cost of a Type I error is higher than the cost of a Type II error, because in that case the government will decide not to pay enough attention to the problem and more people can die because of the illness (cost = people's lives). The cost of a Type II error is lower, because it only leads to spending money on unnecessary measures and does not cause any health consequences for people (cost = money).

iv. What advice would you give as to the strength of sample evidence required for rejection of your null hypothesis?

It is preferable to risk a Type II error rather than a Type I error. The null hypothesis should therefore be rejected only on very strong sample evidence, that is, at a small significance level.
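The trade-off between the two error types in part (a) can be made concrete with a two-tailed z-test. The sketch below is illustrative only: the helper name `error_rates` and the example numbers (a null mean of 72.40, a known standard deviation of 2.10, a sample of 35 and a hypothetical true mean of 73.20) are assumptions chosen for the illustration and are not part of the bird flu question.

```python
from math import sqrt
from statistics import NormalDist


def error_rates(mu0, true_mu, sigma, n, alpha=0.05):
    """Return (Type I rate, Type II rate) for a two-tailed z-test of H0: mu = mu0.

    Hypothetical helper: assumes normally distributed data with known sigma.
    """
    z = NormalDist()                        # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)       # about 1.96 when alpha = 0.05
    shift = (true_mu - mu0) / (sigma / sqrt(n))
    # Type II error: the test statistic stays inside (-z_crit, z_crit)
    # even though the true mean differs from mu0.
    beta = z.cdf(z_crit - shift) - z.cdf(-z_crit - shift)
    return alpha, beta


# Illustrative numbers only (assumed, see the lead-in above).
for alpha in (0.10, 0.05, 0.01):
    a, b = error_rates(72.40, 73.20, 2.10, 35, alpha)
    print(f"alpha = {a:.2f}  ->  Type II rate = {b:.3f}")
```

Under these assumptions, demanding stronger evidence before rejecting H0 (a smaller alpha) lowers the Type I rate but raises the Type II rate, which is the trade-off the advice in part iv has to weigh.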
(b) Weekly wages in a particular industry are known to be normally distributed, with a standard deviation of £2.10. An accountant claims that mean weekly income in this industry is £72.40. A random sample of 35 workers yields a mean income of £73.20.

i. State an appropriate null and alternative hypothesis.

H0: μ = 72.40
H1: μ ≠ 72.40

ii. Obtain the p-value for the hypothesis test.

The significance level is 5%, so for this 2-tailed test the critical value is z = 1.96.

Calculate the test statistic with the formula:
$$z = \frac{\bar X - \mu}{\sqrt{\sigma^2/n}} = \frac{73.20 - 72.40}{\sqrt{2.10^2/35}} = 2.25$$

p-value = 2 × P(z > 2.25) = 2 × 0.012 = 0.024

iii. What does the p-value show? Then interpret the claim made by the accountant.

The p-value is 0.024 = 2.4%, and 2.4% < 5%. In this case we should reject H0 because the test statistic falls into the rejection region. It means that the mean weekly income in this industry is not equal to £72.40.

Q10. The regional water company is interested to know the effect of water metering on water consumption. It collects a random sample of 50 unmetered households and 100 metered households, and records their daily water consumption in litres. Summary statistics are presented in the following table.

                              unmetered   metered
sample size                       50        100
mean of daily consumption        305        275
standard deviation                45         32

Let μ1 be the population mean of daily consumption by unmetered households, and let μ2 be the same for metered households.

(a) Find the pooled standard deviation, sp. [3]

Calculate the pooled standard deviation using the formula:
$$S_p = \sqrt{\frac{(n-1)s_x^2 + (m-1)s_y^2}{n+m-2}} = \sqrt{\frac{(50-1)\,45^2 + (100-1)\,32^2}{50+100-2}} = 36.82$$

The "pooled standard deviation" is 36.82.

(b) Test the null hypothesis μ1 = μ2 against the alternative μ1 > μ2. Interpret the result of the test. [5]
.......

(c) Quote the p-value for the test carried out in (b). How strong is the evidence that metering affects consumption? [3]
........

(d) Why do you think that metered households tend to consume less water than unmetered households?
[5]
Metered households tend to consume less water because they know that the bigger their consumption, the bigger their water bills. Unmetered households, by contrast, do not need to care about how much water they consume.

Quantitative Methods Formulae Sheet

1. Descriptive statistics

Mean:
$$\bar X = \frac{\sum X_i}{n}$$

Variance:
$$S^2 = \frac{\sum (X_i - \bar X)^2}{n-1} = \frac{\sum X_i^2 - n\bar X^2}{n-1}$$

The standard deviation, S, is the square root of the variance.

2. The combinatorial formula
$${}^{n}C_r = \frac{n!}{(n-r)!\,r!}$$

3. Binomial probabilities
$$p(X) = {}^{n}C_X\, p^X (1-p)^{n-X}, \qquad X = 0, 1, 2, \ldots, n$$

4. Bayes' Rule
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid \bar A)\,P(\bar A)}$$

5. Confidence Intervals and Hypothesis Tests (one sample)

A 95% confidence interval for the population mean, μ, is given by:
$$\bar X \pm t_{n-1,\,0.025} \frac{S}{\sqrt{n}}$$

To test H0: μ = μ0, use:
$$t = \frac{\bar X - \mu_0}{S/\sqrt{n}}$$

The test statistic t has a t(n−1) distribution under H0.

6. The two-sample t-test
$$t = \frac{\bar X_1 - \bar X_2}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
where:
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$

The test statistic t has a t(n1 + n2 − 2) distribution under H0: μ1 = μ2.

Table 1: The standard normal distribution

To find the area to the right of a number z, look down the left hand column for the first decimal place of z. Then look along the top row for the second decimal place. The number read from the centre of the table is the required area.
z      .00   .01   .02   .03   .04   .05   .06   .07   .08   .09
0.0   .500  .496  .492  .488  .484  .480  .476  .472  .468  .464
0.1   .460  .456  .452  .448  .444  .440  .436  .433  .429  .425
0.2   .421  .417  .413  .409  .405  .401  .397  .394  .390  .386
0.3   .382  .378  .374  .371  .367  .363  .359  .356  .352  .348
0.4   .345  .341  .337  .334  .330  .326  .323  .319  .316  .312
0.5   .309  .305  .302  .298  .295  .291  .288  .284  .281  .278
0.6   .274  .271  .268  .264  .261  .258  .255  .251  .248  .245
0.7   .242  .239  .236  .233  .230  .227  .224  .221  .218  .215
0.8   .212  .209  .206  .203  .200  .198  .195  .192  .189  .187
0.9   .184  .181  .179  .176  .174  .171  .169  .166  .164  .161
1.0   .159  .156  .154  .152  .149  .147  .145  .142  .140  .138
1.1   .136  .133  .131  .129  .127  .125  .123  .121  .119  .117
1.2   .115  .113  .111  .109  .107  .106  .104  .102  .100  .099
1.3   .097  .095  .093  .092  .090  .089  .087  .085  .084  .082
1.4   .081  .079  .078  .076  .075  .074  .072  .071  .069  .068
1.5   .067  .066  .064  .063  .062  .061  .059  .058  .057  .056
1.6   .055  .054  .053  .052  .051  .049  .048  .047  .046  .046
1.7   .045  .044  .043  .042  .041  .040  .039  .038  .038  .037
1.8   .036  .035  .034  .034  .033  .032  .031  .031  .030  .029
1.9   .029  .028  .027  .027  .026  .026  .025  .024  .024  .023
2.0   .023  .022  .022  .021  .021  .020  .020  .019  .019  .018
2.1   .018  .017  .017  .017  .016  .016  .015  .015  .015  .014
2.2   .014  .014  .013  .013  .013  .012  .012  .012  .011  .011
2.3   .011  .010  .010  .010  .010  .009  .009  .009  .009  .008
2.4   .008  .008  .008  .008  .007  .007  .007  .007  .007  .006
2.5   .006  .006  .006  .006  .006  .005  .005  .005  .005  .005
2.6   .005  .005  .004  .004  .004  .004  .004  .004  .004  .004
2.7   .003  .003  .003  .003  .003  .003  .003  .003  .003  .003
2.8   .003  .002  .002  .002  .002  .002  .002  .002  .002  .002
2.9   .002  .002  .002  .002  .002  .002  .002  .001  .001  .001
3.0   .001  .001  .001  .001  .001  .001  .001  .001  .001  .001

Critical values of the standard normal distribution
P(Z > 1.282) = 0.10
P(Z > 1.645) = 0.05
P(Z > 1.960) = 0.025
P(Z > 2.326) = 0.01
P(Z > 2.576) = 0.005

Table 2: Critical values of the t-distribution

df     α = 0.10   α = 0.05   α = 0.025   α = 0.01   α = 0.005
1        3.08       6.31      12.71       31.82      63.66
2        1.89       2.92       4.30        6.97       9.93
3        1.64       2.35       3.18        4.54       5.84
4        1.53       2.13       2.78        3.75       4.60
5        1.48       2.02       2.57        3.37       4.03
6        1.44       1.94       2.45        3.14       3.71
7        1.42       1.90       2.37        3.00       3.50
8        1.40       1.86       2.31        2.90       3.36
9        1.38       1.83       2.26        2.82       3.25
10       1.37       1.81       2.23        2.76       3.17
11       1.36       1.80       2.20        2.72       3.11
12       1.36       1.78       2.18        2.68       3.06
13       1.35       1.77       2.16        2.65       3.01
14       1.35       1.76       2.15        2.62       2.98
15       1.34       1.75       2.13        2.60       2.95
16       1.34       1.75       2.12        2.58       2.92
17       1.33       1.74       2.11        2.57       2.90
18       1.33       1.73       2.10        2.55       2.88
19       1.33       1.73       2.09        2.54       2.86
20       1.33       1.73       2.09        2.53       2.85
21       1.32       1.72       2.08        2.52       2.83
22       1.32       1.72       2.07        2.51       2.82
23       1.32       1.71       2.07        2.50       2.81
24       1.32       1.71       2.06        2.49       2.80
25       1.32       1.71       2.06        2.49       2.79
26       1.32       1.70       2.06        2.48       2.78
27       1.31       1.70       2.05        2.47       2.77
28       1.31       1.70       2.05        2.47       2.76
29       1.31       1.70       2.04        2.46       2.76
30       1.31       1.70       2.04        2.46       2.75
40       1.30       1.68       2.02        2.42       2.70
50       1.30       1.68       2.01        2.40       2.68
60       1.30       1.67       2.00        2.39       2.66
70       1.29       1.67       1.99        2.38       2.65
80       1.29       1.66       1.99        2.37       2.64
90       1.29       1.66       1.99        2.37       2.63
100      1.29       1.66       1.98        2.36       2.63
125      1.29       1.66       1.98        2.36       2.62
150      1.29       1.65       1.98        2.35       2.61
200      1.29       1.65       1.97        2.35       2.60
∞        1.28       1.64       1.96        2.33       2.58

Table of Chi-square statistics

df     P = 0.05   P = 0.01   P = 0.001
1        3.84       6.64      10.83
2        5.99       9.21      13.82
3        7.82      11.35      16.27
4        9.49      13.28      18.47
5       11.07      15.09      20.52
6       12.59      16.81      22.46
7       14.07      18.48      24.32
8       15.51      20.09      26.13
9       16.92      21.67      27.88
10      18.31      23.21      29.59
11      19.68      24.73      31.26
12      21.03      26.22      32.91
13      22.36      27.69      34.53
14      23.69      29.14      36.12
15      25.00      30.58      37.70
16      26.30      32.00      39.25
17      27.59      33.41      40.79
18      28.87      34.81      42.31
19      30.14      36.19      43.82
20      31.41      37.57      45.32
21      32.67      38.93      46.80
22      33.92      40.29      48.27
23      35.17      41.64      49.73
24      36.42      42.98      51.18
25      37.65      44.31      52.62
26      38.89      45.64      54.05
27      40.11      46.96      55.48
28      41.34      48.28      56.89
29      42.56      49.59      58.30
30      43.77      50.89      59.70
31      44.99      52.19      61.10
32      46.19      53.49      62.49
33      47.40      54.78      63.87
34      48.60      56.06      65.25
35      49.80      57.34      66.62
36      51.00      58.62      67.99
37      52.19      59.89      69.35
38      53.38      61.16      70.71
39      54.57      62.43      72.06
40      55.76      63.69      73.41
41      56.94      64.95      74.75
42      58.12      66.21      76.09
43      59.30      67.46      77.42
44      60.48      68.71      78.75
45      61.66      69.96      80.08
46      62.83      71.20      81.40
47      64.00      72.44      82.72
48      65.17      73.68      84.03
49      66.34      74.92      85.35
50      67.51      76.15      86.66
51      68.67      77.39      87.97
52      69.83      78.62      89.27
53      70.99      79.84      90.57
54      72.15      81.07      91.88
55      73.31      82.29      93.17
56      74.47      83.52      94.47
57      75.62      84.73      95.75
58      76.78      85.95      97.03
59      77.93      87.17      98.34
60      79.08      88.38      99.62
61      80.23      89.59     100.88
62      81.38      90.80     102.15
63      82.53      92.01     103.46
64      83.68      93.22     104.72
65      84.82      94.42     105.97
66      85.97      95.63     107.26
67      87.11      96.83     108.54
68      88.25      98.03     109.79
69      89.39      99.23     111.06
70      90.53     100.42     112.31
71      91.67     101.62     113.56
72      92.81     102.82     114.84
73      93.95     104.01     116.08
74      95.08     105.20     117.35
75      96.22     106.39     118.60
76      97.35     107.58     119.85
77      98.49     108.77     121.11
78      99.62     109.96     122.36
79     100.75     111.15     123.60
80     101.88     112.33     124.84
81     103.01     113.51     126.09
82     104.14     114.70     127.33
83     105.27     115.88     128.57
84     106.40     117.06     129.80
85     107.52     118.24     131.04
86     108.65     119.41     132.28
87     109.77     120.59     133.51
88     110.90     121.77     134.74
89     112.02     122.94     135.96
90     113.15     124.12     137.19
91     114.27     125.29     138.45
92     115.39     126.46     139.66
93     116.51     127.63     140.90
94     117.63     128.80     142.12
95     118.75     129.97     143.32
96     119.87     131.14     144.55
97     120.99     132.31     145.78
98     122.11     133.47     146.99
99     123.23     134.64     148.21
100    124.34     135.81     149.48
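The chi-square critical values tabulated above can be sanity-checked numerically. The sketch below uses the Wilson-Hilferty normal approximation, which is an assumption of this example and not necessarily how the table was produced; the function name `chi2_critical_approx` is hypothetical, and the four comparison values are copied from the P = 0.05 column.

```python
from math import sqrt
from statistics import NormalDist


def chi2_critical_approx(df, p):
    """Approximate the chi-square value with upper-tail area p (Wilson-Hilferty).

    This is an approximation for checking purposes, not an exact quantile.
    """
    z = NormalDist().inv_cdf(1 - p)     # upper-tail standard normal quantile
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * sqrt(c)) ** 3


# A few P = 0.05 entries copied from the table above.
tabulated = {10: 18.31, 30: 43.77, 60: 79.08, 100: 124.34}

for df, table_value in tabulated.items():
    approx = chi2_critical_approx(df, 0.05)
    print(f"df = {df:3d}  table = {table_value:7.2f}  approx = {approx:7.2f}")
```

The approximation is weakest for very small degrees of freedom, so it is better suited to spotting transcription slips in the larger rows than to replacing the table.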