Class Notes 2019
Department of Statistics
WST143 – Foundation Mathematical Statistics

Name & Surname: _______________________________
Student number: ________________________________
Cellphone number: ______________________________
E-mail address: _________________________________

“Work hard in silence, let success make the noise”

Copyright Reserved

Table of Contents

Introduction: Distributions
Chapter 6 – Continuous Probability Distributions
  6.1 Continuous Uniform Probability Distribution
  6.2 Normal Probability Distribution
  6.3 The F distribution
  Lab session component 1: Creating a graph of a probability density function in Excel
    L1.1: Creating and modifying charts
    L1.2: Self-evaluation Exercise 1
  Lab session component 2: Probability
    L2.1: Rand() and Randbetween() functions
    L2.2: Random number generator
    L2.3: Self-evaluation Exercise 2a
    L2.4: Normal probabilities and percentiles in Excel
      L2.4.1: Normal probabilities
      L2.4.2: Normal percentiles
    L2.5: Self-evaluation Exercise 2b
    L2.6: Percentile estimates
    L2.7: Self-evaluation Exercise 2c
  Chapter 6 Self Evaluation Questions
Chapter 7: Sampling and Sampling Distributions
  7.3 Point Estimation
  7.4 Introduction to Sampling Distributions
  7.5 Sampling distribution of $\bar{X}$
  7.6 Sampling Distribution of $\bar{p}$
  Additional notes on absolute values
  Chapter 7 Self Evaluation Questions
Chapter 8: Interval Estimation
  8.1 Population mean: $\sigma$ known
  8.2 Population mean: $\sigma$ unknown
  8.3 Determining the Sample Size
  8.4 Population proportion
  Lab session component 3: Confidence intervals in Excel
    L3.1: Confidence intervals for the population mean ($\sigma$ known case)
    L3.2: Confidence intervals for the population mean ($\sigma$ unknown case)
    L3.3: Confidence intervals for the population proportion
    L3.4: Self-evaluation Exercise 3
  Chapter 8 Self Evaluation Questions
Chapter 9 Hypothesis tests
  9.1 Developing the null and alternative hypotheses
  9.2 Type I and Type II Errors
  9.3 Population mean: $\sigma$ known
  9.4 Population mean: $\sigma$ unknown
  9.5 Population proportion
  Lab session component 4: Hypothesis testing in Excel
    L4.1: Hypothesis tests for the population mean ($\sigma$ known case)
    L4.2: Self-evaluation Exercise 4
  Chapter 9 Self Evaluation Questions
Chapter 10: Statistical inference about means with two populations
  10.1 Inferences about the difference between two population means: $\sigma_1$ and $\sigma_2$ known
  10.2 Inferences about the difference between two population means: $\sigma_1$ and $\sigma_2$ unknown
  10.3 The Difference Between Two Population Means: Matched pairs
  10.4 The Difference Between Two Population Proportions
  Chapter 10 Self Evaluation Questions
Chapter 11: Statistical inferences about two population variances
  11.2 The difference between two population variances
Hypothesis Testing Summary
  Hypothesis Testing Tree Diagram
Cumulative probabilities for the standard normal distribution
  Probability tables for the F distribution
WST143 Formula list
Optimisation Techniques
  Chapter 2: Differentiation
  Chapter 3: Integration
  Expected values
  Moment Generating Functions
Solutions to Self Evaluation Questions
  Chapter 6
  Chapter 7
  Chapter 8
  Chapter 9
  Chapter 10
Revision Exercise – Chapter 5
Revision Exercise – Chapter 5 – Solution
Additional Exercises
  Chapter 6
  Chapter 7
  Chapter 8

"6 months of genuine focus and alignment can put you 5 years ahead in life. Don't underestimate the power of consistency. You have what it takes to become the best. Harness your power. Exceed your expectation."

Introduction: Distributions

What do we know?

What about these?

Let us investigate the following graphical presentation of the distribution of variables.

Discrete variable

Consider the experiment of tossing two coins and let $X$ = the number of heads.

[Graph: probability mass function $p(x)$ against $x = 0, 1, 2$; vertical axis from 0 to 0.6]

Is the above presentation the same as:

Continuous variable

Plant scientists have developed a new variety of corn with increased amounts of protein. In a test to see what the effect on the growth of chickens is, an experimental group of 20 one-day-old male chicks was fed a ration containing the new corn. The following table summarises the weight gained (in grams) after 21 days (grouped data), as well as descriptive statistics calculated on the actual data set.

Weight (in grams)    Frequency (number of chicks)    Relative frequency
(300 ; 330]          2                               0.1
(330 ; 360]          1                               0.05
(360 ; 390]          2                               0.1
(390 ; 420]          9                               0.45
(420 ; 450]          4                               0.2
(450 ; 480]          2                               0.1
Total                20

Draw any suitable graph to estimate the percentage of chickens that gained at least 435g.

How could we have used the following histogram to answer the question?

and…

Could you have said an estimate of the percentage of chickens that gained at least 435g is 0.2 + 0.1? Why/why not?

Why not?
If we reason that 0.2 is the relative frequency of a chicken gaining between 420 and 450 grams AND that the relative frequency of gaining between 420 and 435 grams IS THE SAME as the relative frequency of gaining between 435 and 450 grams, what would a reasonable relative frequency be for gaining between 435 and 450 grams?

Chapter 6 – Continuous Probability Distributions

Introduction

In WST133 we focused on discrete random variables and three discrete probability distributions, namely the binomial, discrete uniform and geometric distributions. We will now shift our attention to continuous random variables, their properties and a few continuous probability distributions.

The fundamental difference between discrete and continuous random variables is the way in which probabilities are calculated. For a discrete random variable $X$, the probability mass function $p(x)$ is used to calculate the probability that the random variable takes on a specific value. For a continuous random variable $X$, on the other hand, the probability density function $f(x)$ does not directly provide probabilities. The probability that the random variable takes on a value in a specific interval, for example the interval $[a, b]$, is calculated by finding the area under the function $f(x)$ between $a$ and $b$. This definition implies that the probability of a continuous random variable taking on a specific value is zero, since the area under $f(x)$ at a single point is zero.

Comparison between discrete and continuous probability distributions:

Discrete probability distribution
• The probability mass function $p(x)$ provides the probability that the random variable assumes a particular value.
• $0 \le p(x) \le 1$ for all $x$.
• $\sum_x p(x) = 1$.

Continuous probability distribution
• The probability density function $f(x)$ does NOT directly provide probabilities.
• $f(x) \ge 0$ for all $x$, $-\infty < x < \infty$.
• The AREA under the probability density function $f(x)$ for values of $X$: $-\infty < x < \infty$ equals one.

Class exercise:
A supplier of paraffin has a 150 ℓ tank that is filled at the beginning of each week. His weekly demand shows a relative frequency behaviour that increases steadily up to 100 ℓ and then levels off between 100 and 150 ℓ. If $Y$ denotes the weekly demand in hundreds of litres, the relative frequency of demand can be modelled by

$$f(y) = \begin{cases} y, & 0 \le y \le 1 \\ 1, & 1 < y \le 1.5 \\ 0, & \text{elsewhere} \end{cases}$$

a. Represent $f(y)$ graphically
b. Use geometry to verify that $f(y)$ is indeed a probability density function
c. Find P(___ ≤ Y ≤ ___)
d. Find P(___ ≤ Y ≤ ___)

6.1 Continuous Uniform Probability Distribution

If $a < b$, a random variable $X$ is said to have a continuous uniform probability distribution on the interval $(a, b)$ if and only if the density function of $X$ is:

$$f(x) = \begin{cases} \dfrac{1}{b - a}, & a \le x \le b \\ 0, & \text{elsewhere} \end{cases}$$

“Short hand”-notation: $X \sim uni(a, b)$

$$E(X) = \frac{a + b}{2} \qquad var(X) = \frac{(b - a)^2}{12}$$

Example (p 253):
The flight time from Chicago to New York is uniformly distributed between 120 and 140 minutes.
Let $X$ = the flight time (in minutes) of an airplane traveling from Chicago to New York.
$X \sim uni(120, 140)$

$$f(x) = \begin{cases} \dfrac{1}{140 - 120} = \dfrac{1}{20}, & 120 \le x \le 140 \\ 0, & \text{elsewhere} \end{cases}$$

Note: $f(x)$ is called the probability density function.

Questions:
1. Calculate the probability that the flight time will be between 120 and 130 minutes.
$P(120 < X < 130) = P(120 \le X \le 130) = \Delta x \cdot f(x) = (130 - 120)\left(\frac{1}{20}\right) = (10)\left(\frac{1}{20}\right) = 0.5$

2. Calculate the probability that the flight time will be 125 minutes.
$P(X = 125) = 0$
Note 1: $P(X = a) = 0$ for any constant $a$.
Note 2: $P(a \le X \le b) = P(X = a) + P(a < X < b) + P(X = b) = 0 + P(a < X < b) + 0 = P(a < X < b)$
Therefore, $P(a \le X \le b) = P(a < X < b)$.

3. Calculate the probability that the flight time will be between 125 and 150 minutes.
$P(125 < X < 150) = P(125 < X < 140) + P(140 < X < 150)$
$= \Delta x \cdot f(x) + \Delta x \cdot f(x)$
$= (15)\left(\frac{1}{20}\right) + (10)(0)$
$= 0.75$
The density is zero above 140 minutes, so the interval from 140 to 150 contributes nothing.

4.
The 75th percentile of $X$ is:
$P(120 < X < x) = 0.75$
$(x - 120)\left(\frac{1}{20}\right) = 0.75$
$x - 120 = (0.75)(20)$
$x = 135$
$\therefore P_{75} = 135$

5. Calculate the expected value, variance and standard deviation of $X$:
$E(X) = \frac{a + b}{2} = \frac{120 + 140}{2} = 130$
$Var(X) = \frac{(b - a)^2}{12} = \frac{(140 - 120)^2}{12} = 33.\overline{3}$
$stdev(X) = \sqrt{33.\overline{3}} = 5.77$

Example:
A random variable $X$ is uniformly distributed between 10 and 20.
a) Sketch:
b) $P(X < 15) = \Delta x \cdot f(x) = (15 - 10)(0.1) = 0.5$
c) $P(12 < X < 18) = \Delta x \cdot f(x) = (18 - 12)(0.1) = 0.6$
d) $E(X) = \frac{a + b}{2} = \frac{10 + 20}{2} = 15$
e) $Var(X) = \frac{(b - a)^2}{12} = \frac{(20 - 10)^2}{12} = 8.\overline{3}$
f) $stdev(X) = \sqrt{8.\overline{3}} = 2.8868$

More examples: Recommended material B2 (Williams et al, pg 256) – Exercises 1 – 6

6.2 Normal Probability Distribution

A random variable $Y$ is said to have a normal probability distribution if and only if, for $\sigma > 0$ and $-\infty < \mu < \infty$, the density function of $Y$ is

$$f(y) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(y - \mu)^2}{2\sigma^2}}, \qquad -\infty < y < \infty$$

where
$\mu$ = mean
$\sigma^2$ = variance
$\pi$ = 3.1416
$e$ = 2.7183

“Short-hand” notation: $Y \sim N(\mu, \sigma^2)$

Characteristics:
1. The members of the family of normal probability distributions are told apart by their means and standard deviations/variances.
2. The highest point on the normal curve is at the:
   (i) Mean ($\mu$)
   (ii) Median ($P_{50}$)
   (iii) Mode
3. The mean $\mu$ can be any numerical value.
4. The curve is symmetric around $\mu$, and its tails extend to infinity in both directions and theoretically never touch the horizontal axis.
5. Larger $\sigma$ ⇒ larger variability ⇒ flatter curves.
6. a. The total area under the curve is 1.
   b. The total area under the curve to the left of $\mu$ is always 0.5.
   c. The total area under the curve to the right of $\mu$ is always 0.5.

Note: Since the normal probability distribution is symmetric, the empirical rule is valid. In fact, the empirical rule can be derived from the standard normal distribution. Try to do this.
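One way to check your derivation of the empirical rule numerically is sketched below. These notes use Excel throughout, so the use of Python and of `statistics.NormalDist` from its standard library (Python 3.8 or later) is our own assumption, and `central_area` is a hypothetical helper name chosen for illustration. The sketch computes the area under the standard normal curve within one, two and three standard deviations of the mean.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
Z = NormalDist(mu=0.0, sigma=1.0)


def central_area(k: float) -> float:
    """Area under the standard normal curve between -k and +k.

    Hypothetical helper for illustration; it is not part of the notes.
    """
    return Z.cdf(k) - Z.cdf(-k)


# Areas within 1, 2 and 3 standard deviations of the mean.
for k in (1, 2, 3):
    print(f"P(-{k} < Z < {k}) = {central_area(k):.4f}")
```

The three areas are roughly 68%, 95% and 99.7%, which are the percentages quoted in the empirical rule. Because any normal variable can be standardized, the same areas apply to every $N(\mu, \sigma^2)$ distribution.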
Standard normal probability distribution

If $Z$ is normally distributed with $\mu = 0$ and $\sigma = 1$, then $Z$ is said to have a standard normal distribution.

$$Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$$

Tables for the Standard Normal probability distribution are available at the end of the notes.

Example: Exercise 17 (B2 – pg 271)

Using Excel to Compute Standard Normal Probabilities

Syntax: NORM.S.DIST(z, cumulative) - where z is the z-value for which you would like to calculate either the probability to the left (if cumulative = “True”) or the value of the density function (if cumulative = “False”). Note that this function only applies to the standard normal distribution.

Examples:
1. $P(Z < 1) = P(Z \le 1)$
   =NORM.S.DIST(1,TRUE)
   Answer: 0.8413

2. $P(0 < Z < 1) = P(0 \le Z \le 1) = P(Z < 1) - P(Z < 0)$
   =NORM.S.DIST(1,TRUE)-NORM.S.DIST(0,TRUE)
   Answer: 0.3413

3. $P(-1 < Z < 1) = P(-1 \le Z \le 1) = P(Z < 1) - P(Z < -1)$
   =NORM.S.DIST(1,TRUE)-NORM.S.DIST(-1,TRUE)
   Answer: 0.6827

4. $P(Z > 1.58) = P(Z \ge 1.58) = 1 - P(Z < 1.58)$
   =1-NORM.S.DIST(1.58,TRUE)
   Answer: 0.0571

5. $P(Z < -0.498) \approx P(Z < -0.50)$
   Note: When using the tables, -0.498 is rounded to -0.50, but -0.492, for example, would be rounded to -0.49. Excel evaluates the unrounded value, so its answer differs slightly from the table value of 0.3085 for $z = -0.50$.
   =NORM.S.DIST(-0.498,TRUE)
   Answer: 0.3092

6. $P(Z > 1.47) = 1 - P(Z < 1.47)$
   =1-NORM.S.DIST(1.47,TRUE)
   Answer: 0.0708

7. $P(Z < -1.47)$
   =NORM.S.DIST(-1.47,TRUE)
   Answer: 0.0708
   Note: The answers to questions 6 and 7 are the same due to symmetry.

8. $P(Z < -3.3) = P(Z \le -3.3) \approx 0$
   Recall that we have an outlier when $z < -3$ or $z > 3$. Therefore, the probability that $Z$ is less than $-3.3$ is approximately zero.
   =NORM.S.DIST(-3.3,TRUE)
   Answer: 0.000483

The Inverse Standard Normal distribution

Given: The area under the curve
Calculate: z-value (i.e.
a percentile for a standard normal random variable)

Syntax: NORM.S.INV(probability) – where the probability that is given to the function is the area under the curve to the left of the required z-value. Note that this function returns a value from the standard normal distribution.

Examples:
1. The area to the left of $z$ is 0.6331.
   =NORM.S.INV(0.6331)
   Answer: 0.3401

2. Calculate the z-value so that the probability to get a larger z-value is 0.1.
   =NORM.S.INV(0.9)
   Answer: 1.28155

3. The area to the right of $z$ is 0.119.
   =NORM.S.INV(1-0.119) OR =NORM.S.INV(0.881)
   Answer: 1.1800

4. The area to the left of $z$ is 0.33.
   =NORM.S.INV(0.33)
   Answer: -0.4399

Percentiles

Recall that in Chapter 3 of the textbook we calculated the $p$th percentile for a sample of size $n$ by first calculating the position of the percentile. This was done by making use of the formula $i = \left(\frac{p}{100}\right) n$. When finding percentiles of a random variable from a normal distribution we will, however, follow a different procedure.

If the 5th percentile is mentioned (for example) then we know that the area to the left of the point is 0.05. If the 95th percentile is mentioned (for example) then we know that the area to the left of the point is 0.95.

Once we know what the area to the left of a point is, we can get the corresponding z-value. We can then calculate the percentile using this z-value.

The corresponding z-value to the 5th percentile: $z = -1.645$
The corresponding z-value to the 95th percentile: $z = 1.645$

Example: Suppose $\mu = 100$ and $\sigma = 5$.
For the 5th percentile: $z = \frac{x - \mu}{\sigma}$, $\therefore -1.645 = \frac{x - 100}{5}$, $\therefore x = (-1.645)(5) + 100 = 91.775$.
For the 95th percentile: $z = \frac{x - \mu}{\sigma}$, $\therefore 1.645 = \frac{x - 100}{5}$, $\therefore x = (1.645)(5) + 100 = 108.225$.
These $x$ values are the 5th and 95th percentiles respectively.

Computing probabilities for any Normal Probability Distribution:

$X$ = number of miles a set of tires will last
Given:
i. Data is normally distributed
ii. $\mu$ = 36 500 miles
iii.
$\sigma$ = 5 000 miles

Question 1:
Calculate the probability that a rear tire will not last more than 20 000 miles:
Answer:
First we standardize the $x$ value: $z = \frac{x - \mu}{\sigma} = \frac{20\,000 - 36\,500}{5\,000} = -3.3$.
Therefore, $P(X < 20\,000) = P(Z < -3.3) \approx 0$ (using the properties of outliers)
Using Excel: $P(X < 20\,000)$
Excel: =NORM.DIST(20000, 36500, 5000, TRUE)
Answer: 0.0005

Question 2:
What percentage of the tires can be expected to last more than 40 000 miles?
Answer:
First we standardize the $x$ value: $z = \frac{x - \mu}{\sigma} = \frac{40\,000 - 36\,500}{5\,000} = 0.7$.
Therefore, $P(X > 40\,000) = P(Z > 0.70) = 1 - P(Z < 0.70) = 1 - 0.7580 = 0.242$
Using Excel: $P(X > 40\,000) = 1 - P(X < 40\,000)$
Excel: =1-NORM.DIST(40000, 36500, 5000, TRUE)
Answer: 0.24196

Question 3:
Calculate the probability that a tire’s lifetime is between 20 000 and 40 000 miles:
Answer:
First we standardize the $x$ values:
$z = \frac{x - \mu}{\sigma} = \frac{40\,000 - 36\,500}{5\,000} = 0.7$ and $z = \frac{x - \mu}{\sigma} = \frac{20\,000 - 36\,500}{5\,000} = -3.3$
Therefore, $P(20\,000 < X < 40\,000) = P(-3.3 < Z < 0.70) = P(Z < 0.70) - P(Z < -3.3) = 0.7580 - 0.0005 = 0.7575$
Using Excel: $P(20\,000 < X < 40\,000) = P(X < 40\,000) - P(X < 20\,000)$
Excel: =NORM.DIST(40000, 36500, 5000, TRUE)-NORM.DIST(20000, 36500, 5000, TRUE)
Answer: 0.7576

Question 4:
How long must the guarantee period be so that less than 2.5% of the tires that are under guarantee will be replaced?
Answer:
If we know the area to the left of $z$ equals 0.025, we can find the corresponding z-value:
$z = \frac{x - \mu}{\sigma}$, $\therefore -1.96 = \frac{x - 36\,500}{5\,000}$
Now we solve for $x$: $x = (-1.96)(5\,000) + 36\,500 = 26\,700$
Using Excel:
Excel: =NORM.INV(0.025, 36500, 5000)
Answer: 26700.19

Question 5:
Compute the minimum tire mileage for the top 2.5% of rear tires.
Answer:
If we know the area to the left of $z$ equals 0.975, we can find the corresponding z-value:
$z = \frac{x - \mu}{\sigma}$, $\therefore 1.96 = \frac{x - 36\,500}{5\,000}$
Now we solve for $x$: $x = (1.96)(5\,000) + 36\,500 = 46\,300$
Using Excel:
Excel: =NORM.INV(0.975,36500,5000)
Answer: 46299.81

6.3 The F distribution

The F distribution is a positively skewed distribution that can only take on positive values. The shape of the distribution is fully defined by two parameters, namely $\nu_1$ and $\nu_2$, known respectively as the numerator and denominator degrees of freedom.

Notes:
• The following special relationship exists for F values:

$$F_{\nu_1, \nu_2; \alpha} = \frac{1}{F_{\nu_2, \nu_1; 1 - \alpha}}$$

• The F distribution has important underlying assumptions. We will, however, not consider these assumptions and will only focus on the application of the distribution to testing hypotheses regarding the variances of two independent samples. The relevant assumptions for this test are stated in Chapter 11.

Lab session component 1: Creating a graph of a probability density function in Excel

Outcomes:
At the end of this section you should be able to
• create a graph of a given function in Excel,
• use the graph created in Excel to verify that a function is a valid probability density function, and
• create graphs of probability density functions for illustration purposes in your project work assignments.

L1.1: Creating and modifying charts

A graph of a probability density function can be created in Excel by following a few simple steps. The process will be explained using an example. Consider the following probability density function:

$$f(y) = \begin{cases} \frac{3}{2} y^2 + y, & 0 \le y \le 1 \\ 0, & \text{elsewhere} \end{cases}$$

In order to plot the density function, we need to find some points that lie on the density curve. In our example, the density function is defined for values of $y$ in the interval $[0, 1]$. We therefore need to calculate values of the density function at points within this interval.
The calculation of these values at a few selected points is shown in Figure 1.

(a) Formulae used to calculate points on the density curve.
(b) Calculated points on the density curve.
Figure 1: Calculation of points needed to plot the density curve.

After calculating points on the density curve, we need to decide whether to use smooth or straight lines to draw the density curve. From the form of the function itself, it is clear that the density curve is not a straight line and should be drawn using a smooth line. This can be done by choosing the ‘Smooth Lines’ option under the ‘Scatter’ option. The resulting graph is shown in Figure 2. It is important to note that the ‘Straight Lines’ option should be used if the density function is linear. If the density function is curved, a more accurate graph can be obtained by plotting more points.

Figure 2: The ‘Smooth Lines’ option and plotted density curve.

L1.2: Self-evaluation Exercise 1

Use Excel to plot the following functions, then use geometry to determine whether the functions are density functions.

1. $$f(y) = \begin{cases} y, & 0 \le y \le 1 \\ 1, & 1 < y < 1.5 \\ 0, & \text{elsewhere} \end{cases}$$

2. $$f(y) = \begin{cases} 6y(1 - y), & 0 \le y \le 1 \\ 0, & \text{elsewhere} \end{cases}$$

3. $$f(y) = \begin{cases} 6y(1 - y), & -1 \le y \le 2 \\ 0, & \text{elsewhere} \end{cases}$$

4. $$f(y) = \begin{cases} y(1 - y) + 1, & -1 \le y \le 2 \\ 0, & \text{elsewhere} \end{cases}$$

Lab session component 2: Probability

Outcomes:
At the end of this section you should be able to
• generate ‘observations’ using the rand() and randbetween() functions,
• generate ‘observations’ from different distributions using Excel's Random Number Generator,
• calculate empirical probabilities and percentiles for samples generated using Excel's Random Number Generator,
• calculate normal probabilities for values using Excel's norm.s.dist() and norm.dist() functions,
• plot the probability density function of a $N(\mu, \sigma^2)$ distribution,
• calculate normal percentiles using Excel's norm.s.inv() and norm.inv() functions.
• use the norm.s.dist(), norm.dist(), norm.s.inv() and norm.inv() functions to calculate critical values and p-values in the context of hypothesis testing for your project work assignments.

L2.1: Rand() and Randbetween() functions

You should already be familiar with the rand() function that was introduced to you in the WST 133 Practical Guide. This function randomly generates values between 0 and 1. However, if we are interested in generating random integers between two specified values, we cannot use this function without making some adjustments. An easier way of accomplishing this is by making use of the randbetween() function. In this function we need to specify the minimum and maximum values that we want generated.

Example formula        What it does
=randbetween(0,10)     Generates random integers between 0 and 10, inclusive
=randbetween(5,50)     Generates random integers between 5 and 50, inclusive

Once you have generated the necessary values, remember to ‘lock’ the numbers by selecting ‘Manual’ from the ‘Calculation Options’ button in the ‘Formulas’ tab. If you do not do this, the values will change each time you change something in Excel.

L2.2: Random number generator

The Random Number Generation analysis tool fills a range of cells with independent random numbers that are drawn from one of several distributions. You can characterize the subjects in a population with a probability distribution. For example, you can use a normal distribution to characterize the population of individuals' heights, or you can use a Bernoulli distribution of two possible outcomes to characterize the population of coin-flip results.

We will not focus on the theory of the different distributions in this guide. For now we only want to become familiar with how the Random Number Generator works. We will explain this by making use of an example. Say we would like to generate 10 values from a Normal (bell-shaped) distribution with mean 0 and variance 1.

1.
On the ‘Data’ tab, in the ‘Analysis’ group, click ‘Data Analysis’.
2. In the ‘Analysis Tools’ box, click ‘Random Number Generation’, and then click OK. The dialog box that subsequently appears is shown below.
3. The ‘Number of Variables’ box can be interpreted as the number of samples we would like to generate. In this case, we are only interested in generating 1 sample of size 10. We therefore enter ‘1’ in this box.
4. The ‘Number of Random Numbers’ box can be interpreted as the number of observations we are interested in. In this case we enter ‘10’.
5. Next we need to choose the appropriate distribution from which to generate values. Our options here include ‘Uniform’, ‘Bernoulli’, ‘Binomial’, ‘Normal’, ‘Poisson’, ‘Patterned’ and ‘Discrete’. Each one of these options requires specific parameters to be entered. For now, we will tell you which distribution to use as well as the required parameters. For our example, choose ‘Normal’ with mean ‘0’ and standard deviation ‘1’.
6. The ‘Random Seed’ value is optional. If we leave this box empty, the numbers that are generated will be completely random. If, however, we specify a seed value, the same random numbers can be obtained again in future by specifying the same parameters and seed value. To ensure that you obtain the same random numbers as us, specify a seed value of ‘5’.
7. The ‘Output options’ are the same as we had with the ‘Histogram’ tool in WST133.
8. Click on ‘OK’.

The final view of the dialog box as well as the random numbers obtained is shown on the next page.

If we wanted to generate 2 samples of size 10 each, sample 1 would have been placed in column A while sample 2 would be placed in column B. Play around with this generator and see how changing different settings will affect your output.

L2.3: Self-evaluation Exercise 2a

1. Use the randbetween() function to generate 30 values between 0 and 5.
Create a bar chart to show how the values are distributed.
2. Use the ‘Random Number Generator’ to generate 10, 100, 1000 and 10000 values from a normal distribution with a mean of 10 and a variance of 16. Repeat this process using seed values of 26, 52, 15 and 8. In each case, use the ‘Histogram’ tool to create a frequency distribution and a chart of the observations. Comment on what you observe.
3. For each of the samples generated in question 2, calculate the mean and variance. Comment on what you observe.
4. For each of the samples generated in question 2, calculate the percentage of observations that lie within one, two, three and four standard deviations of the mean. Complete the following tables and comment on what you observe.

Seed = 26
$n$ | Number of standard deviations: 1 | 2 | 3 | 4
10 | | | |
100 | | | |
1000 | | | |
10000 | | | |

Seed = 52
$n$ | Number of standard deviations: 1 | 2 | 3 | 4
10 | | | |
100 | | | |
1000 | | | |
10000 | | | |

Seed = 15
$n$ | Number of standard deviations: 1 | 2 | 3 | 4
10 | | | |
100 | | | |
1000 | | | |
10000 | | | |

Seed = 8
$n$ | Number of standard deviations: 1 | 2 | 3 | 4
10 | | | |
100 | | | |
1000 | | | |
10000 | | | |

L2.4: Normal probabilities and percentiles in Excel

L2.4.1: Normal probabilities

Given a value from a normal distribution, Excel can calculate the probability of obtaining an observation smaller than the specified value. For a standard normal distribution this is done with the norm.s.dist() function, while the norm.dist() function is used for all other normal distributions.

The syntax for the norm.s.dist() function is given by
NORM.S.DIST(z, cumulative)
where z is the value for which you want the probability and cumulative can be set to either TRUE or FALSE.

Example: Let $Z \sim N(0,1)$. Then $P(Z < 2.16)$ is calculated using the Excel code norm.s.dist(2.16,TRUE) as 0.98461. The value of the density function at the point 2.16 is calculated by norm.s.dist(2.16,FALSE) as 0.03871. When cumulative is set to FALSE, the norm.s.dist() function can therefore be used to plot the density function.
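The two Excel results in the example above can be cross-checked outside Excel. The following is a minimal sketch in Python, assuming Python 3.8 or later for `statistics.NormalDist`; the helper names `norm_s_dist` and `norm_dist` are hypothetical wrappers chosen only to mirror the Excel functions and are not part of any library.

```python
from statistics import NormalDist


def norm_s_dist(z, cumulative):
    """Hypothetical stand-in for Excel's NORM.S.DIST(z, cumulative)."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    # cumulative=True returns P(Z < z); cumulative=False returns the density at z
    return std_normal.cdf(z) if cumulative else std_normal.pdf(z)


def norm_dist(x, mean, standard_dev, cumulative):
    """Hypothetical stand-in for Excel's NORM.DIST(x, mean, standard_dev, cumulative)."""
    dist = NormalDist(mu=mean, sigma=standard_dev)
    return dist.cdf(x) if cumulative else dist.pdf(x)


print(round(norm_s_dist(2.16, True), 5))   # P(Z < 2.16), about 0.98461
print(round(norm_s_dist(2.16, False), 5))  # density at 2.16, about 0.03871
```

Evaluating `norm_s_dist(z, False)` over a grid of z-values gives the points needed to plot the density curve, in the same way that a column of norm.s.dist(z,FALSE) values does in Excel.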
The syntax for the norm.dist() function is given by
NORM.DIST(x, mean, standard_dev, cumulative)
where x is the value for which you want the probability, mean and standard_dev specify the parameters of the required normal distribution and cumulative can again be set to either TRUE or FALSE.

L2.4.2: Normal percentiles

Percentiles for the normal distribution can be calculated in Excel using the norm.s.inv() function in the case of the standard normal distribution, and the norm.inv() function in the case of all other normal distributions.

The syntax for the norm.s.inv() function is given by
NORM.S.INV(probability)
where probability specifies the percentile to be calculated.

Example: Let $Z \sim N(0,1)$. Then the 80th percentile is calculated using the Excel code norm.s.inv(0.8) as 0.84162.

The syntax for the norm.inv() function is given by
NORM.INV(probability, mean, standard_dev)
where probability specifies the percentile to be calculated and mean and standard_dev specify the parameters of the required normal distribution.

Figure 3: Summary of the norm.s.dist(), norm.dist(), norm.s.inv() and norm.inv() functions.

L2.5: Self-evaluation Exercise 2b

1. Plot the density function for $Z \sim N(0,1)$.
• Using the appropriate Excel functions, calculate $P(Z < 3.67)$. [Solution: 0.999879]
• Using the appropriate Excel functions, calculate $P(Z > -1.43)$. [Solution: 0.923641]
• Using the appropriate Excel functions, calculate $P(-1.75 < Z < 0.89)$. [Solution: 0.773208]
• Using the appropriate Excel functions, calculate the 70th percentile. [Solution: 0.524401]
2. Plot the density function for $X \sim N(25, 25)$.
• Using the appropriate Excel functions, calculate $P(X < 23.5)$. [Solution: 0.382089]
• Using the appropriate Excel functions, calculate the 17th percentile. [Solution: 20.229174]
3. Plot the density function for $X \sim N(25, 5)$.
• Using the appropriate Excel functions, calculate $P(X < 23.5)$.
[Solution: 0.251167]
• Using the appropriate Excel functions, calculate the 17th percentile. [Solution: 22.866422]

Figure 4: Graphs for the density functions given in questions 2 and 3: (a) $X \sim N(25, 25)$, (b) $X \sim N(25, 5)$

L2.6: Percentile estimates

In Section L2.4.2 we saw how Excel can be used to calculate the theoretical percentiles of a variable with a normal distribution. We will now explore how Excel can be used to calculate empirical percentiles when we have a sample of observations available. The ‘Empirical Data 2018.xlsx’ file (available on ClickUP) will be used to explain this application.

In the worksheet labelled ‘Raw Data’, 250 samples of size 50 are given. These values were generated from a $UNIF(10, 25)$ distribution. The sample average (mean) was calculated for each of the samples. In Section 7.5 you will learn that $\bar{X} \sim N(\mu_X, \sigma_X^2 / n)$, approximately, since $n > 30$. We can calculate the theoretical parameters of our original variable $X$ as

$$\mu_X = \frac{a + b}{2} = \frac{10 + 25}{2} = 17.5$$

and

$$\sigma_X^2 = \frac{(b - a)^2}{12} = \frac{(25 - 10)^2}{12} = \frac{225}{12} = 18.75$$

Using these values we can now calculate the theoretical distribution of $\bar{X}$. Since $\sigma_X^2 / n = 18.75 / 50 = 0.375$, the 75th percentile of $\bar{X}$ can be estimated by calculating the 75th percentile of a $N(17.5, 0.375)$ random variable. Using the norm.inv() function in Excel we find that

$$\bar{x}_{0.75} \approx \text{norm.inv}(0.75, 17.5, \text{sqrt}(0.375)) = 17.913039$$

This percentile can also be estimated from the sample data. We will do this by making use of the percentile.inc() function. The syntax for this is given as
PERCENTILE.INC(array, k)
where array is the range containing the different $\bar{x}$ sample values and $k$, $0 \le k \le 1$, specifies the $(100k)$th percentile. To get an estimate $\hat{\bar{x}}_{0.75}$ for the 75th percentile of $\bar{X}$ using the data in the worksheet labelled ‘Practical 4 data’, we use the following Excel code:

$$\hat{\bar{x}}_{0.75} \approx \text{PERCENTILE.INC}(B{:}B, 0.75) = 17.90420164$$

It is clear that the theoretical and empirical estimates are very close to each other.
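The comparison between the theoretical and the empirical percentile can also be sketched in Python. This is only an illustration under stated assumptions: the worksheet data are not available here, so 250 sample means are simulated with an arbitrary seed (2019), and `statistics.quantiles(..., method="inclusive")` is used on the assumption that its interpolation rule corresponds to PERCENTILE.INC. The simulated value will therefore not equal the 17.90420164 obtained from the worksheet.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, quantiles

# Theoretical 75th percentile of the sample mean, N(17.5, 18.75 / 50)
mu, variance, n = 17.5, 18.75, 50
theoretical = NormalDist(mu, sqrt(variance / n)).inv_cdf(0.75)

# Simulated stand-in for the worksheet: 250 samples of size 50 from UNIF(10, 25)
rng = random.Random(2019)  # arbitrary seed, chosen only for repeatability
sample_means = [mean([rng.uniform(10, 25) for _ in range(n)]) for _ in range(250)]

# The third quartile cut point is the empirical 75th percentile
empirical = quantiles(sample_means, n=4, method="inclusive")[2]

print(round(theoretical, 6))
print(round(empirical, 6))
```

With more than 250 simulated samples the empirical value would be expected to settle closer to the theoretical one.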
As the sample size $n$ increases, the empirical estimate should approach the theoretical estimate of the percentile.

L2.7: Self-evaluation Exercise 2c

Open the file ‘Self Evaluation Exercise 2c Data.xlsx’. The values given in this file were generated from a normal distribution with mean 10 and standard deviation 5. Various empirical and theoretical values were calculated for this data and the results are given below. Check that you are able to obtain the same results.

Chapter 6 Self Evaluation Questions

Questions 1 to 5 are based on the following information:
The time spent waiting in queues (in minutes) to buy tickets for a soccer match is uniformly distributed between 25 and 40 minutes. Let $X$ = time (in minutes) spent in queues.

1. The probability density function of $X$ is:
a) $f(x) = \begin{cases} \frac{1}{15}, & \text{where } 25 \le x \le 40 \\ 0, & \text{elsewhere} \end{cases}$
b) $f(x) = \begin{cases} 0, & \text{where } 25 \le x \le 40 \\ \frac{1}{15}, & \text{elsewhere} \end{cases}$
c) $f(x) = \begin{cases} \frac{1}{65}, & \text{where } 25 \le x \le 40 \\ 0, & \text{elsewhere} \end{cases}$
d) $f(x) = \begin{cases} 0, & \text{where } 25 \le x \le 40 \\ \frac{1}{65}, & \text{elsewhere} \end{cases}$
e) $f(x) = \begin{cases} \frac{1}{45}, & \text{where } 25 \le x \le 40 \\ 0, & \text{elsewhere} \end{cases}$
2. Calculate the 75th percentile of $X$.
3. Calculate the variance of $X$.
4. The probability that the time (in minutes) spent in queues is more than 22 minutes is?
5. The probability that the time (in minutes) spent in queues is between 27 and 36 minutes is?
6. In the following probability statement, $P(Z > a) = 0.95$, the value $a$ represents:
Given: $Z$ is a standard normal random variable.
a) the median of the standard normal distribution.
b) the 5th percentile of the standard normal distribution.
c) the 95th percentile of the standard normal distribution.
d) the standard error of the standard normal distribution.
e) the 95th value of the standard normal distribution.
7. The IQs of people are normally distributed with an average of 100 and a standard deviation of 15. Calculate the 90th percentile of the IQ values.
Hint: The value of NORM.S.INV(0.1) in Excel is -1.282.
Chapter 7: Sampling and Sampling Distributions

7.1 – 7.2 Revision of Semester 1

7.3 Point Estimation

Quantity | Population parameter | Sample statistic (point estimator) | Sampling error
Mean | $\mu = \frac{\sum x_i}{N}$ | $\bar{x} = \frac{\sum x_i}{n}$ | $|\bar{x} - \mu|$ OR $|\bar{X} - \mu|$
Variance | $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$ | $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$ | $|s^2 - \sigma^2|$ OR $|S^2 - \sigma^2|$
Standard deviation | $\sigma = \sqrt{\sigma^2}$ | $s = \sqrt{s^2}$ | $|s - \sigma|$ OR $|S - \sigma|$
Proportion | $p$ | $\bar{p} = \frac{x}{n}$ | $|\bar{p} - p|$ OR $|\bar{P} - p|$

Example: The lifetimes (in years) of 10 VCRs are as follows:
6.5 8.0 6.2 7.4 7.0 8.4 9.5 4.6 5.0 7.4

What is the point estimate of the population average for the lifetime of the VCRs?
$$\bar{x} = \frac{\sum x_i}{n} = \frac{6.5 + 8.0 + 6.2 + 7.4 + 7.0 + 8.4 + 9.5 + 4.6 + 5.0 + 7.4}{10} = \frac{70}{10} = 7$$

What is the point estimate of the population standard deviation for the lifetime of the VCRs?
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = 1.497$$

What is the point estimate of the population proportion for VCRs with a lifetime of more than 5 years?
$$\bar{p} = \frac{8}{10} = 0.8$$

Note: Remember that random variables/point estimators are indicated with capital letters and calculated statistics/point estimates with lower case letters.

7.4 Introduction to Sampling Distributions

Notes

7.5 Sampling distribution of $\bar{X}$

Important note: Let $\bar{X}$ be the sample average of a random sample of size $n$ from a Normal distribution. Then: The sampling distribution of $\bar{X}$ has a Normal distribution for all $n$.

Central limit theorem: Let $\bar{X}$ be the sample average of a random sample of size $n$ from any population. Then: The sampling distribution of $\bar{X}$ has an approximate Normal distribution for $n$ large ($n \ge 30$). Note that the original population can be discrete or continuous.

The expected value of $\bar{X}$:
$$E(\bar{X}) = E\left(\frac{\sum X_i}{n}\right) = \frac{\sum E(X_i)}{n} = \frac{n\mu}{n} = \mu$$
The sample average ($\bar{X}$) is an unbiased estimator of the population mean ($\mu$) since $E(\bar{X}) = \mu_{\bar{X}} = \mu$.

The standard deviation of $\bar{X}$:
Finite population: $\sigma_{\bar{X}} = \sqrt{\frac{N - n}{N - 1}} \left(\frac{\sigma}{\sqrt{n}}\right)$ (A)
Infinite population: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$ (B)
$\sigma_{\bar{X}}$ is also known as the standard error of the mean.
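Formulas (A) and (B) differ only by the finite population correction factor, which the following minimal Python sketch makes explicit. The function name `standard_error_mean` and its argument names are hypothetical, and the values in the usage lines ($\sigma = 4000$, $n = 100$, $N = 2500$) are chosen purely for illustration.

```python
from math import sqrt


def standard_error_mean(sigma, n, N=None):
    """Standard error of the sample mean.

    N=None treats the population as infinite, formula (B).
    Otherwise formula (A) is used, with the finite population correction.
    """
    se = sigma / sqrt(n)  # formula (B)
    if N is not None:
        se *= sqrt((N - n) / (N - 1))  # correction factor from formula (A)
    return se


print(standard_error_mean(4000, 100))          # formula (B)
print(standard_error_mean(4000, 100, N=2500))  # formula (A), slightly smaller
```

The correction factor is always at most 1, so formula (A) never gives a larger standard error than formula (B).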
Note: For $N$ large and $n$ small, $\sqrt{\frac{N - n}{N - 1}} \approx 1$ and we use formula (B). If $\frac{n}{N} \le 0.05$ we use formula (B).

Example 1:
Salary of managers: $\mu = 51\,800$ and $\sigma = 4\,000$.
Note: This notation implies that the average salary of all managers (i.e. the population average) is 51 800 with a (population) standard deviation of 4 000.

Question: Calculate the probability that $\bar{X}$ will be within \$500 of the population average for a sample of size 100.

Answer:
$$E(\bar{X}) = \mu = 51\,800$$
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{4\,000}{\sqrt{100}} = 400$$
First we standardize the $\bar{x}$ values:
$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{X}}} = \frac{51\,300 - 51\,800}{400} = -1.25 \quad \text{and} \quad z = \frac{\bar{x} - \mu}{\sigma_{\bar{X}}} = \frac{52\,300 - 51\,800}{400} = 1.25$$
Therefore,
$$P(51\,300 < \bar{X} < 52\,300) = P(-1.25 < Z < 1.25) = P(Z < 1.25) - P(Z < -1.25) = 0.8944 - 0.1056 = 0.7888$$

Answer using Excel’s NORM.S.DIST function:
$P(-1.25 < Z < 1.25) = P(Z < 1.25) - P(Z < -1.25)$
= NORM.S.DIST(1.25, TRUE) – NORM.S.DIST(-1.25, TRUE) = 0.7887

Answer using Excel’s NORM.DIST function:
$P(51\,300 < \bar{X} < 52\,300) = P(\bar{X} < 52\,300) - P(\bar{X} < 51\,300)$
= NORM.DIST(52300, 51800, 400, TRUE) – NORM.DIST(51300, 51800, 400, TRUE) = 0.7887

IN GENERAL: = NORM.DIST($\bar{x}$, $\mu$, $\sigma_{\bar{X}}$, TRUE)

Example 2
Suppose that the delivery time of pizzas has a uniform distribution over the interval from 20 to 40 minutes. Let $X$ = the delivery time in minutes.

1. The probability density function of $X$ is:
$$f(x) = \begin{cases} \frac{1}{40 - 20} = \frac{1}{20} = 0.05 & \text{for } 20 \le x \le 40 \\ 0 & \text{elsewhere} \end{cases}$$
2. The average delivery time of a pizza is:
$$E(X) = \mu = \frac{a + b}{2} = \frac{20 + 40}{2} = 30$$
3. The standard deviation of $X$ is:
$$Var(X) = \sigma^2 = \frac{(b - a)^2}{12} = \frac{(40 - 20)^2}{12} = 33.\dot{3}$$
$$Stdev(X) = \sigma = \sqrt{33.\dot{3}} = 5.7735$$
4. The probability that it will take between 28 and 32 minutes to deliver a pizza is:
$$P(28 < X < 32) = \Delta x \cdot f(x) = (32 - 28)(0.05) = 0.2$$
5. Let $\bar{X}$ = the average delivery time of 36 pizzas
(a) Give the sampling distribution of $\bar{X}$.
$$E(\bar{X}) = \mu = 30$$
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{5.7735}{\sqrt{36}} = 0.962$$
(b) The probability that $\bar{X}$ is between 28 and 32 minutes is:
First we standardize the $\bar{x}$ values:
$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{X}}} = \frac{28 - 30}{0.962} = -2.078 \approx -2.08 \quad \text{and} \quad z = \frac{\bar{x} - \mu}{\sigma_{\bar{X}}} = \frac{32 - 30}{0.962} = 2.078 \approx 2.08$$
Therefore,
$$P(28 < \bar{X} < 32) = P(-2.08 < Z < 2.08) = P(Z < 2.08) - P(Z < -2.08) = 0.9812 - 0.0188 = 0.9624$$

Answer using Excel’s NORM.S.DIST function:
$P(-2.08 < Z < 2.08) = P(Z < 2.08) - P(Z < -2.08)$
= NORM.S.DIST(2.08,TRUE) – NORM.S.DIST(-2.08,TRUE) = 0.9623

Answer using Excel’s NORM.DIST function:
$P(28 < \bar{X} < 32) = P(\bar{X} < 32) - P(\bar{X} < 28)$
= NORM.DIST(32, 30, 0.962, TRUE) – NORM.DIST(28, 30, 0.962, TRUE) = 0.9623

Example 3
Question: $X$ is normally distributed with $\mu = 60$ and $\sigma = 10$. If $P(\bar{x}_1 < \bar{X} < \bar{x}_2) = 0.95$ and $n = 100$, what are the values of $\bar{x}_1$ and $\bar{x}_2$? Note that the area of 0.95 represents the middle 95% of the data.

Answer:
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{10}{\sqrt{100}} = 1$$
If we know that the area to the left of $\bar{x}_1$ is 0.025, we can find the corresponding z-value of -1.96.
If we know that the area to the left of $\bar{x}_2$ is 0.975, we can find the corresponding z-value of 1.96.
Now to find $\bar{x}_1$ and $\bar{x}_2$:
$$z = \frac{\bar{x}_1 - \mu}{\sigma_{\bar{X}}} \;\Rightarrow\; -1.96 = \frac{\bar{x}_1 - 60}{1} \;\Rightarrow\; \bar{x}_1 = (-1.96)(1) + 60 = 58.04$$
$$z = \frac{\bar{x}_2 - \mu}{\sigma_{\bar{X}}} \;\Rightarrow\; 1.96 = \frac{\bar{x}_2 - 60}{1} \;\Rightarrow\; \bar{x}_2 = (1.96)(1) + 60 = 61.96$$

Answer using Excel:
We know that the area to the left of $\bar{x}_1$ is 0.025, therefore: $\bar{x}_1$ = NORM.INV(0.025, 60, 1) = 58.04
We know that the area to the left of $\bar{x}_2$ is 0.975, therefore: $\bar{x}_2$ = NORM.INV(0.975, 60, 1) = 61.96

IN GENERAL: =NORM.INV(area to the left, $\mu$, $\sigma_{\bar{X}}$)

7.6 Sampling Distribution of $\bar{P}$

Expected value of $\bar{P}$:
$$E(\bar{P}) = p$$
The sample proportion ($\bar{P}$) is an unbiased estimator of the population proportion ($p$) since $E(\bar{P}) = p$.

Standard deviation of $\bar{P}$:
Finite population: $\sigma_{\bar{P}} = \sqrt{\frac{N - n}{N - 1}} \sqrt{\frac{p(1 - p)}{n}}$
Infinite population: $\sigma_{\bar{P}} = \sqrt{\frac{p(1 - p)}{n}}$
$\sigma_{\bar{P}}$ is also known as the standard error of the proportion.

The sampling distribution of $\bar{P}$ is approximately normally distributed for “large” samples.
A sample is “large” if
$$np \ge 5 \quad \text{and} \quad n(1 - p) \ge 5$$
This means that we can standardize $\bar{P}$ as follows:
$$Z = \frac{\bar{P} - E(\bar{P})}{\sigma_{\bar{P}}} = \frac{\bar{P} - p}{\sqrt{\frac{p(1 - p)}{n}}}$$

Example 1:
Proportion of managers that participated in the training program is:
$$p = \frac{1\,500}{2\,500} = 0.6$$
Question: Calculate the probability that $\bar{P}$ is within 0.05 of the population proportion for $n = 30$.

Answer:
$$E(\bar{P}) = p = 0.6$$
$$\sigma_{\bar{P}} = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{0.6(1 - 0.6)}{30}} = 0.089$$
First we standardize the values:
$$z = \frac{\bar{p} - p}{\sigma_{\bar{P}}} = \frac{0.55 - 0.6}{0.089} = -0.559 \approx -0.56 \quad \text{and} \quad z = \frac{\bar{p} - p}{\sigma_{\bar{P}}} = \frac{0.65 - 0.6}{0.089} = 0.559 \approx 0.56$$
Therefore,
$$P(0.55 < \bar{P} < 0.65) = P(-0.56 < Z < 0.56) = P(Z < 0.56) - P(Z < -0.56) = 0.7123 - 0.2877 = 0.4246$$

Answer using Excel’s NORM.S.DIST function:
$P(-0.56 < Z < 0.56) = P(Z < 0.56) - P(Z < -0.56)$
= NORM.S.DIST(0.56,TRUE) – NORM.S.DIST(-0.56,TRUE) = 0.4245

Answer using Excel’s NORM.DIST function:
$P(0.55 < \bar{P} < 0.65) = P(\bar{P} < 0.65) - P(\bar{P} < 0.55)$
= NORM.DIST(0.65, 0.6, 0.089, TRUE) – NORM.DIST(0.55, 0.6, 0.089, TRUE) = 0.4238

IN GENERAL: = NORM.DIST($\bar{p}$, $p$, $\sigma_{\bar{P}}$, TRUE)

More examples:
Given: Suppose that 70% of the students passed the re-exam. A simple random sample of 40 students is drawn.

Question 1: Calculate the probability that more than three quarters passed the re-exam.
Answer: The question is: $P(\bar{P} > 0.75)$
$$\sigma_{\bar{P}} = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{0.7(1 - 0.7)}{40}} = 0.072$$
$$z = \frac{\bar{p} - p}{\sigma_{\bar{P}}} = \frac{0.75 - 0.7}{0.072} = 0.69006 \approx 0.69$$
Therefore,
$$P(\bar{P} > 0.75) = P(Z > 0.69) = 1 - P(Z < 0.69) = 1 - 0.7549 = 0.2451$$

Question 2: Let $\bar{P}$ = sample proportion of students that passed for $n = 40$. Calculate $\bar{p}$ such that $P(\bar{P} \le \bar{p}) = 0.9$.
Answer:
$$z = \frac{\bar{p} - p}{\sigma_{\bar{P}}} \;\Rightarrow\; 1.28 = \frac{\bar{p} - 0.7}{0.072} \;\Rightarrow\; \bar{p} = (1.28)(0.072) + 0.7 = 0.792$$
Excel: $\bar{p}$ = NORM.INV(0.9, 0.7, 0.072) = 0.792

IN GENERAL: =NORM.INV(area to the left, $p$, $\sigma_{\bar{P}}$)

Additional notes on absolute values
• $|x| \ge a$: $x \ge a$ or $x \le -a$
• $|x| \le a$: $-a \le x \le a$

Example 1: Given $\sigma_{\bar{X}} = 25$. What is the probability that the sampling error of $\bar{X}$ is greater than 10?
$$P(|\bar{X} - \mu| > 10) = P(\bar{X} - \mu > 10) + P(\bar{X} - \mu < -10)$$
$$= P\left(\frac{\bar{X} - \mu}{\sigma_{\bar{X}}} > \frac{10}{25}\right) + P\left(\frac{\bar{X} - \mu}{\sigma_{\bar{X}}} < \frac{-10}{25}\right)$$
$$= P(Z > 0.4) + P(Z < -0.4) = 1 - P(Z < 0.4) + P(Z < -0.4) = 1 - 0.6554 + 0.3446 = 0.6892.$$

Example 2: Given $\sigma_{\bar{X}} = 25$. What is the probability that the sampling error of $\bar{X}$ is less than 5?
$$P(|\bar{X} - \mu| < 5) = P(-5 < \bar{X} - \mu < 5) = P\left(\frac{-5}{25} < \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} < \frac{5}{25}\right)$$
$$= P(-0.2 < Z < 0.2) = P(Z < 0.2) - P(Z < -0.2) = 0.5793 - 0.4207 = 0.1586.$$

Example 3: Given $\sigma_{\bar{P}} = 0.0115$. What is the probability that the sampling error of $\bar{P}$ is greater than 0.01?
$$P(|\bar{P} - p| > 0.01) = P(\bar{P} - p > 0.01) + P(\bar{P} - p < -0.01)$$
$$= P\left(\frac{\bar{P} - p}{\sigma_{\bar{P}}} > \frac{0.01}{0.0115}\right) + P\left(\frac{\bar{P} - p}{\sigma_{\bar{P}}} < \frac{-0.01}{0.0115}\right)$$
$$= P(Z > 0.87) + P(Z < -0.87) = 1 - P(Z < 0.87) + P(Z < -0.87) = 1 - 0.8078 + 0.1922 = 0.3844.$$

Example 4: Given $\sigma_{\bar{P}} = 0.0115$. What is the probability that the sampling error of $\bar{P}$ is less than 0.01?
$$P(|\bar{P} - p| < 0.01) = P(-0.01 < \bar{P} - p < 0.01) = P\left(\frac{-0.01}{0.0115} < \frac{\bar{P} - p}{\sigma_{\bar{P}}} < \frac{0.01}{0.0115}\right)$$
$$= P(-0.87 < Z < 0.87) = P(Z < 0.87) - P(Z < -0.87) = 0.8078 - 0.1922 = 0.6156.$$

Chapter 7 Self Evaluation Questions

Questions 1 to 3 are based on the following information:
The age of soccer supporters is normally distributed with a mean of 50 years and a standard deviation of 12 years.
Let $X$ = the age (in years) of a soccer supporter.
$\bar{X}$ = the average age (in years) of 25 randomly selected soccer supporters.

1. The probability that the age (in years) of a randomly selected soccer supporter is within 10 years of the population mean is:
2. The highest 15% of ages (in years) is higher than:
3. The 25th percentile of the average age (in years) of a randomly selected sample of 25 soccer supporters is:

Questions 4 to 6 are based on the following information:
The time that it takes a student to travel to campus by car is uniformly distributed between 10 and 50 minutes.
Let $X$ = time (in minutes) that it takes a student to travel to campus by car.
$\bar{X}$ = average time (in minutes) that it takes 40 randomly selected students to travel to campus by car.

4. The probability that a randomly selected student will travel for between 20 and 60 minutes is:
5. According to the Central Limit Theorem $\bar{X}$ is approximately normally distributed with $\mu = 30$ and $\sigma_{\bar{X}}$ =
6. $P(\bar{X} > 28)$ =

Questions 7 to 9 are based on the following information:
JET Airline knows that 20% of passengers are using their laptops during a flight.
Let: $X$ = the number of travellers using their laptops during a flight.
$\bar{P}$ = the sample proportion of 32 randomly selected passengers using their laptops during a flight.

7. The variance of $\bar{P}$ is:
8. The probability that the sampling error of $\bar{P}$ is less than 0.05 is:
9. $P(\bar{P} > \bar{p}) = 0.1$. The value of $\bar{p}$ is:

Important: There is a connection between Chapter 5 Section 5.4: Binomial Distribution and Chapter 7 that will be discussed in class.

Questions 10 to 15 are based on the following information:
Consider the Bargain Clothing Store. It is known that 60% of the customers prefer name brand clothing. Consider the following results in Excel:
Let: $X$ = number of customers who prefer name brand clothing.
$\bar{P}$ = sample proportion of 30 customers who prefer name brand clothing.
Given: $\sigma_{\bar{P}} = 0.0894$
[Excel formula sheet and value sheet]

10. The probability that more than 17 but less than 26 customers will prefer name brand clothing is:
11. The probability that more than 18 customers will prefer name brand clothing is:
12. The expected number of customers who don’t prefer name brand clothing is:
13. A random sample of 30 customers is chosen and 22 out of 30 prefer name brand clothing. The sampling error of the proportion $\bar{p}$ is:
14. The sampling distribution of $\bar{P}$ can be approximated by a normal probability distribution whenever:
a) $n = 30$ and $np(1 - p) \ge 5$
b) $p = 0.6$
c) $np \ge 5$
d) $np \ge 5$ and $n(1 - p) \ge 5$
e) $p = 0.6$ and $n = 30$
15.
The sixtieth percentile of the distribution of $\bar{P}$ is:

Chapter 8: Interval Estimation

What do we know about the standard normal distribution?
$$P\left(-z_{\alpha/2} \le Z \le z_{\alpha/2}\right) = 1 - \alpha$$
where
$$Z = \frac{\text{variable} - (\text{mean of variable})}{(\text{standard deviation of variable})}$$
We are going to use this relationship to derive the interval estimate of any future unknown population parameter where the population is normally distributed. You are expected to be able to derive any interval estimate (also called a confidence interval).

• Interval estimation of $\mu$: $\bar{x} \pm$ Margin of Error

Confidence interval for $\mu$: $\bar{x} \pm$ Margin of Error
$\sigma$ known: $\bar{x} \pm z_{\alpha/2}\, \sigma_{\bar{x}} = \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
$\sigma$ unknown: $\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$

8.1 Population mean: $\sigma$ known

$$P\left(-z_{\alpha/2} \le Z \le z_{\alpha/2}\right) = 1 - \alpha$$
$$P\left(-z_{\alpha/2} \le \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} \le z_{\alpha/2}\right) = 1 - \alpha$$
$$P\left(-z_{\alpha/2} \le \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \le z_{\alpha/2}\right) = 1 - \alpha$$
$$P\left(-z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \bar{X} - \mu \le z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$
$$P\left(-\bar{X} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le -\mu \le -\bar{X} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$
$$P\left(\bar{X} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \ge \mu \ge \bar{X} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$
$$P\left(\bar{X} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$

Example 1
Given: Consider marks that are normally distributed with $\sigma = 10$. Let $\bar{x}$ = sample average = 58 with $n = 16$.
Question: Calculate a 95% confidence interval for $\mu$:
Answer:
$1 - 0.95 = 0.05 = \alpha$ = level of significance
$\frac{\alpha}{2} = \frac{0.05}{2} = 0.025$
$$\bar{x} \pm z_{0.05/2} \frac{\sigma}{\sqrt{n}} = 58 \pm (1.96)\left(\frac{10}{\sqrt{16}}\right) = 58 \pm 4.9$$
$$(58 - 4.9,\; 58 + 4.9) = (53.1,\; 62.9)$$
Interpretation: We are 95% confident that the average mark $\mu$ is between 53.1 and 62.9.
Margin of Error $= z_{0.025} \frac{\sigma}{\sqrt{n}} = (1.96)\left(\frac{10}{\sqrt{16}}\right) = 4.9$
95% of the time the sampling error $|\bar{X} - \mu|$ will be 4.9 or less.
OR
There is a 0.95 probability that the sample mean, $\bar{x}$, will provide a sampling error of 4.9 or less.

Example 2
Given: Consider marks that are normally distributed with $\sigma = 10$. Let $\bar{x}$ = sample average = 58 with $n = 16$.
Question: Calculate a 90% confidence interval for $\mu$:
Answer:
$1 - 0.9 = 0.1 = \alpha$ = level of significance
$\frac{\alpha}{2} = \frac{0.1}{2} = 0.05$
$$\bar{x} \pm z_{0.1/2} \frac{\sigma}{\sqrt{n}} = 58 \pm (1.645)\left(\frac{10}{\sqrt{16}}\right) = 58 \pm 4.1125$$
$$(58 - 4.1125,\; 58 + 4.1125) = (53.8875,\; 62.1125)$$
Interpretation: We are 90% confident that the average mark $\mu$ is between 53.8875 and 62.1125.
Margin of Error $= z_{0.05} \frac{\sigma}{\sqrt{n}} = (1.645)\left(\frac{10}{\sqrt{16}}\right) = 4.1125$
90% of the time the sampling error $|\bar{X} - \mu|$ will be 4.1125 or less.
OR
There is a 0.90 probability that the sample mean, $\bar{x}$, will provide a sampling error of 4.1125 or less.

Example 3
Given: Consider marks that are normally distributed with $\sigma = 10$. Let $\bar{x}$ = sample average = 58 with $n = 16$.
Question: Calculate a 99% confidence interval for $\mu$:
Answer:
$1 - 0.99 = 0.01 = \alpha$ = level of significance
$\frac{\alpha}{2} = \frac{0.01}{2} = 0.005$
$$\bar{x} \pm z_{0.01/2} \frac{\sigma}{\sqrt{n}} = 58 \pm (2.576)\left(\frac{10}{\sqrt{16}}\right) = 58 \pm 6.44$$
$$(58 - 6.44,\; 58 + 6.44) = (51.56,\; 64.44)$$
Interpretation: We are 99% confident that the average mark $\mu$ is between 51.56 and 64.44.
Margin of Error $= z_{0.005} \frac{\sigma}{\sqrt{n}} = (2.576)\left(\frac{10}{\sqrt{16}}\right) = 6.44$
99% of the time the sampling error $|\bar{X} - \mu|$ will be 6.44 or less.
OR
There is a 0.99 probability that the sample mean, $\bar{x}$, will provide a sampling error of 6.44 or less.

Useful summary – Two sided confidence intervals

Confidence level | Confidence coefficient | $\alpha$ | $\alpha/2$ | $z_{\alpha/2}$ | Margin of error
90% | 0.90 | 0.10 | 0.05 | 1.645 | $1.645\,\sigma_{\bar{x}}$
95% | 0.95 | 0.05 | 0.025 | 1.960 | $1.960\,\sigma_{\bar{x}}$
99% | 0.99 | 0.01 | 0.005 | 2.576 | $2.576\,\sigma_{\bar{x}}$

Note:
• $\alpha$ = level of significance
• $1 - \alpha$ = confidence coefficient
• level of significance + confidence coefficient = 1

Very important: Ensure that you are able to use the normal probability tables to find the values given in the table above.

Exercise: Derive an upper one-sided confidence interval for $\mu$ for the case where $\sigma$ is known.
Hint: Start your derivation using the following statement
$$P(Z \ge -z_{\alpha}) = 1 - \alpha$$

Exercise: Derive a lower one-sided confidence interval for $\mu$ for the case where $\sigma$ is known.
Hint: Start your derivation using the following statement
$$P(Z \le z_{\alpha}) = 1 - \alpha$$

8.2 Population mean: $\sigma$ unknown

$$P\left(-t_{\alpha/2} \le T \le t_{\alpha/2}\right) = 1 - \alpha$$
$$P\left(-t_{\alpha/2} \le \frac{\bar{X} - \mu}{S / \sqrt{n}} \le t_{\alpha/2}\right) = 1 - \alpha$$
$$P\left(-t_{\alpha/2} \frac{S}{\sqrt{n}} \le \bar{X} - \mu \le t_{\alpha/2} \frac{S}{\sqrt{n}}\right) = 1 - \alpha$$
$$P\left(-\bar{X} - t_{\alpha/2} \frac{S}{\sqrt{n}} \le -\mu \le -\bar{X} + t_{\alpha/2} \frac{S}{\sqrt{n}}\right) = 1 - \alpha$$
$$P\left(\bar{X} + t_{\alpha/2} \frac{S}{\sqrt{n}} \ge \mu \ge \bar{X} - t_{\alpha/2} \frac{S}{\sqrt{n}}\right) = 1 - \alpha$$
$$P\left(\bar{X} - t_{\alpha/2} \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2} \frac{S}{\sqrt{n}}\right) = 1 - \alpha$$
The confidence interval is therefore
$$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$$

Relationship between the normal and t-distributions

Characteristics of the t-distribution:
✓ Symmetric around 0.
✓ Has one parameter called the degrees of freedom ($df$), given by $n - 1$.
✓ As the degrees of freedom increase, the t-distribution tends to the standard normal distribution.
✓ For “large sample cases” ($n \ge 30$) the t-distribution approaches the standard normal distribution.

In Figure 1 it is illustrated that as the degrees of freedom increase, i.e. as $n - 1$ increases, i.e. as the sample size $n$ increases, the t-distribution tends to the standard normal distribution.

Figure 1
• The bottom curve represents the t-distribution with 4 degrees of freedom (denoted t(4));
• The middle curve represents the t-distribution with 10 degrees of freedom (denoted t(10));
• The top curve represents both the t-distribution with $df$ tending to infinity (denoted t(∞)) and the standard normal distribution (denoted z-distribution).

The importance of this? Take note of the following:

95% confidence interval
When working with a 95% confidence interval using the standard normal distribution, the $z_{\alpha/2}$ value is obtained using the standard normal table.
When working with a 95% confidence interval using the t-distribution where the $df$ tends to infinity, the $t_{\alpha/2}$ value is obtained using the t-table with area in the upper tail = 0.025 and $df = \infty$.
Note: The $z_{\alpha/2}$ and $t_{\alpha/2}$ values are the same, since the t-distribution tends to the standard normal distribution as the $df$ increase.

90% confidence interval
When working with a 90% confidence interval using the standard normal distribution, the $z_{\alpha/2}$ value is obtained using the standard normal table.
When working with a 90% confidence interval using the t-distribution where the $df$ tends to infinity, the $t_{\alpha/2}$ value is obtained using the t-table with area in the upper tail = 0.05 and $df = \infty$.
Note: The $z_{\alpha/2}$ and $t_{\alpha/2}$ values are the same, since the t-distribution tends to the standard normal distribution as the $df$ increase.

99% confidence interval
When working with a 99% confidence interval using the standard normal distribution, the $z_{\alpha/2}$ value is obtained using the standard normal table.
When working with a 99% confidence interval using the t-distribution where the $df$ tends to infinity, the $t_{\alpha/2}$ value is obtained using the t-table with area in the upper tail = 0.005 and $df = \infty$.
Note: The $z_{\alpha/2}$ and $t_{\alpha/2}$ values are the same, since the t-distribution tends to the standard normal distribution as the $df$ increase.

Example 1
Given: $n = 15$, $\bar{x} = 53.87$ and $s = 6.82$.
Take note: $\sigma$ is unknown (Why?)
Question: Calculate a 95% confidence interval for $\mu$:
Answer:
$1 - 0.95 = 0.05 = \alpha$ = level of significance
$\frac{\alpha}{2} = \frac{0.05}{2} = 0.025$
$df = n - 1 = 15 - 1 = 14$
$$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}} = 53.87 \pm (2.145)\left(\frac{6.82}{\sqrt{15}}\right) = 53.87 \pm 3.78$$
$$(53.87 - 3.78,\; 53.87 + 3.78) = (50.09,\; 57.65)$$
Interpretation: We are 95% confident that the unknown population parameter $\mu$ is between 50.09 and 57.65.
Obtaining $t_{\alpha/2}$ using Excel: = T.INV.2T($\alpha$, df) = T.INV.2T(0.05, 14) = 2.144787
Margin of Error: $t_{\alpha/2} \frac{s}{\sqrt{n}} = (2.145)\left(\frac{6.82}{\sqrt{15}}\right) = 3.78$
95% of the time the sampling error $|\bar{X} - \mu|$ will be 3.78 or less.
OR
There is a 0.95 probability that the sample mean, $\bar{x}$, will provide a sampling error of 3.78 or less.

Example 2
Given: $n = 15$, $\bar{x} = 53.87$ and $s = 6.82$.
Take note: $\sigma$ is unknown
Question: Calculate a 90% confidence interval for $\mu$:
Answer:
$1 - 0.9 = 0.1 = \alpha$ = level of significance
$\frac{\alpha}{2} = \frac{0.1}{2} = 0.05$
$df = n - 1 = 15 - 1 = 14$
$$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}} = 53.87 \pm (1.761)\left(\frac{6.82}{\sqrt{15}}\right) = 53.87 \pm 3.1$$
$$(53.87 - 3.1,\; 53.87 + 3.1) = (50.77,\; 56.97)$$
Interpretation: We are 90% confident that the unknown population parameter $\mu$ is between 50.77 and 56.97.
Obtaining $t_{\alpha/2}$ using Excel: = T.INV.2T($\alpha$, df) = T.INV.2T(0.1, 14) = 1.76131
Margin of Error: $t_{\alpha/2} \frac{s}{\sqrt{n}} = (1.761)\left(\frac{6.82}{\sqrt{15}}\right) = 3.1$
90% of the time the sampling error $|\bar{X} - \mu|$ will be 3.1 or less.
OR
There is a 0.90 probability that the sample mean, $\bar{x}$, will provide a sampling error of 3.1 or less.

8.3 Determining the Sample Size

Additional information on the range rule used to obtain a planning value for $\sigma$:
http://statistics.about.com/od/Descriptive-Statistics/a/Range-Rule-For-Standard-Deviation.htm

8.4 Population proportion

$p$ = population proportion
$\bar{p}$ = sample proportion

Interval estimate of a population proportion:
$$\bar{p} \pm \text{Margin of error}$$
$$\bar{p} \pm z_{\alpha/2}\, \sigma_{\bar{p}}$$
$$\bar{p} \pm z_{\alpha/2} \sqrt{\frac{p(1 - p)}{n}}$$
To use this expression to develop an interval estimate of a population proportion, $p$, the value of $p$ would have to be known. But the value of $p$ is what we are trying to estimate, so we simply substitute the sample proportion $\bar{p}$ for $p$. Therefore,
$$\bar{p} \pm z_{\alpha/2} \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}$$

Homework: Derive the above expression for a $(1 - \alpha) \times 100\%$ confidence interval for $p$.
Hint: The first step has been given.
$$P\left(-z_{\alpha/2} \le Z \le z_{\alpha/2}\right) = 1 - \alpha$$

Example: Female Golfers
Given: A national survey of 902 female golfers was taken to learn how women golfers view themselves as being treated at golf courses. The survey found that 397 of the female golfers felt that they were being treated fairly.
Let $p$ = the proportion of female golfers who feel they are being treated fairly

Question: Calculate a 95% confidence interval for $p$.
Answer:
$$\bar{p} = \frac{397}{902} = 0.4401$$
$$\bar{p} \pm z_{\alpha/2} \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} = 0.4401 \pm 1.96 \sqrt{\frac{(0.4401)(1 - 0.4401)}{902}} = 0.4401 \pm 0.0324$$
$$(0.4401 - 0.0324,\; 0.4401 + 0.0324) = (0.4077,\; 0.4725)$$
Interpretation: We are 95% confident that the proportion of female golfers who feel they are being treated fairly is between 0.4077 and 0.4725.
Margin of error: $z_{\alpha/2} \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} = 1.96 \sqrt{\frac{(0.4401)(1 - 0.4401)}{902}} = 0.0324$
95% of the time the sampling error $|\bar{P} - p|$ will be 0.0324 or less.
OR
There is a 0.95 probability that the sample proportion, $\bar{p}$, will provide a sampling error of 0.0324 or less.

Question: Calculate a 90% confidence interval for $p$.
Answer:
$$\bar{p} \pm z_{\alpha/2} \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} = 0.4401 \pm 1.645 \sqrt{\frac{(0.4401)(1 - 0.4401)}{902}} = 0.4401 \pm 0.027189$$
$$(0.4401 - 0.027189,\; 0.4401 + 0.027189) = (0.4129,\; 0.467)$$
Interpretation: We are 90% confident that the proportion of female golfers who feel they are being treated fairly is between 0.4129 and 0.467.
Margin of error: $z_{\alpha/2} \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} = 1.645 \sqrt{\frac{(0.4401)(1 - 0.4401)}{902}} = 0.027189$
90% of the time the sampling error $|\bar{P} - p|$ will be 0.027189 or less.
OR
There is a 0.9 probability that the sample proportion, $\bar{p}$, will provide a sampling error of 0.027189 or less.
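The female golfers calculation can be reproduced with a short Python sketch. It assumes Python 3.8 or later for `statistics.NormalDist`, and the function name `proportion_ci` is hypothetical. Because the z-value is computed exactly instead of being rounded to 1.96 or 1.645, the results may differ from the hand calculation in the last decimal places.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence):
    """Two-sided confidence interval for a population proportion.

    Returns (lower limit, upper limit, margin of error).
    """
    p_bar = successes / n                    # sample proportion
    alpha = 1 - confidence                   # level of significance
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    margin = z * sqrt(p_bar * (1 - p_bar) / n)
    return p_bar - margin, p_bar + margin, margin


for level in (0.95, 0.90):
    lower, upper, margin = proportion_ci(397, 902, level)
    print(level, round(lower, 4), round(upper, 4), round(margin, 4))
```

Lowering the confidence level from 0.95 to 0.90 shrinks the margin of error, which is why the 90% interval is the narrower of the two.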
Lab session component 3: Confidence intervals in Excel

Outcomes: At the end of this section you should be able to
• calculate and interpret confidence intervals for the population mean using the confidence.norm() and confidence.t() functions or by setting up your own function,
• calculate and interpret confidence intervals for the population proportion using the confidence.norm() function or by setting up your own function in Excel,
• identify which of these functions are appropriate to use in a given practical problem.

L3.1: Confidence intervals for the population mean ($\sigma$ known case)

A $(1 - \alpha) \times 100\%$ confidence interval for $\mu$ in the $\sigma$ known case is given by
$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
where $\bar{x}$ is the observed sample mean, $n$ is the sample size, $\sigma$ is the population standard deviation and $z_{\alpha/2}$ is a normal percentile. From Section L2.4.2 we know that the value of $z_{\alpha/2}$ can be found using the norm.s.inv() function, and this value can then be used to calculate the margin of error for the confidence interval given above. There is, however, an even simpler method of doing this in Excel, namely the confidence.norm() function. This function calculates the margin of error for a two-sided confidence interval and uses the following syntax:
confidence.norm(alpha, standard_dev, size)
where standard_dev refers to the population standard deviation and size refers to the sample size. It is important to note that alpha refers to the level of significance, not $\alpha/2$, as illustrated in the following example.

Example: The margin of error for a 98% confidence interval for the population mean with $\sigma = 10$ and $n = 123$ can be found by typing the following in Excel: =confidence.norm(0.02,10,123).
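To see what confidence.norm() computes, the margin of error formula can be written out by hand. The sketch below is an illustration only: `confidence_norm` is a hypothetical Python function named after the Excel one, built from the formula $z_{\alpha/2}\,\sigma/\sqrt{n}$ given above, and it assumes Python 3.8 or later for `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def confidence_norm(alpha, standard_dev, size):
    """Margin of error of a two-sided confidence interval, sigma known.

    alpha is the level of significance (not alpha / 2), as in Excel.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    return z * standard_dev / sqrt(size)


# 98% interval with sigma = 10 and n = 123, as in the example above
print(round(confidence_norm(0.02, 10, 123), 4))

# 95% interval with sigma = 10 and n = 16, Example 1 of Section 8.1 (about 4.9)
print(round(confidence_norm(0.05, 10, 16), 4))
```

The confidence interval itself is then the sample mean minus and plus this margin.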
L3.2: Confidence intervals for the population mean ($\sigma$ unknown case)

A $(1 - \alpha) \times 100\%$ confidence interval for $\mu$ in the $\sigma$ unknown case is given by
$\bar{x} \pm t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}$
where $\bar{x}$ is the observed sample mean, $n$ is the sample size, $s$ is the observed sample standard deviation and $t_{n-1,\alpha/2}$ is a percentile of the t-distribution with $n - 1$ degrees of freedom.

We can calculate the margin of error manually by finding a t-value using the t.inv() function. Alternatively, the margin of error can be calculated directly in Excel using the confidence.t() function. This function calculates the margin of error for a two-sided confidence interval and uses the following syntax:
confidence.t(alpha, standard_dev, size)
where alpha again refers to the level of significance, standard_dev refers to the sample standard deviation, and size refers to the sample size.

Example: The margin of error for a 98% confidence interval for the population mean with $s = 10$ and $n = 123$ can be found by typing the following in Excel: =confidence.t(0.02,10,123).

L3.3: Confidence intervals for the population proportion

A $(1 - \alpha) \times 100\%$ confidence interval for $p$ is given by
$\bar{p} \pm z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$
where $\bar{p}$ is the sample proportion, $n$ is the sample size and $z_{\alpha/2}$ is a normal percentile.

Unfortunately there is no built-in Excel function that calculates this margin of error directly, as there was in the previous sections. The user therefore needs either to enter the formula manually in Excel or to adapt the formula used in Section L3.1.

L3.4: Self-evaluation Exercise 3

Consider the ‘EAI.xlsx’ file that contains the salaries of 2500 managers as well as the details of whether they completed a training program.

1. Calculate a 96% confidence interval for the population mean.
Solution: [51635.61; 51964.39]
2. Calculate a 92% confidence interval for the proportion of managers who completed the training program.
Solution: [0.58285; 0.61715]

Chapter 8 Self Evaluation Questions

Questions 1 and 2 are based on the following information:
The proportion of business travellers who are dissatisfied with the service of an airline is investigated. A manager selects a systematic sample of 50 business travellers, of whom 10 said that they are dissatisfied with the service.
Let: $p$ = population proportion of dissatisfied business travellers

1. The lower limit of a 95% confidence interval for the population proportion is:

2. If the confidence coefficient of a confidence interval decreases from 0.95 to 0.90 the:
a) sample size increases.
b) interval is narrower.
c) significance level is smaller.
d) margin of error is larger.
e) standard error is larger.

3. A business travel magazine rates the service of airlines on a regular basis (a rating scale with a low score of 0 and a high score of 10 was used). It is known that $\sigma = 1.05$. An airline was rated by 30 randomly selected business travellers, which provided a sample mean of 7.5. The upper limit of a 99% confidence interval for the population mean is:

Questions 4 to 10 are based on the following information:
The management team of a soccer stadium wants to estimate the average amount (in Rand) spent on snacks and cool drinks per spectator. It is known that the amount (in Rand) is normally distributed. They are also interested in the method of payment used for the purchase, namely credit card or cash.
Let:
$\mu$ = the population mean of the amount (in Rand) spent on snacks and cool drinks per spectator.
$p$ = the population proportion of spectators who paid with cash.
$\bar{x}$ = the average amount (in Rand) spent on snacks and cool drinks.

Consider the following results in Excel:
Formula worksheet: [worksheet image] Note: Rows 10 to 40 are hidden.
Value worksheet: [worksheet image] Note: Rows 10 to 40 are hidden.

4. The point estimate for the population mean of the amount (in Rand) spent is:

5.
The point estimate of the population proportion is:

6. When $s$ is used to estimate $\sigma$, the interval estimate for the population mean is based on the:
a) standard normal distribution
b) binomial distribution
c) normal distribution
d) $t$-distribution
e) uniform distribution

7. The margin of error of a 95% confidence interval for $\mu$ is:

8. The lower limit of a 99% confidence interval for the population proportion is:

9. The margin of error of a 95% confidence interval for the population proportion is:

10. If the confidence coefficient of a confidence interval for $\mu$ is decreased from 95% to 90%, then:
a) the standard error decreases, which implies a narrower interval.
b) the standard error increases, which implies a wider interval.
c) the sample size increases, which implies a narrower interval.
d) the lower limit increases, which implies a wider interval.
e) the margin of error decreases, which implies a narrower interval.

Chapter 9 Hypothesis tests

9.1 Developing the null and alternative hypotheses

$H_0$:
• Null hypothesis
• Tentative assumption about a population parameter

$H_a$:
• Alternative hypothesis
• Opposite of what is stated in $H_0$
• Research hypothesis

Different types of hypotheses about the population mean:
$\mu_0$ = a specific numerical value
$\mu$ = population mean

One-tailed test, lower tail test: $H_0: \mu \ge \mu_0$, $H_a: \mu < \mu_0$
One-tailed test, upper tail test: $H_0: \mu \le \mu_0$, $H_a: \mu > \mu_0$
Two-tailed test: $H_0: \mu = \mu_0$, $H_a: \mu \ne \mu_0$

Testing research hypotheses: A car model currently attains an average fuel efficiency of 24 miles per gallon. A product research group has developed a new fuel injection system specifically designed to increase the miles-per-gallon rating.

Testing the validity of a claim: A manufacturer of soft drinks states that 2-liter containers of its products have an average of at least 67.6 fluid ounces.

Testing in decision-making situations: Assume the specifications for a particular part require a mean length of 2 inches per part.
If the mean length is greater or less than the 2-inch standard, the parts will cause quality problems in the assembly operation.

Hypotheses for the three situations:
Testing research hypotheses: $H_0: \mu \le 24$, $H_a: \mu > 24$ (Alternative hypothesis / Research hypothesis)
Testing the validity of a claim: $H_0: \mu \ge 67.6$ (Manufacturer’s claim), $H_a: \mu < 67.6$
Testing in decision-making situations: $H_0: \mu = 2$, $H_a: \mu \ne 2$

9.2 Type I and Type II Errors

• Type I Error: We reject $H_0$, given $H_0$ is true. The probability of making a Type I error is called the level of significance for the test, denoted by $\alpha$.
$\alpha = P(\text{Reject } H_0 \mid H_0 \text{ true})$
• Type II Error: We do not reject $H_0$, given $H_a$ is true.
$\beta = P(\text{Do not reject } H_0 \mid H_a \text{ true})$

Errors and correct conclusions in hypothesis testing:
Conclusion "Do not reject $H_0$": correct decision if $H_0$ is true in the population; Type II error if $H_a$ is true.
Conclusion "Reject $H_0$": Type I error if $H_0$ is true in the population; correct decision if $H_a$ is true.

Note that we NEVER accept $H_0$ or $H_a$!

Test statistics:
9.3 Population mean: $\sigma$ known: $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
9.4 Population mean: $\sigma$ unknown: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$

9.3 Population mean: $\sigma$ known

Lower tail test
Hypotheses: $H_0: \mu \ge \mu_0$, $H_a: \mu < \mu_0$
Test statistic: $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
Rejection rule (critical value approach): Reject $H_0$ if $z \le -z_{\alpha}$
Rejection rule (p-value approach): Reject $H_0$ if p-value $\le \alpha$

Upper tail test
Hypotheses: $H_0: \mu \le \mu_0$, $H_a: \mu > \mu_0$
Test statistic: $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
Rejection rule (critical value approach): Reject $H_0$ if $z \ge z_{\alpha}$
Rejection rule (p-value approach): Reject $H_0$ if p-value $\le \alpha$

Two-tailed test
Hypotheses: $H_0: \mu = \mu_0$, $H_a: \mu \ne \mu_0$
Test statistic: $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
Rejection rule (critical value approach): Reject $H_0$ if $z \le -z_{\alpha/2}$ or if $z \ge z_{\alpha/2}$
Rejection rule (p-value approach): Reject $H_0$ if p-value $\le \alpha$

Example 1:
Given: The label on a large can of coffee states that the can contains at least 3 kg of coffee. $n = 36$ coffee cans, $\bar{x} = 2.92$ kg, $\sigma = 0.18$ kg and $\alpha = 0.01$.
(Note: $\sigma$ is known)

Answer:
Hypotheses: $H_0: \mu \ge 3$, $H_a: \mu < 3$
Test statistic: $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{2.92 - 3}{0.18/\sqrt{36}} = -2.67$

Using the critical value approach
Graph:
Rejection rule / rejection criteria: Reject $H_0$ if $z \le -z_{\alpha}$, i.e. reject $H_0$ if $z \le -2.33$
Decision: Reject $H_0$ at a 1% level of significance since the test statistic ($z = -2.67$) is less than the critical value ($-z_{0.01} = -2.33$).

Using the p-value approach
Obtaining the p-value:
Rejection rule / rejection criteria: Reject $H_0$ if p-value $\le \alpha$
p-value: $P(Z < -2.67) = 0.0038$
Decision: Reject $H_0$ at a 1% level of significance, since p-value (0.0038) < $\alpha$ (0.01).

Conclusion: At a 1% level of significance we have enough evidence to conclude that the mean weight of a can of coffee is less than 3 kg.

Example 2:
Given: Max Flight uses a high-technology manufacturing process to produce golf balls with a mean driving range distance of 295 yards. The process is out of adjustment if the driving distance deviates from 295 yards. $n = 50$, $\bar{x} = 297.6$, $\sigma = 12$ and $\alpha = 0.05$. (Note: $\sigma$ is known)

Answer:
Hypotheses: $H_0: \mu = 295$, $H_a: \mu \ne 295$
Test statistic: $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{297.6 - 295}{12/\sqrt{50}} = 1.53$

Using the critical value approach
Graph:
Rejection rule / rejection criteria: Reject $H_0$ if $z \le -z_{\alpha/2}$ or if $z \ge z_{\alpha/2}$, i.e. reject $H_0$ if $z \le -1.96$ or if $z \ge 1.96$
Decision: Do not reject $H_0$ at a 5% level of significance since the test statistic ($z = 1.53$) lies between the critical values ($\pm z_{0.025} = \pm 1.96$).

Using the p-value approach
Obtaining the p-value:
Rejection rule / rejection criteria: Reject $H_0$ if p-value $\le \alpha$
p-value: $2 \times P(Z > 1.53) = 2 \times 0.063 = 0.126$
Decision: Do not reject $H_0$ at a 5% level of significance, since p-value (0.126) > $\alpha$ (0.05).

Conclusion: Thus, at a 5% level of significance, the evidence is insufficient to indicate that the mean driving range deviates from 295 yards.
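The two worked examples above (coffee cans and golf balls) follow the same recipe, which can be sketched in a few lines of Python. This is an illustrative check only: the function name `z_test_mean` and the `tail` labels are our own, and because the code does not round $z$ to two decimals before looking up the probability, the p-values may differ slightly from the table-based values in the notes.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(x_bar, mu0, sigma, n, tail):
    """Return (z, p_value) for a test about a mean when sigma is known.

    tail is "lower", "upper" or "two" (hypothetical labels for the
    lower tail, upper tail and two-tailed tests).
    """
    z = (x_bar - mu0) / (sigma / sqrt(n))      # test statistic
    std_normal = NormalDist()                  # standard normal N(0, 1)
    if tail == "lower":
        p_value = std_normal.cdf(z)            # P(Z <= z)
    elif tail == "upper":
        p_value = 1 - std_normal.cdf(z)        # P(Z >= z)
    elif tail == "two":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("tail must be 'lower', 'upper' or 'two'")
    return z, p_value


# Example 1: coffee cans, lower tail test at alpha = 0.01
z1, p1 = z_test_mean(2.92, 3, 0.18, 36, "lower")
print(f"coffee: z = {z1:.2f}, p-value = {p1:.4f}, reject H0: {p1 <= 0.01}")

# Example 2: golf balls, two-tailed test at alpha = 0.05
z2, p2 = z_test_mean(297.6, 295, 12, 50, "two")
print(f"golf:   z = {z2:.2f}, p-value = {p2:.4f}, reject H0: {p2 <= 0.05}")
```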
9.4 Population mean: $\sigma$ unknown

Lower tail test
Hypotheses: $H_0: \mu \ge \mu_0$, $H_a: \mu < \mu_0$
Test statistic: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
Rejection rule (critical value approach): Reject $H_0$ if $t \le -t_{\alpha}$
Rejection rule (p-value approach): Reject $H_0$ if p-value $\le \alpha$

Upper tail test
Hypotheses: $H_0: \mu \le \mu_0$, $H_a: \mu > \mu_0$
Test statistic: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
Rejection rule (critical value approach): Reject $H_0$ if $t \ge t_{\alpha}$
Rejection rule (p-value approach): Reject $H_0$ if p-value $\le \alpha$

Two-tailed test
Hypotheses: $H_0: \mu = \mu_0$, $H_a: \mu \ne \mu_0$
Test statistic: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
Rejection rule (critical value approach): Reject $H_0$ if $t \le -t_{\alpha/2}$ or if $t \ge t_{\alpha/2}$
Rejection rule (p-value approach): Reject $H_0$ if p-value $\le \alpha$

Example 1:
Given: A magazine has decided to classify airports according to a rating they received. Airports that have a population mean rating of more than 7 will be designated as superior service airports. $n = 60$, $\bar{x} = 7.25$, $s = 1.052$ and $\alpha = 0.05$. (Note: $\sigma$ is unknown)

Answer:
Hypotheses: $H_0: \mu \le 7$, $H_a: \mu > 7$
Test statistic: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{7.25 - 7}{1.052/\sqrt{60}} = 1.84$

Using the critical value approach
Graph:
Rejection rule / rejection criteria: Reject $H_0$ if $t \ge t_{\alpha}$, i.e. reject $H_0$ if $t \ge 1.671$
Decision: Reject $H_0$ at a 5% level of significance since the test statistic ($t = 1.84$) is greater than the critical value ($t_{59,0.05} = 1.671$).

Using the p-value approach
Obtaining the p-value:
Rejection rule / rejection criteria: Reject $H_0$ if p-value $\le \alpha$
p-value: $P(T > 1.84) = 0.0354$ (using Excel)
From the probability tables: 0.025 < p-value < 0.05
Decision: Reject $H_0$ at a 5% level of significance, since the p-value < $\alpha$ (0.05).

Conclusion: At a 5% level of significance it can be concluded that the mean rating for airports is more than 7.

More examples on p-values:
Obtain the p-values for the following scenarios for one- and two-sided tests. Hint: Sketch the graphs for the left-sided, right-sided and two-sided tests.

Example 2: $n = 10$, $t = 2$. What is the p-value?
Answer: $df = n - 1 = 10 - 1 = 9$
Therefore,

Example 3: $n = 20$, $t = 1$. What is the p-value?
Answer: $df = n - 1 = 20 - 1 = 19$
Therefore,

Example 4: $n = 7$, $t = 9.33$. What is the p-value?
Answer: $df = n - 1 = 7 - 1 = 6$
Therefore,

9.5 Population proportion

Different types of hypotheses about the population proportion:
$p_0$ = a specific numerical value
$p$ = population proportion

Lower tail test
Hypotheses: $H_0: p \ge p_0$, $H_a: p < p_0$
Test statistic: $z = \dfrac{\bar{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$
Rejection rule (critical value approach): Reject $H_0$ if $z \le -z_{\alpha}$
Rejection rule (p-value approach): Reject $H_0$ if p-value $\le \alpha$

Upper tail test
Hypotheses: $H_0: p \le p_0$, $H_a: p > p_0$
Test statistic: $z = \dfrac{\bar{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$
Rejection rule (critical value approach): Reject $H_0$ if $z \ge z_{\alpha}$
Rejection rule (p-value approach): Reject $H_0$ if p-value $\le \alpha$

Two-tailed test
Hypotheses: $H_0: p = p_0$, $H_a: p \ne p_0$
Test statistic: $z = \dfrac{\bar{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$
Rejection rule (critical value approach): Reject $H_0$ if $z \le -z_{\alpha/2}$ or if $z \ge z_{\alpha/2}$
Rejection rule (p-value approach): Reject $H_0$ if p-value $\le \alpha$

Example:
Given: $H_0: p \le 0.2$, $H_a: p > 0.2$, $\bar{p} = 0.25$, $n = 400$ and $\alpha = 0.05$.

Answer:
Hypotheses: $H_0: p \le 0.2$, $H_a: p > 0.2$
Test statistic: $z = \dfrac{\bar{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}} = \dfrac{0.25 - 0.2}{\sqrt{\frac{0.2(1-0.2)}{400}}} = 2.50$

Using the critical value approach
Graph:
Rejection rule / rejection criteria: Reject $H_0$ if $z \ge z_{\alpha}$, i.e. reject $H_0$ if $z \ge 1.645$
Decision: Reject $H_0$ at a 5% level of significance since the test statistic ($z = 2.50$) is greater than the critical value ($z_{0.05} = 1.645$).

Using the p-value approach
Obtaining the p-value:
Rejection rule / rejection criteria: Reject $H_0$ if p-value $\le \alpha$
p-value: $P(Z > 2.50) = 0.0062$
Decision: Reject $H_0$ at a 5% level of significance, since p-value (0.0062) < $\alpha$ (0.05).

Conclusion: At a 5% level of significance it can be concluded that the population proportion is greater than 0.2.
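As a quick check of the proportion example above, the test statistic and upper tail p-value can be computed directly. The sketch below is illustrative only; `z_test_proportion_upper` is a hypothetical helper name, and it covers just the upper tail case used in the example. Note that the standard error uses the hypothesised value $p_0$, not the sample proportion.

```python
from math import sqrt
from statistics import NormalDist


def z_test_proportion_upper(p_bar, p0, n):
    """Return (z, p_value) for H0: p <= p0 against Ha: p > p0."""
    std_error = sqrt(p0 * (1 - p0) / n)   # standard error under H0
    z = (p_bar - p0) / std_error
    p_value = 1 - NormalDist().cdf(z)     # upper tail area P(Z >= z)
    return z, p_value


z, p_value = z_test_proportion_upper(0.25, 0.2, 400)
alpha = 0.05
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {p_value <= alpha}")
```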
Lab session component 4: Hypothesis testing in Excel

Outcomes: At the end of this section you should be able to
• use and understand the hypothesis testing template for the population mean in the case where $\sigma$ is known,
• set up and use a hypothesis testing template for the population mean in the case where $\sigma$ is unknown,
• set up and use a hypothesis testing template for the population proportion,
• set up and use a hypothesis testing template for the difference in population means in the cases where $\sigma_1$ and $\sigma_2$ are known, unknown but assumed equal, and unknown and not assumed equal,
• set up and use a hypothesis testing template for the difference in population proportions, and
• use your hypothesis testing templates to test hypotheses in the context of Project Work and interpret these results.

L4.1: Hypothesis tests for the population mean ($\sigma$ known case)

Hypothesis tests can be performed quite easily in Excel by making use of custom-made hypothesis testing templates. Decisions about these tests can be made based on a p-value approach. This is most easily explained by means of an example.

Example: Suppose that a manufacturer of golf balls believes that they have developed a new, more aerodynamic golf ball. The manufacturer believes that the new ball has an improved driving range of more than 295 yards. This hypothesis is given by
$H_0: \mu = 295$
$H_A: \mu > 295$
To test this belief, a sample of 50 golf balls is tested and the driving range for each ball is noted. These values are contained in the ‘GolfTest.xlsx’ file.

In order to perform this test, the data first needs to be copied into column A of the template. The template then calculates the sample size, sample mean, standard error of the sample mean and test statistic, as well as p-values for lower, upper and two-sided tests. The output can be seen in Figure 5. It should be noted that the user needs to enter the values for the population standard deviation as well as the hypothesized value.
Figure 5: The formulae and values obtained when using the hypothesis test template designed for tests for the population mean in the case where $\sigma$ is known.

The decision for the hypothesis test can now be made by looking at the relevant p-value. In our example we are performing an upper tailed test. From the output we obtain a p-value of 0.0628. We can therefore reject the null hypothesis at a 10% level of significance, but not at a 5% level of significance.

L4.2: Self-evaluation Exercise 4

Set up hypothesis testing templates for the population mean for the case when the population variance is unknown. Complete the following exercise.

1. Review Section L4.1 and make sure that you understand how to use a hypothesis testing template to aid you in solving hypothesis testing problems. Also review Sections L3.1 to L3.2 and your Class Notes Book to understand the link between hypothesis testing and confidence intervals. Add functions to your hypothesis testing templates for the mean (both $\sigma$ known and unknown cases) to calculate
(a) critical values for a hypothesis test based on a given level of significance, $\alpha$.
(b) two-sided confidence intervals based on a given level of significance, $\alpha$.
(c) one-sided confidence intervals based on a given level of significance, $\alpha$.
2. Set up a hypothesis testing template that can be used to solve problems involving the population proportion.
3. Set up a hypothesis testing template that can be used to solve problems involving the difference of two population means for the case where $\sigma_1$ and $\sigma_2$ are known.
4. Set up a hypothesis testing template that can be used to solve problems involving the difference of two population means for the case where $\sigma_1$ and $\sigma_2$ are unknown but assumed equal.
5. Set up a hypothesis testing template that can be used to solve problems involving the difference of two population means for the case where $\sigma_1$ and $\sigma_2$ are unknown but not assumed equal.
6.
Set up a hypothesis testing template that can be used to solve problems involving the difference of two population proportions.
7. Consider the templates set up in questions 2 to 6. Add appropriate functions to these templates in order to calculate
(a) critical values for a hypothesis test based on a given level of significance, $\alpha$.
(b) two-sided confidence intervals based on a given level of significance, $\alpha$.
(c) one-sided confidence intervals based on a given level of significance, $\alpha$.

Chapter 9 Self Evaluation Questions

Questions 1 to 5 are based on the following information:
According to regulations the maximum registered baggage weight is 20 kg. Passengers want to investigate the matter because they know that their baggage weight was less than 20 kg and they had to pay an unfair penalty for overweight baggage. A simple random sample of 12 pieces of baggage was selected and the weights (in kg) were recorded as follows:
17.7 19.7 20.5 17.8 20 21.9 18.5 19.4 17.8 11.8 16.9 14
Given: Test statistic $t = -2.472$
Test at $\alpha = 0.01$ whether the average baggage weight is less than 20 kg.

1. The hypothesis that is tested here, is:

2. The point estimate of the population mean is:

3. The point estimate of the population standard deviation is:

4. The p-value is in the interval:

5. The average baggage weight is:
a) significantly less than 20 kg, because t > -2.681
b) not significantly less than 20 kg, because t > -2.718
c) not significantly less than 20 kg, because t > -2.326
d) significantly less than 20 kg, because t > -3.055
e) not significantly less than 20 kg, because t > -3.106

Questions 6 to 9 are based on the following information:
A certain cell phone provider wants to prove that first year students spend on average less than 100 minutes a day on Mxit. It is also known that $\sigma = 25$ minutes. To test this claim a random sample of 50 students is selected. The sample average is calculated as 90 minutes.
Given: p-value = 0.0023

6.
The probability that the null hypothesis is true and wrongly rejected, is called the probability of a:

7. The hypotheses are:

8. The value of the test statistic is:

9. Which one of the following statements is true:
a) $H_0$ cannot be rejected at a 5% level of significance.
b) $H_0$ can be rejected at a 5% level of significance, but not at a 2.5% level of significance.
c) $H_0$ can be rejected at a 2.5% level of significance, but not at a 1% level of significance.
d) $H_0$ can be rejected at a 1% level of significance, but not at a 0.5% level of significance.
e) $H_0$ can be rejected at a 0.5% level of significance.

Questions 10 to 13 are based on the following information:
A certain bank group claims that 40% of students are using credit cards to make a purchase. To test this claim a random sample of 80 students is selected and it is found that 20 out of the 80 students are using a credit card to make a purchase.
The hypotheses tested here at a 1% level of significance are: $H_0: p = 0.4$, $H_a: p \ne 0.4$
Let: $\bar{p}$ = sample proportion of the students who pay with credit cards.
Given: $z = -2.74$

10. The standard error of the sampling proportion under the null hypothesis is:

11. The p-value is:

12. The proportion of students who use a credit card to make a purchase:
a) does not differ from 0.4 because $z \ne -2.33$.
b) is more than 0.4 because $z > -2.576$.
c) differs from 0.4 because $z < -2.33$.
d) is less than 0.4 because $z < -2.576$.
e) differs from 0.4 because $z < -2.576$.

13.
If the hypothesis tested changes to $H_0: p \ge 0.4$, $H_a: p < 0.4$, then the p-value is:

Chapter 10: Statistical inference about means with two populations

10.1 Inferences about the difference between two population means: $\sigma_1$ and $\sigma_2$ known

Population 1: Inner-City Store Customers
Population 2: Suburban Store Customers
$\mu_1$ = mean age of inner-city store customers
$\mu_2$ = mean age of suburban store customers
$\mu_1 - \mu_2$ = the difference between the two population means
$\bar{x}_1 - \bar{x}_2$ = the point estimator of the difference between the two population means
$\bar{x}_1$ = sample mean age for the inner-city store customers
$\bar{x}_2$ = sample mean age for the suburban store customers

Different types of hypotheses
One-tailed test, lower tail test: $H_0: \mu_1 - \mu_2 \ge D_0$, $H_a: \mu_1 - \mu_2 < D_0$
One-tailed test, upper tail test: $H_0: \mu_1 - \mu_2 \le D_0$, $H_a: \mu_1 - \mu_2 > D_0$
Two-tailed test: $H_0: \mu_1 - \mu_2 = D_0$, $H_a: \mu_1 - \mu_2 \ne D_0$

Example:
Given: As part of a study to evaluate differences in education quality between two training centres, a sample from each centre is drawn. Test at a 5% level of significance whether there is a statistically significant difference in the education quality.
Training Centre A: $n_1 = 30$, $\bar{x}_1 = 82$, $\sigma_1 = 10$
Training Centre B: $n_2 = 40$, $\bar{x}_2 = 78$, $\sigma_2 = 10$

Answer:
Hypotheses: $H_0: \mu_1 - \mu_2 = 0$, $H_a: \mu_1 - \mu_2 \ne 0$
Test statistic: $z = \dfrac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \dfrac{(82 - 78) - 0}{\sqrt{\frac{10^2}{30} + \frac{10^2}{40}}} = 1.66$

Using the p-value approach
Obtaining the p-value:
Rejection rule / rejection criteria: Reject $H_0$ if p-value $\le \alpha$
p-value: $2 \times P(Z > 1.66) = 2 \times 0.0485 = 0.097$

Using the critical value approach
Graph:
Rejection rule / rejection criteria: Reject $H_0$ if $z \le -z_{\alpha/2}$ or if $z \ge z_{\alpha/2}$, i.e. reject $H_0$ if $z \le -1.96$ or if $z \ge 1.96$
Decision: Do not reject $H_0$ at a 5% level of significance since the test statistic ($z = 1.66$) is between the critical values ($\pm z_{0.025} = \pm 1.96$).
Decision: Do not reject $H_0$ at a 5% level of significance, since p-value (0.097) > $\alpha$ (0.05).

Conclusion: Thus, at a 5% level of significance, the evidence is insufficient to indicate that there is a difference in the education quality.

10.2 Inferences about the difference between two population means: $\sigma_1$ and $\sigma_2$ unknown

Example:
Given: Consider a new software package developed to reduce the time required to design, develop and implement an information system. The researcher in charge of the new software evaluation project hopes to show that the new software package will provide a shorter mean project completion time. Use $\alpha = 0.05$.
Current Technology: $n_c = 12$, $\bar{x}_c = 325$, $s_c = 40$
New Software: $n_n = 12$, $\bar{x}_n = 286$, $s_n = 44$
$\mu_c$ = the mean project completion time for all systems analysts using the current technology
$\mu_n$ = the mean project completion time for all systems analysts using the new software package

Answer:
Hypotheses: $H_0: \mu_c - \mu_n \le 0$, $H_a: \mu_c - \mu_n > 0$

Degrees of freedom: Use the t-table with $df$ (degrees of freedom):
$df = \dfrac{\left(\frac{s_c^2}{n_c} + \frac{s_n^2}{n_n}\right)^2}{\frac{1}{n_c - 1}\left(\frac{s_c^2}{n_c}\right)^2 + \frac{1}{n_n - 1}\left(\frac{s_n^2}{n_n}\right)^2} = \dfrac{\left(\frac{40^2}{12} + \frac{44^2}{12}\right)^2}{\frac{1}{12 - 1}\left(\frac{40^2}{12}\right)^2 + \frac{1}{12 - 1}\left(\frac{44^2}{12}\right)^2} = 21.8 \approx 21$
Note: $df$ is rounded down to the nearest integer.

Test statistic: $t = \dfrac{(\bar{x}_c - \bar{x}_n) - D_0}{\sqrt{\frac{s_c^2}{n_c} + \frac{s_n^2}{n_n}}} = \dfrac{(325 - 286) - 0}{\sqrt{\frac{40^2}{12} + \frac{44^2}{12}}} = 2.27$

Using the critical value approach
Graph:
Rejection rule / rejection criteria: Reject $H_0$ if $t \ge t_{\alpha}$, i.e. reject $H_0$ if $t \ge 1.721$
Decision: Reject $H_0$ at a 5% level of significance since the test statistic ($t = 2.27$) is greater than the critical value ($t_{21,0.05} = 1.721$).

Using the p-value approach
Obtaining the p-value:
Rejection rule / rejection criteria: Reject $H_0$ if p-value $\le \alpha$
p-value: $P(T > 2.27) = 0.016929$ (using Excel)
From the probability tables: 0.01 < p-value < 0.025
Decision: Reject $H_0$ at a 5% level of significance, since p-value < $\alpha$ (0.05).
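The arithmetic behind the test statistic and the degrees of freedom in the software example is easy to get wrong by hand, so a short check may help. The following Python sketch is for illustration only: `unequal_variance_t` is a hypothetical helper name, and it computes only $t$ and $df$ (the Python standard library has no t-distribution, so the p-value is still taken from Excel or the tables).

```python
from math import floor, sqrt


def unequal_variance_t(x1, s1, n1, x2, s2, n2, d0=0.0):
    """Return (t, df) for two independent samples, sigmas unknown
    and not assumed equal. df is returned unrounded."""
    v1 = s1 ** 2 / n1                    # s1^2 / n1
    v2 = s2 ** 2 / n2                    # s2^2 / n2
    t = ((x1 - x2) - d0) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Current technology (c) versus new software (n)
t, df = unequal_variance_t(325, 40, 12, 286, 44, 12)
df_table = floor(df)                     # round down before using the t-table
print(f"t = {t:.2f}, df = {df:.1f}, use df = {df_table}")
```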
Conclusion: Thus, at a 5% level of significance it can be concluded that the mean project completion time is decreased when using the new software.

Alternative approach:
Inferences about the difference between two population means can also be made by assuming that the two unknown population standard deviations are equal. Under this assumption the two sample standard deviations are combined to provide the following pooled sample variance:
$s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
The t-test statistic then becomes:
$t = \dfrac{(\bar{y}_1 - \bar{y}_2) - D_0}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
with degrees of freedom equal to $n_1 + n_2 - 2$.

Note that this assumption is difficult to verify and population variances often differ. The pooled procedure may not provide satisfactory results, especially if the sample sizes are very different. This approach should therefore be followed with caution and will work best in a situation where the two sample sizes are approximately the same. Note that the assumption of equal variance needs to be tested and cannot merely be assumed. The procedure for this test is discussed in Section 11.2.

Example
Consider a new software package developed to reduce the time required to design, develop and implement an information system. The researcher in charge of the new software evaluation project hopes to show that the new software package will provide a shorter mean project completion time.
Use $\alpha = 0.05$ and assume $\sigma_1 = \sigma_2$.
Current Technology: $n_1 = 12$, $\bar{x}_1 = 325$, $s_1 = 40$
New Software: $n_2 = 12$, $\bar{x}_2 = 286$, $s_2 = 44$
$\mu_1$ = the mean project completion time for all systems analysts using the current technology
$\mu_2$ = the mean project completion time for all systems analysts using the new software package

Answer (using the critical value approach):
Hypotheses: $H_0: \mu_1 - \mu_2 \le 0$, $H_a: \mu_1 - \mu_2 > 0$
Test statistic:
$s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \dfrac{(12 - 1)1600 + (12 - 1)1936}{12 + 12 - 2} = 1768$
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - D_0}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \dfrac{325 - 286}{\sqrt{1768\left(\frac{2}{12}\right)}} \approx 2.272$
Rejection rule: Degrees of freedom = 12 + 12 − 2 = 22. Reject $H_0$ if $t \ge t_{22,0.05} = 1.717$.
Decision: Reject $H_0$
Conclusion: At a 5% level of significance we have enough evidence to conclude that the new software package will provide a shorter mean project completion time.

10.3 The Difference Between Two Population Means: Matched pairs (p438)

In the previous two sections we assumed that the elements in the two samples were obtained independently of each other. If, for example, we wanted to test the effectiveness of two different methods of assembly, we could train one set of workers to use method A and another separate group of workers to use method B. We can then select a sample from each of these groups. These two samples will be independent of each other since the workers using method A are independent of the workers using method B.

If, however, we trained all workers to use both methods, we could again randomly select a sample of workers. Each selected worker would then be expected to perform the assembly using both method A and method B. The order in which the methods are used will be randomly assigned to each worker, some performing A first, others performing B first. We will therefore end up with a pair of observations for each of the workers. The set of observations for workers using method A will be our first sample whilst the observations obtained using method B are the second sample.
This type of sampling design is known as a matched sample design and it is clear that the observations in the two samples are dependent. In a matched sample design the different methods are tested under similar conditions. This usually means that the sampling error is smaller for matched designs than for independent designs. The main reason for this is that the individual variation between observations in the two samples is eliminated since the same elements are observed in both samples.

Example:
Suppose that a shoe company wants to test material for the soles of shoes. For each pair of shoes, the new material is placed on one shoe and the old material on the other shoe. After a given period of time, a random sample of ten pairs of shoes is selected. The wear is measured on a ten-point scale (higher is better) with the following results:
Pair number:  1 2 3 4 5 6 7 8 9 10
New material: 2 4 5 7 7 5 9 8 8 7
Old material: 4 5 3 8 9 4 7 8 5 6
Test at a 1% level of significance whether the average wear for the new material is better than that of the old material.

10.4 The Difference Between Two Population Proportions: (p446)

Interval estimation:
$P(-z_{\alpha/2} \le Z \le z_{\alpha/2}) = 1 - \alpha$
$P\left(-z_{\alpha/2} \le \dfrac{(\bar{p}_1 - \bar{p}_2) - (p_1 - p_2)}{\sigma_{\bar{p}_1 - \bar{p}_2}} \le z_{\alpha/2}\right) = 1 - \alpha$
$P\left(-z_{\alpha/2}\,\sigma_{\bar{p}_1 - \bar{p}_2} \le (\bar{p}_1 - \bar{p}_2) - (p_1 - p_2) \le z_{\alpha/2}\,\sigma_{\bar{p}_1 - \bar{p}_2}\right) = 1 - \alpha$
$P\left(-z_{\alpha/2}\,\sigma_{\bar{p}_1 - \bar{p}_2} - (\bar{p}_1 - \bar{p}_2) \le -(p_1 - p_2) \le z_{\alpha/2}\,\sigma_{\bar{p}_1 - \bar{p}_2} - (\bar{p}_1 - \bar{p}_2)\right) = 1 - \alpha$
$P\left(z_{\alpha/2}\,\sigma_{\bar{p}_1 - \bar{p}_2} + (\bar{p}_1 - \bar{p}_2) \ge (p_1 - p_2) \ge -z_{\alpha/2}\,\sigma_{\bar{p}_1 - \bar{p}_2} + (\bar{p}_1 - \bar{p}_2)\right) = 1 - \alpha$
$P\left((\bar{p}_1 - \bar{p}_2) - z_{\alpha/2}\,\sigma_{\bar{p}_1 - \bar{p}_2} \le (p_1 - p_2) \le (\bar{p}_1 - \bar{p}_2) + z_{\alpha/2}\,\sigma_{\bar{p}_1 - \bar{p}_2}\right) = 1 - \alpha$
where
$\sigma_{\bar{p}_1 - \bar{p}_2} = \sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$
Since $p_1$ and $p_2$ are unknown, they are replaced by the sample proportions $\bar{p}_1$ and $\bar{p}_2$ in the standard error. Therefore, a $(1 - \alpha) \times 100\%$ confidence interval is given by
$(\bar{p}_1 - \bar{p}_2) \pm z_{\alpha/2}\sqrt{\dfrac{\bar{p}_1(1 - \bar{p}_1)}{n_1} + \dfrac{\bar{p}_2(1 - \bar{p}_2)}{n_2}}$

Example:
A tax preparation firm is interested in comparing the quality of work at two of its regional offices.
By randomly selecting samples of tax returns prepared at each office and verifying the sample returns’ accuracy, the firm will be able to estimate the proportion of erroneous returns prepared at each office. Of particular interest is the difference between these proportions. From Office 1 a sample of 250 had 35 returns with errors and from Office 2 a sample of 300 had 27 returns with errors.
A 90% confidence interval for the difference between the two proportions:

Hypothesis tests about $p_1 - p_2$

Under the assumption that $H_0$ is true as an equality, the population proportions are equal, $p_1 = p_2 = p$, and the standard error becomes:
$\sigma_{\bar{p}_1 - \bar{p}_2} = \sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}} = \sqrt{p(1 - p)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$
With $p$ unknown we pool, or combine, the point estimators from the two samples to obtain a single point estimator of $p$ as follows:
$\bar{p} = \dfrac{n_1\bar{p}_1 + n_2\bar{p}_2}{n_1 + n_2}$
The test statistic:
$z = \dfrac{\bar{p}_1 - \bar{p}_2}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$

Example:

Chapter 10 Self Evaluation Questions

Questions 1 to 4 are based on the following information:
Consider the following Excel spreadsheets with data for two independent random samples taken from two normal populations. Use $\alpha = 0.05$ to test the hypothesis that the population mean of sample 1 is greater than that of sample 2.
Formula worksheet: [worksheet image]
Value worksheet: [worksheet image]

1. The expected value of $\bar{x}_1 - \bar{x}_2$ under the null hypothesis is:
a) 0
b) 0.025
c) 0.49
d) 2
e) 0.05

2. The hypothesis tested here is:

3. The value of the test statistic is:

4. The critical value is:

Questions 5 to 9 are based on the following information:
Two independent samples taken from different departments show the average number of hours that lecturers spend on campus. XYZ University wants to test whether the population means differ significantly at a 10% level of significance.
Department 1: $n_1 = 4$, $\bar{x}_1 = 9.25$, $s_1 = 2.87$
Department 2: $n_2 = 5$, $\bar{x}_2 = 6.6$, $s_2 = 1.95$
Assume: Normal populations and the degrees of freedom for the t-test is 5.

5. The hypothesis tested here is:

6. The value of the test statistic is:

7. The p-value is in the interval:

8. Reject the null hypothesis at the 10% level of significance, if:

9. It can be concluded at the 10% level of significance that $H_0$ is:
a) rejected therefore sample sizes differ.
b) not rejected therefore population means do not differ.
c) not rejected therefore population standard deviations differ.
d) rejected therefore population means differ.
e) rejected therefore population standard deviations do not differ.

Chapter 11: Statistical inferences about two population variances

11.2 The difference between two population variances

In Chapter 10 we performed hypothesis tests for the difference of two population means. In order to choose an appropriate testing procedure, a number of assumptions had to be considered. Firstly, the two samples could be independent or dependent (matched pairs test). For independent samples, the variances could be known (a $z$-test is performed) or unknown (a $t$-test is performed). For the variance unknown case, a further assumption is needed in order to choose an appropriate test, namely whether or not the two population variances are equal. This section deals with that assumption.

Sampling distribution of $S_1^2/S_2^2$ when $\sigma_1^2 = \sigma_2^2$

Let $S_1^2$ and $S_2^2$ be the sample variances of two independent simple random samples of sizes $n_1$ and $n_2$. If the samples were selected from two normal populations with equal variances, the sampling distribution of $S_1^2/S_2^2$ is an $F$ distribution with $n_1 - 1$ degrees of freedom for the numerator and $n_2 - 1$ degrees of freedom for the denominator.
Different types of hypotheses

Hypotheses
• Lower tail test (one-tailed): $H_0: \sigma_1^2 = \sigma_2^2$ and $H_a: \sigma_1^2 < \sigma_2^2$, or equivalently $H_a: \sigma_1^2 / \sigma_2^2 < 1$
• Upper tail test (one-tailed): $H_0: \sigma_1^2 = \sigma_2^2$ and $H_a: \sigma_1^2 > \sigma_2^2$, or equivalently $H_a: \sigma_1^2 / \sigma_2^2 > 1$
• Two-tailed test: $H_0: \sigma_1^2 = \sigma_2^2$ and $H_a: \sigma_1^2 \ne \sigma_2^2$, or equivalently $H_a: \sigma_1^2 / \sigma_2^2 \ne 1$

Test statistic (all three tests)
$$F = \frac{S_1^2}{S_2^2}$$

Distribution of test statistic under $H_0$ (all three tests)
$$F(n_1 - 1, n_2 - 1)$$

Rejection rule: Critical value approach
• Lower tail test: Reject $H_0$ if $F \le F_{n_1-1,\,n_2-1;\,1-\alpha}$, i.e. if $F \le \dfrac{1}{F_{n_2-1,\,n_1-1;\,\alpha}}$
• Upper tail test: Reject $H_0$ if $F \ge F_{n_1-1,\,n_2-1;\,\alpha}$
• Two-tailed test: Reject $H_0$ if $F \le F_{n_1-1,\,n_2-1;\,1-\alpha/2}$, i.e. if $F \le \dfrac{1}{F_{n_2-1,\,n_1-1;\,\alpha/2}}$, or if $F \ge F_{n_1-1,\,n_2-1;\,\alpha/2}$

p-value calculation (with $f$ the observed value of the test statistic)
• Lower tail test: $P(F \le f)$
• Upper tail test: $P(F \ge f)$
• Two-tailed test: $\min\{2P(F \le f),\ 2P(F \ge f)\}$

Rejection rule: p-value approach (all three tests)
Reject $H_0$ if p-value $\le \alpha$

Example:
It is well known that the average stopping distance of vehicles is larger on a wet surface than on a dry surface. A student would like to test the theory that the variances of these stopping distances differ. A sample of 26 vehicles was tested under wet conditions, giving a sample variance of 48, whilst a sample of 16 vehicles was tested under dry conditions, giving a sample variance of 20. You may assume that the two samples are from normal distributions. Use a 5% level of significance to conduct your test.

Solution:
Given:
$n_w = 26$, $n_d = 16$, $s_w^2 = 48$, $s_d^2 = 20$

Hypotheses:
$H_0: \sigma_w^2 = \sigma_d^2$
$H_a: \sigma_w^2 \ne \sigma_d^2$

Test statistic:
$$f = \frac{s_w^2}{s_d^2} = \frac{48}{20} = 2.40$$

Rejection rule:
Numerator degrees of freedom $= 26 - 1 = 25$
Denominator degrees of freedom $= 16 - 1 = 15$

p-value:
Using Excel: p-value $= 2 \times P(F > 2.40) \approx 0.0812$
From the probability tables:
Since $2.28 < (f = 2.40) < 2.69$
$0.025 < P(F \ge 2.40) < 0.05$
$\therefore 2 \times 0.025 < 2 \times P(F \ge 2.40) < 2 \times 0.05$
$\therefore 0.05 <$ p-value $< 0.10$

Decision: Do not reject $H_0$

Conclusion: At a 5% level of significance we do not have enough evidence to conclude that the variances of the stopping distances on wet and dry surfaces differ.
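The p-value quoted from Excel in the example can be cross-checked without a statistics package. The sketch below is an illustration only, not the method used in the notes: it integrates the F density numerically with Simpson's rule (a swapped-in technique), and the function names `f_pdf` and `f_upper_tail` are hypothetical. It assumes a numerator degrees of freedom larger than 2, so that the density is zero at $x = 0$.

```python
import math


def f_pdf(x, d1, d2):
    """Density of the F distribution with d1 (numerator) and d2 (denominator) df.

    Assumes d1 > 2, so the density equals 0 at x = 0.
    """
    if x <= 0.0:
        return 0.0
    log_beta = math.lgamma(d1 / 2) + math.lgamma(d2 / 2) - math.lgamma((d1 + d2) / 2)
    log_pdf = ((d1 / 2) * math.log(d1 / d2)
               + (d1 / 2 - 1) * math.log(x)
               - ((d1 + d2) / 2) * math.log(1 + d1 * x / d2)
               - log_beta)
    return math.exp(log_pdf)


def f_upper_tail(f_obs, d1, d2, steps=2000):
    """Approximate P(F >= f_obs) as 1 minus Simpson's rule on [0, f_obs]."""
    h = f_obs / steps  # steps must be even for Simpson's rule
    total = f_pdf(0.0, d1, d2) + f_pdf(f_obs, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return 1.0 - total * h / 3


# Data from the stopping-distance example.
s2_wet, n_wet = 48, 26
s2_dry, n_dry = 20, 16
alpha = 0.05

f_obs = s2_wet / s2_dry
upper = f_upper_tail(f_obs, n_wet - 1, n_dry - 1)
p_value = 2 * min(upper, 1 - upper)  # two-tailed p-value
decision = "reject H0" if p_value <= alpha else "do not reject H0"
print(f"f = {f_obs:.2f}, p-value = {p_value:.4f}, decision: {decision}")
```

The result should agree with the Excel value of about 0.0812 and with the interval $0.05 <$ p-value $< 0.10$ read from the tables.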
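Hypothesis Testing Summary

Is this hypothesis test about:

• A population mean, $H_0: \mu = \mu_0$
    • $\sigma$ ($\sigma^2$) known:
      Confidence interval: $\bar{x} \pm z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$
      Test statistic: $z = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
    • $\sigma$ ($\sigma^2$) unknown:
      Confidence interval: $\bar{x} \pm t_{\alpha/2} \dfrac{s}{\sqrt{n}}$
      Test statistic: $t = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$, degrees of freedom (d.o.f.) $= n - 1$

• A population proportion, $H_0: p = p_0$
      Confidence interval: $\bar{p} \pm z_{\alpha/2} \sqrt{\dfrac{\bar{p}(1 - \bar{p})}{n}}$
      Test statistic: $z = \dfrac{\bar{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}}$

• The difference between two population means, $H_0: \mu_1 = \mu_2$
    • $\sigma_1, \sigma_2$ known:
      Confidence interval: $\bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2} \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
      Test statistic: $z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
    • $\sigma_1, \sigma_2$ unknown, $\sigma_1 = \sigma_2$:
      Test statistic: $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - D_0}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$, where $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$, d.o.f. $= n_1 + n_2 - 2$
    • Paired data, $H_0: \mu_d = 0$:
      Test statistic: $t = \dfrac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$

• The difference between two population proportions, $H_0: p_1 = p_2$
      Confidence interval: $\bar{p}_1 - \bar{p}_2 \pm z_{\alpha/2} \sqrt{\dfrac{\bar{p}_1(1 - \bar{p}_1)}{n_1} + \dfrac{\bar{p}_2(1 - \bar{p}_2)}{n_2}}$
      Test statistic: $z = \dfrac{\bar{p}_1 - \bar{p}_2}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$, where $\bar{p} = \dfrac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2}$

Hypothesis Testing Tree Diagram

One Sample

• Mean $\mu$
    • $\sigma$ known:
      $Z = \dfrac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} \sim N(0,1)$
      Confidence interval: $\bar{x} \pm z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$
    • $\sigma$ unknown: Assume $X \mathrel{\dot\sim} N(\mu, \sigma^2)$
      $T = \dfrac{\bar{X} - \mu_0}{S / \sqrt{n}} \sim t(n - 1)$
      Confidence interval: $\bar{x} \pm t_{\alpha/2} \dfrac{s}{\sqrt{n}}$
• Proportion $p$
      $Z = \dfrac{\bar{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}} \mathrel{\dot\sim} N(0,1)$
      Confidence interval: $\bar{p} \pm z_{\alpha/2} \sqrt{\dfrac{\bar{p}(1 - \bar{p})}{n}}$

Two Samples

• Proportions $p_1 - p_2$
      $Z = \dfrac{\bar{p}_1 - \bar{p}_2}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \mathrel{\dot\sim} N(0,1)$, where $\bar{p} = \dfrac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2}$
      Confidence interval: $(\bar{p}_1 - \bar{p}_2) \pm z_{\alpha/2} \sqrt{\dfrac{\bar{p}_1(1 - \bar{p}_1)}{n_1} + \dfrac{\bar{p}_2(1 - \bar{p}_2)}{n_2}}$
• Means
    • Dependent samples ($\mu_D$): Matched pairs
      Assume $D \mathrel{\dot\sim} N(\mu_D, \sigma_D^2)$. Then
      $T = \dfrac{\bar{D} - \mu_{D,0}}{S_D / \sqrt{n}} \sim t(n - 1)$
    • Independent samples ($\mu_1 - \mu_2$)
        • $\sigma_1, \sigma_2$ known:
          $Z = \dfrac{(\bar{X}_1 - \bar{X}_2) - D_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0,1)$
          Confidence interval: $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
        • $\sigma_1, \sigma_2$ unknown, equal:
          $T = \dfrac{(\bar{X}_1 - \bar{X}_2) - D_0}{S_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t(df)$, where $S_p^2 = \dfrac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$ and $df = n_1 + n_2 - 2$
        • $\sigma_1, \sigma_2$ unknown, unequal:
          $T = \dfrac{(\bar{X}_1 - \bar{X}_2) - D_0}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} \sim t(df)$, where $df = \dfrac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{1}{n_1 - 1}\left(\dfrac{S_1^2}{n_1}\right)^2 + \dfrac{1}{n_2 - 1}\left(\dfrac{S_2^2}{n_2}\right)^2}$

TABLES

Cumulative probabilities for the standard normal distribution

Cumulative probability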
z z -3.0 -2.9 -2.8 -2.7 -2.6 -2.5 -2.4 -2.3 -2.2 -2.1 -2.0 -1.9 -1.8 -1.7 -1.6 -1.5 -1.4 -1.3 -1.2 -1.1 -1.0 -0.9 -0.8 -0.7 -0.6 -0.5 -0.4 -0.3 -0.2 -0.1 -0.0 .00 .0013 .0019 .0026 .0035 .0047 .0062 .0082 .0107 .0139 .0179 .0228 .0287 .0359 .0446 .0548 .0668 .0808 .0968 .1151 .1357 .1587 .1841 .2119 .2420 .2743 .3085 .3446 .3821 .4207 .4602 .5000 .01 .0013 .0018 .0025 .0034 .0045 .0060 .0080 .0104 .0136 .0174 .0222 .0281 .0351 .0436 .0537 .0655 .0793 .0951 .1131 .1335 .1562 .1814 .2090 .2389 .2709 .3050 .3409 .3783 .4168 .4562 .4960 0 .02 .0013 .0018 .0024 .0033 .0044 .0059 .0078 .0102 .0132 .0170 .0217 .0274 .0344 .0427 .0526 .0643 .0778 .0934 .1112 .1314 .1539 .1788 .2061 .2358 .2676 .3015 .3372 .3745 .4129 .4522 .4920 .03 .0012 .0017 .0023 .0032 .0043 .0057 .0075 .0099 .0129 .0166 .0212 .0268 .0336 .0418 .0516 .0630 .0764 .0918 .1093 .1292 .1515 .1762 .2033 .2327 .2643 .2981 .3336 .3707 .4090 .4483 .4880 .04 .0012 .0016 .0023 .0031 .0041 .0055 .0073 .0096 .0125 .0162 .0207 .0262 .0329 .0409 .0505 .0618 .0749 .0901 .1075 .1271 .1492 .1736 .2005 .2296 .2611 .2946 .3300 .3669 .4052 .4443 .4840 .05 .0011 .0016 .0022 .0030 .0040 .0054 .0071 .0094 .0122 .0158 .0202 .0256 .0322 .0401 .0495 .0606 .0735 .0885 .1056 .1251 .1469 .1711 .1977 .2266 .2578 .2912 .3264 .3632 .4013 .4404 .4801 .06 .0011 .0015 .0021 .0029 .0039 .0052 .0069 .0091 .0119 .0154 .0197 .0250 .0314 .0392 .0485 .0594 .0721 .0869 .1038 .1230 .1446 .1685 .1949 .2236 .2546 .2877 .3228 .3594 .3974 .4364 .4761 .07 .0011 .0015 .0021 .0028 .0038 .0051 .0068 .0089 .0116 .0150 .0192 .0244 .0307 .0384 .0475 .0582 .0708 .0853 .1020 .1210 .1423 .1660 .1922 .2206 .2514 .2843 .3192 .3557 .3936 .4325 .4721 .08 .0010 .0014 .0020 .0027 .0037 .0049 .0066 .0087 .0113 .0146 .0188 .0239 .0301 .0375 .0465 .0571 .0694 .0838 .1003 .1190 .1401 .1635 .1894 .2177 .2483 .2810 .3156 .3520 .3897 .4286 .4681 .09 .0010 .0014 .0019 .0026 .0036 .0048 .0064 .0084 .0110 .0143 .0183 .0233 .0294 .0367 .0455 .0559 .0681 .0823 .0985 .1170 
.1379 .1611 .1867 .2148 .2451 .2776 .3121 .3483 .3859 .4247 .4641 Copyright Reserved 89 Cumulative probabilities for the standard normal distribution Cumulative probability 0 z .0 .1 .2 .3 .4 .5 .6 .7 .8 .9 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 3.0 .00 .5000 .5398 .5793 .6179 .6554 .6915 .7257 .7580 .7881 .8159 .8413 .8643 .8849 .9032 .9192 .9332 .9452 .9554 .9641 .9713 .9772 .9821 .9861 .9893 .9918 .9938 .9953 .9965 .9974 .9981 .9987 .01 .5040 .5438 .5832 .6217 .6591 .6950 .7291 .7611 .7910 .8186 .8438 .8665 .8869 .9049 .9207 .9345 .9463 .9564 .9649 .9719 .9778 .9826 .9864 .9896 .9920 .9940 .9955 .9966 .9975 .9982 .9987 .02 .5080 .5478 .5871 .6255 .6628 .6985 .7324 .7642 .7939 .8212 .8461 .8686 .8888 .9066 .9222 .9357 .9474 .9573 .9656 .9726 .9783 .9830 .9868 .9898 .9922 .9941 .9956 .9967 .9976 .9982 .9987 z .03 .5120 .5517 .5910 .6293 .6664 .7019 .7357 .7673 .7967 .8238 .8485 .8708 .8907 .9082 .9236 .9370 .9484 .9582 .9664 .9732 .9788 .9834 .9871 .9901 .9925 .9943 .9957 .9968 .9977 .9983 .9988 .04 .5160 .5557 .5948 .6331 .6700 .7054 .7389 .7704 .7995 .8264 .8508 .8729 .8925 .9099 .9251 .9382 .9495 .9591 .9671 .9738 .9793 .9838 .9875 .9904 .9927 .9945 .9959 .9969 .9977 .9984 .9988 .05 .5199 .5596 .5987 .6368 .6736 .7088 .7422 .7734 .8023 .8289 .8531 .8749 .8944 .9115 .9265 .9394 .9505 .9599 .9678 .9744 .9798 .9842 .9878 .9906 .9929 .9946 .9960 .9970 .9978 .9984 .9989 .06 .5239 .5636 .6026 .6406 .6772 .7123 .7454 .7764 .8051 .8315 .8554 .8770 .8962 .9131 .9279 .9406 .9515 .9608 .9686 .9750 .9803 .9846 .9881 .9909 .9931 .9948 .9961 .9971 .9979 .9985 .9989 .07 .5279 .5675 .6064 .6443 .6808 .7157 .7486 .7794 .8078 .8340 .8577 .8790 .8980 .9147 .9292 .9418 .9525 .9616 .9693 .9756 .9808 .9850 .9884 .9911 .9932 .9949 .9962 .9972 .9979 .9985 .9989 .08 .5319 .5714 .6103 .6480 .6844 .7190 .7517 .7823 .8106 .8365 .8599 .8810 .8997 .9162 .9306 .9429 .9535 .9625 .9699 .9761 .9812 .9854 .9887 .9913 .9934 .9951 .9963 .9973 .9980 .9986 
.9990 .09 .5359 .5753 .6141 .6517 .6879 .7224 .7549 .7852 .8133 .8389 .8621 .8830 .9015 .9177 .9319 .9441 .9545 .9633 .9706 .9767 .9817 .9857 .9890 .9916 .9936 .9952 .9964 .9974 .9981 .9986 .9990 Copyright Reserved 90 t – distribution tables: ๏ท ๏ท Area or Probability Symmetric around 0. Degrees of freedom (df) = n – 1. 0 Degrees of freedom 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 t Area in Upper Tail 0.20 1.376 1.061 0.978 0.941 0.920 0.906 0.896 0.889 0.883 0.879 0.876 0.873 0.870 0.868 0.866 0.865 0.863 0.862 0.861 0.860 0.859 0.858 0.858 0.857 0.856 0.856 0.855 0.855 0.854 0.854 0.853 0.853 0.853 0.852 0.852 0.852 0.851 0.851 0.851 0.851 0.850 0.850 0.850 0.850 0.850 0.850 0.849 0.849 0.849 0.849 0.10 3.078 1.886 1.638 1.533 1.476 1.440 1.415 1.397 1.383 1.372 1.363 1.356 1.350 1.345 1.341 1.337 1.333 1.330 1.328 1.325 1.323 1.321 1.319 1.318 1.316 1.315 1.314 1.313 1.311 1.310 1.309 1.309 1.308 1.307 1.306 1.306 1.305 1.304 1.304 1.303 1.303 1.302 1.302 1.301 1.301 1.300 1.300 1.299 1.299 1.299 0.05 6.314 2.920 2.353 2.132 2.015 1.943 1.895 1.860 1.833 1.812 1.796 1.782 1.771 1.761 1.753 1.746 1.740 1.734 1.729 1.725 1.721 1.717 1.714 1.711 1.708 1.706 1.703 1.701 1.699 1.697 1.696 1.694 1.692 1.691 1.690 1.688 1.687 1.686 1.685 1.684 1.683 1.682 1.681 1.680 1.679 1.679 1.678 1.677 1.677 1.676 0.025 12.706 4.303 3.182 2.776 2.571 2.447 2.365 2.306 2.262 2.228 2.201 2.179 2.160 2.145 2.131 2.120 2.110 2.101 2.093 2.086 2.080 2.074 2.069 2.064 2.060 2.056 2.052 2.048 2.045 2.042 2.040 2.037 2.035 2.032 2.030 2.028 2.026 2.024 2.023 2.021 2.020 2.018 2.017 2.015 2.014 2.013 2.012 2.011 2.010 2.009 0.01 31.821 6.965 4.541 3.747 3.365 3.143 2.998 2.896 2.821 2.764 2.718 2.681 2.650 2.624 2.602 2.583 2.567 2.552 2.539 2.528 2.518 2.508 2.500 2.492 2.485 2.479 2.473 2.467 2.462 2.457 2.453 2.449 2.445 2.441 2.438 2.434 2.431 2.429 2.426 2.423 2.421 2.418 2.416 2.414 
2.412 2.410 2.408 2.407 2.405 2.403 0.005 63.657 9.925 5.841 4.604 4.032 3.707 3.499 3.355 3.250 3.169 3.106 3.055 3.012 2.977 2.947 2.921 2.898 2.878 2.861 2.845 2.831 2.819 2.807 2.797 2.787 2.779 2.771 2.763 2.756 2.750 2.744 2.738 2.733 2.728 2.724 2.719 2.715 2.712 2.708 2.704 2.701 2.698 2.695 2.692 2.690 2.687 2.685 2.682 2.680 2.678 Copyright Reserved 91 t distribution (Continued) Degrees of freedom 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 ∞ Area in Upper Tail 0.20 0.10 0.05 0.025 0.01 0.005 0.849 0.849 0.848 0.848 0.848 0.848 0.848 0.848 0.848 0.848 0.848 0.847 0.847 0.847 0.847 0.847 0.847 0.847 0.847 0.847 0.847 0.847 0.847 0.847 0.846 0.846 0.846 0.846 0.846 0.846 0.846 0.846 0.846 0.846 0.846 0.846 0.846 0.846 0.846 0.846 0.846 0.846 0.846 0.845 0.845 0.845 0.845 0.845 0.845 0.845 0.842 1.298 1.298 1.298 1.297 1.297 1.297 1.297 1.296 1.296 1.296 1.296 1.295 1.295 1.295 1.295 1.295 1.294 1.294 1.294 1.294 1.294 1.293 1.293 1.293 1.293 1.293 1.293 1.292 1.292 1.292 1.292 1.292 1.292 1.292 1.292 1.291 1.291 1.291 1.291 1.291 1.291 1.291 1.291 1.291 1.291 1.290 1.290 1.290 1.290 1.290 1.282 1.675 1.675 1.674 1.674 1.673 1.673 1.672 1.672 1.671 1.671 1.670 1.670 1.669 1.669 1.669 1.668 1.668 1.668 1.667 1.667 1.667 1.666 1.666 1.666 1.665 1.665 1.665 1.665 1.664 1.664 1.664 1.664 1.663 1.663 1.663 1.663 1.663 1.662 1.662 1.662 1.662 1.662 1.661 1.661 1.661 1.661 1.661 1.661 1.660 1.660 1.645 2.008 2.007 2.006 2.005 2.004 2.003 2.002 2.002 2.001 2.000 2.000 1.999 1.998 1.998 1.997 1.997 1.996 1.995 1.995 1.994 1.994 1.993 1.993 1.993 1.992 1.992 1.991 1.991 1.990 1.990 1.990 1.989 1.989 1.989 1.988 1.988 1.988 1.987 1.987 1.987 1.986 1.986 1.986 1.986 1.985 1.985 1.985 1.984 1.984 1.984 1.960 2.402 2.400 2.399 2.397 2.396 2.395 2.394 2.392 2.391 2.390 2.389 2.388 2.387 2.386 2.385 2.384 2.383 2.382 2.382 2.381 2.380 2.379 2.379 2.378 2.377 2.376 
2.376 2.375 2.374 2.374 2.373 2.373 2.372 2.372 2.371 2.370 2.370 2.369 2.369 2.368 2.368 2.368 2.367 2.367 2.366 2.366 2.365 2.365 2.365 2.364 2.326 2.676 2.674 2.672 2.670 2.668 2.667 2.665 2.663 2.662 2.660 2.659 2.657 2.656 2.655 2.654 2.652 2.651 2.650 2.649 2.648 2.647 2.646 2.645 2.644 2.643 2.642 2.641 2.640 2.639 2.639 2.638 2.637 2.636 2.636 2.635 2.634 2.634 2.633 2.632 2.632 2.631 2.630 2.630 2.629 2.629 2.628 2.627 2.627 2.626 2.626 2.576 Note: As the sample size increases (and therefore the degrees of freedom increase), the t-values converge to the z-values for corresponding levels of ๐ผ. Copyright Reserved 92 Probability tables for the F distribution Area / Upper probability Denominator df (๐๐ ) Entries in the table give ๐น๐1 ,๐2 ; ๐ผ values, where ๐ผ is the area or probability in the upper tail of the F distribution. Numerator df (๐๐ ) 1 2 3 4 5 6 7 8 9 10 15 20 25 30 40 60 100 1000 1 0.1 0.05 0.025 0.01 39.86 161.45 647.79 4052.18 49.50 199.50 799.50 4999.50 53.59 215.71 864.16 5403.35 55.83 224.58 899.58 5624.58 57.24 230.16 921.85 5763.65 58.20 233.99 937.11 5858.99 58.91 236.77 948.22 5928.36 59.44 238.88 956.66 5981.07 59.86 240.54 963.28 6022.47 60.19 241.88 968.63 6055.85 61.22 245.95 984.87 6157.28 61.74 248.01 993.10 6208.73 62.05 249.26 998.08 6239.83 62.26 250.10 1001.41 6260.65 62.53 251.14 1005.60 6286.78 62.79 252.20 1009.80 6313.03 63.01 253.04 1013.17 6334.11 63.30 254.19 1017.75 6362.68 2 0.1 0.05 0.025 0.01 8.53 18.51 38.51 98.50 9.00 19.00 39.00 99.00 9.16 19.16 39.17 99.17 9.24 19.25 39.25 99.25 9.29 19.30 39.30 99.30 9.33 19.33 39.33 99.33 9.35 19.35 39.36 99.36 9.37 19.37 39.37 99.37 9.38 19.38 39.39 99.39 9.39 19.40 39.40 99.40 9.42 19.43 39.43 99.43 9.44 19.45 39.45 99.45 9.45 19.46 39.46 99.46 9.46 19.46 39.46 99.47 9.47 19.47 39.47 99.47 9.47 19.48 39.48 99.48 9.48 19.49 39.49 99.49 9.49 19.49 39.50 99.50 3 0.1 0.05 0.025 0.01 5.54 10.13 17.44 34.12 5.46 9.55 16.04 30.82 5.39 9.28 15.44 29.46 5.34 9.12 15.10 28.71 5.31 9.01 
14.88 28.24 5.28 8.94 14.73 27.91 5.27 8.89 14.62 27.67 5.25 8.85 14.54 27.49 5.24 8.81 14.47 27.35 5.23 8.79 14.42 27.23 5.20 8.70 14.25 26.87 5.18 8.66 14.17 26.69 5.17 8.63 14.12 26.58 5.17 8.62 14.08 26.50 5.16 8.59 14.04 26.41 5.15 8.57 13.99 26.32 5.14 8.55 13.96 26.24 5.13 8.53 13.91 26.14 4 0.1 0.05 0.025 0.01 4.54 7.71 12.22 21.20 4.32 6.94 10.65 18.00 4.19 6.59 9.98 16.69 4.11 6.39 9.60 15.98 4.05 6.26 9.36 15.52 4.01 6.16 9.20 15.21 3.98 6.09 9.07 14.98 3.95 6.04 8.98 14.80 3.94 6.00 8.90 14.66 3.92 5.96 8.84 14.55 3.87 5.86 8.66 14.20 3.84 5.80 8.56 14.02 3.83 5.77 8.50 13.91 3.82 5.75 8.46 13.84 3.80 5.72 8.41 13.75 3.79 5.69 8.36 13.65 3.78 5.66 8.32 13.58 3.76 5.63 8.26 13.47 5 0.1 0.05 0.025 0.01 4.06 6.61 10.01 16.26 3.78 5.79 8.43 13.27 3.62 5.41 7.76 12.06 3.52 5.19 7.39 11.39 3.45 5.05 7.15 10.97 3.40 4.95 6.98 10.67 3.37 4.88 6.85 10.46 3.34 4.82 6.76 10.29 3.32 4.77 6.68 10.16 3.30 4.74 6.62 10.05 3.24 4.62 6.43 9.72 3.21 4.56 6.33 9.55 3.19 4.52 6.27 9.45 3.17 4.50 6.23 9.38 3.16 4.46 6.18 9.29 3.14 4.43 6.12 9.20 3.13 4.41 6.08 9.13 3.11 4.37 6.02 9.03 6 0.1 0.05 0.025 0.01 3.78 5.99 8.81 13.75 3.46 5.14 7.26 10.92 3.29 4.76 6.60 9.78 3.18 4.53 6.23 9.15 3.11 4.39 5.99 8.75 3.05 4.28 5.82 8.47 3.01 4.21 5.70 8.26 2.98 4.15 5.60 8.10 2.96 4.10 5.52 7.98 2.94 4.06 5.46 7.87 2.87 3.94 5.27 7.56 2.84 3.87 5.17 7.40 2.81 3.83 5.11 7.30 2.80 3.81 5.07 7.23 2.78 3.77 5.01 7.14 2.76 3.74 4.96 7.06 2.75 3.71 4.92 6.99 2.72 3.67 4.86 6.89 Copyright Reserved 93 Area / Upper probability Denominator df (๐๐ ) 7 0.1 0.05 0.025 0.01 3.59 5.59 8.07 12.25 3.26 4.74 6.54 9.55 3.07 4.35 5.89 8.45 2.96 4.12 5.52 7.85 2.88 3.97 5.29 7.46 2.83 3.87 5.12 7.19 2.78 3.79 4.99 6.99 2.75 3.73 4.90 6.84 2.72 3.68 4.82 6.72 2.70 3.64 4.76 6.62 2.63 3.51 4.57 6.31 2.59 3.44 4.47 6.16 2.57 3.40 4.40 6.06 2.56 3.38 4.36 5.99 2.54 3.34 4.31 5.91 2.51 3.30 4.25 5.82 2.50 3.27 4.21 5.75 2.47 3.23 4.15 5.66 Numerator df (๐๐ ) 1 2 3 4 5 6 7 8 9 10 15 20 25 30 40 60 100 1000 8 
0.1 0.05 0.025 0.01 3.46 5.32 7.57 11.26 3.11 4.46 6.06 8.65 2.92 4.07 5.42 7.59 2.81 3.84 5.05 7.01 2.73 3.69 4.82 6.63 2.67 3.58 4.65 6.37 2.62 3.50 4.53 6.18 2.59 3.44 4.43 6.03 2.56 3.39 4.36 5.91 2.54 3.35 4.30 5.81 2.46 3.22 4.10 5.52 2.42 3.15 4.00 5.36 2.40 3.11 3.94 5.26 2.38 3.08 3.89 5.20 2.36 3.04 3.84 5.12 2.34 3.01 3.78 5.03 2.32 2.97 3.74 4.96 2.30 2.93 3.68 4.87 9 0.1 0.05 0.025 0.01 3.36 5.12 7.21 10.56 3.01 4.26 5.71 8.02 2.81 3.86 5.08 6.99 2.69 3.63 4.72 6.42 2.61 3.48 4.48 6.06 2.55 3.37 4.32 5.80 2.51 3.29 4.20 5.61 2.47 3.23 4.10 5.47 2.44 3.18 4.03 5.35 2.42 3.14 3.96 5.26 2.34 3.01 3.77 4.96 2.30 2.94 3.67 4.81 2.27 2.89 3.60 4.71 2.25 2.86 3.56 4.65 2.23 2.83 3.51 4.57 2.21 2.79 3.45 4.48 2.19 2.76 3.40 4.41 2.16 2.71 3.34 4.32 10 0.1 0.05 0.025 0.01 3.29 4.96 6.94 10.04 2.92 4.10 5.46 7.56 2.73 3.71 4.83 6.55 2.61 3.48 4.47 5.99 2.52 3.33 4.24 5.64 2.46 3.22 4.07 5.39 2.41 3.14 3.95 5.20 2.38 3.07 3.85 5.06 2.35 3.02 3.78 4.94 2.32 2.98 3.72 4.85 2.24 2.85 3.52 4.56 2.20 2.77 3.42 4.41 2.17 2.73 3.35 4.31 2.16 2.70 3.31 4.25 2.13 2.66 3.26 4.17 2.11 2.62 3.20 4.08 2.09 2.59 3.15 4.01 2.06 2.54 3.09 3.92 11 0.1 0.05 0.025 0.01 3.23 4.84 6.72 9.65 2.86 3.98 5.26 7.21 2.66 3.59 4.63 6.22 2.54 3.36 4.28 5.67 2.45 3.20 4.04 5.32 2.39 3.09 3.88 5.07 2.34 3.01 3.76 4.89 2.30 2.95 3.66 4.74 2.27 2.90 3.59 4.63 2.25 2.85 3.53 4.54 2.17 2.72 3.33 4.25 2.12 2.65 3.23 4.10 2.10 2.60 3.16 4.01 2.08 2.57 3.12 3.94 2.05 2.53 3.06 3.86 2.03 2.49 3.00 3.78 2.01 2.46 2.96 3.71 1.98 2.41 2.89 3.61 12 0.1 0.05 0.025 0.01 3.18 4.75 6.55 9.33 2.81 3.89 5.10 6.93 2.61 3.49 4.47 5.95 2.48 3.26 4.12 5.41 2.39 3.11 3.89 5.06 2.33 3.00 3.73 4.82 2.28 2.91 3.61 4.64 2.24 2.85 3.51 4.50 2.21 2.80 3.44 4.39 2.19 2.75 3.37 4.30 2.10 2.62 3.18 4.01 2.06 2.54 3.07 3.86 2.03 2.50 3.01 3.76 2.01 2.47 2.96 3.70 1.99 2.43 2.91 3.62 1.96 2.38 2.85 3.54 1.94 2.35 2.80 3.47 1.91 2.30 2.73 3.37 13 0.1 0.05 0.025 0.01 3.14 4.67 6.41 9.07 2.76 3.81 4.97 6.70 2.56 3.41 4.35 5.74 
2.43 3.18 4.00 5.21 2.35 3.03 3.77 4.86 2.28 2.92 3.60 4.62 2.23 2.83 3.48 4.44 2.20 2.77 3.39 4.30 2.16 2.71 3.31 4.19 2.14 2.67 3.25 4.10 2.05 2.53 3.05 3.82 2.01 2.46 2.95 3.66 1.98 2.41 2.88 3.57 1.96 2.38 2.84 3.51 1.93 2.34 2.78 3.43 1.90 2.30 2.72 3.34 1.88 2.26 2.67 3.27 1.85 2.21 2.60 3.18 14 0.1 0.05 0.025 0.01 3.10 4.60 6.30 8.86 2.73 3.74 4.86 6.51 2.52 3.34 4.24 5.56 2.39 3.11 3.89 5.04 2.31 2.96 3.66 4.69 2.24 2.85 3.50 4.46 2.19 2.76 3.38 4.28 2.15 2.70 3.29 4.14 2.12 2.65 3.21 4.03 2.10 2.60 3.15 3.94 2.01 2.46 2.95 3.66 1.96 2.39 2.84 3.51 1.93 2.34 2.78 3.41 1.91 2.31 2.73 3.35 1.89 2.27 2.67 3.27 1.86 2.22 2.61 3.18 1.83 2.19 2.56 3.11 1.80 2.14 2.50 3.02 Copyright Reserved 94 2.70 3.68 4.77 6.36 2.49 3.29 4.15 5.42 2.36 3.06 3.80 4.89 2.27 2.90 3.58 4.56 2.21 2.79 3.41 4.32 2.16 2.71 3.29 4.14 2.12 2.64 3.20 4.00 2.09 2.59 3.12 3.89 2.06 2.54 3.06 3.80 1.97 2.40 2.86 3.52 1.92 2.33 2.76 3.37 1.89 2.28 2.69 3.28 1.87 2.25 2.64 3.21 1.85 2.20 2.59 3.13 1.82 2.16 2.52 3.05 1.79 2.12 2.47 2.98 1.76 2.07 2.40 2.88 16 0.1 0.05 0.025 0.01 3.05 4.49 6.12 8.53 2.67 3.63 4.69 6.23 2.46 3.24 4.08 5.29 2.33 3.01 3.73 4.77 2.24 2.85 3.50 4.44 2.18 2.74 3.34 4.20 2.13 2.66 3.22 4.03 2.09 2.59 3.12 3.89 2.06 2.54 3.05 3.78 2.03 2.49 2.99 3.69 1.94 2.35 2.79 3.41 1.89 2.28 2.68 3.26 1.86 2.23 2.61 3.16 1.84 2.19 2.57 3.10 1.81 2.15 2.51 3.02 1.78 2.11 2.45 2.93 1.76 2.07 2.40 2.86 1.72 2.02 2.32 2.76 Area / Upper probability 3.07 4.54 6.20 8.68 Denominator df (๐๐ ) 15 0.1 0.05 0.025 0.01 Numerator df (๐๐ ) 1 2 3 4 5 6 7 8 9 10 15 20 25 30 40 60 100 1000 17 0.1 0.05 0.025 0.01 3.03 4.45 6.04 8.40 2.64 3.59 4.62 6.11 2.44 3.20 4.01 5.18 2.31 2.96 3.66 4.67 2.22 2.81 3.44 4.34 2.15 2.70 3.28 4.10 2.10 2.61 3.16 3.93 2.06 2.55 3.06 3.79 2.03 2.49 2.98 3.68 2.00 2.45 2.92 3.59 1.91 2.31 2.72 3.31 1.86 2.23 2.62 3.16 1.83 2.18 2.55 3.07 1.81 2.15 2.50 3.00 1.78 2.10 2.44 2.92 1.75 2.06 2.38 2.83 1.73 2.02 2.33 2.76 1.69 1.97 2.26 2.66 18 0.1 0.05 0.025 0.01 3.01 4.41 
5.98 8.29 2.62 3.55 4.56 6.01 2.42 3.16 3.95 5.09 2.29 2.93 3.61 4.58 2.20 2.77 3.38 4.25 2.13 2.66 3.22 4.01 2.08 2.58 3.10 3.84 2.04 2.51 3.01 3.71 2.00 2.46 2.93 3.60 1.98 2.41 2.87 3.51 1.89 2.27 2.67 3.23 1.84 2.19 2.56 3.08 1.80 2.14 2.49 2.98 1.78 2.11 2.44 2.92 1.75 2.06 2.38 2.84 1.72 2.02 2.32 2.75 1.70 1.98 2.27 2.68 1.66 1.92 2.20 2.58 19 0.1 0.05 0.025 0.01 2.99 4.38 5.92 8.18 2.61 3.52 4.51 5.93 2.40 3.13 3.90 5.01 2.27 2.90 3.56 4.50 2.18 2.74 3.33 4.17 2.11 2.63 3.17 3.94 2.06 2.54 3.05 3.77 2.02 2.48 2.96 3.63 1.98 2.42 2.88 3.52 1.96 2.38 2.82 3.43 1.86 2.23 2.62 3.15 1.81 2.16 2.51 3.00 1.78 2.11 2.44 2.91 1.76 2.07 2.39 2.84 1.73 2.03 2.33 2.76 1.70 1.98 2.27 2.67 1.67 1.94 2.22 2.60 1.64 1.88 2.14 2.50 20 0.1 0.05 0.025 0.01 2.97 4.35 5.87 8.10 2.59 3.49 4.46 5.85 2.38 3.10 3.86 4.94 2.25 2.87 3.51 4.43 2.16 2.71 3.29 4.10 2.09 2.60 3.13 3.87 2.04 2.51 3.01 3.70 2.00 2.45 2.91 3.56 1.96 2.39 2.84 3.46 1.94 2.35 2.77 3.37 1.84 2.20 2.57 3.09 1.79 2.12 2.46 2.94 1.76 2.07 2.40 2.84 1.74 2.04 2.35 2.78 1.71 1.99 2.29 2.69 1.68 1.95 2.22 2.61 1.65 1.91 2.17 2.54 1.61 1.85 2.09 2.43 21 0.1 0.05 0.025 0.01 2.96 4.32 5.83 8.02 2.57 3.47 4.42 5.78 2.36 3.07 3.82 4.87 2.23 2.84 3.48 4.37 2.14 2.68 3.25 4.04 2.08 2.57 3.09 3.81 2.02 2.49 2.97 3.64 1.98 2.42 2.87 3.51 1.95 2.37 2.80 3.40 1.92 2.32 2.73 3.31 1.83 2.18 2.53 3.03 1.78 2.10 2.42 2.88 1.74 2.05 2.36 2.79 1.72 2.01 2.31 2.72 1.69 1.96 2.25 2.64 1.66 1.92 2.18 2.55 1.63 1.88 2.13 2.48 1.59 1.82 2.05 2.37 22 0.1 0.05 0.025 0.01 2.95 4.30 5.79 7.95 2.56 3.44 4.38 5.72 2.35 3.05 3.78 4.82 2.22 2.82 3.44 4.31 2.13 2.66 3.22 3.99 2.06 2.55 3.05 3.76 2.01 2.46 2.93 3.59 1.97 2.40 2.84 3.45 1.93 2.34 2.76 3.35 1.90 2.30 2.70 3.26 1.81 2.15 2.50 2.98 1.76 2.07 2.39 2.83 1.73 2.02 2.32 2.73 1.70 1.98 2.27 2.67 1.67 1.94 2.21 2.58 1.64 1.89 2.14 2.50 1.61 1.85 2.09 2.42 1.57 1.79 2.01 2.32 Copyright Reserved 95 2.55 3.42 4.35 5.66 2.34 3.03 3.75 4.76 2.21 2.80 3.41 4.26 2.11 2.64 3.18 3.94 2.05 2.53 3.02 
3.71 1.99 2.44 2.90 3.54 1.95 2.37 2.81 3.41 1.92 2.32 2.73 3.30 1.89 2.27 2.67 3.21 1.80 2.13 2.47 2.93 1.74 2.05 2.36 2.78 1.71 2.00 2.29 2.69 1.69 1.96 2.24 2.62 1.66 1.91 2.18 2.54 1.62 1.86 2.11 2.45 1.59 1.82 2.06 2.37 1.55 1.76 1.98 2.27 24 0.1 0.05 0.025 0.01 2.93 4.26 5.72 7.82 2.54 3.40 4.32 5.61 2.33 3.01 3.72 4.72 2.19 2.78 3.38 4.22 2.10 2.62 3.15 3.90 2.04 2.51 2.99 3.67 1.98 2.42 2.87 3.50 1.94 2.36 2.78 3.36 1.91 2.30 2.70 3.26 1.88 2.25 2.64 3.17 1.78 2.11 2.44 2.89 1.73 2.03 2.33 2.74 1.70 1.97 2.26 2.64 1.67 1.94 2.21 2.58 1.64 1.89 2.15 2.49 1.61 1.84 2.08 2.40 1.58 1.80 2.02 2.33 1.54 1.74 1.94 2.22 25 0.1 0.05 0.025 0.01 2.92 4.24 5.69 7.77 2.53 3.39 4.29 5.57 2.32 2.99 3.69 4.68 2.18 2.76 3.35 4.18 2.09 2.60 3.13 3.85 2.02 2.49 2.97 3.63 1.97 2.40 2.85 3.46 1.93 2.34 2.75 3.32 1.89 2.28 2.68 3.22 1.87 2.24 2.61 3.13 1.77 2.09 2.41 2.85 1.72 2.01 2.30 2.70 1.68 1.96 2.23 2.60 1.66 1.92 2.18 2.54 1.63 1.87 2.12 2.45 1.59 1.82 2.05 2.36 1.56 1.78 2.00 2.29 1.52 1.72 1.91 2.18 Area / Upper probability 2.94 4.28 5.75 7.88 Denominator df (๐๐ ) 23 0.1 0.05 0.025 0.01 Numerator df (๐๐ ) 1 2 3 4 5 6 7 8 9 10 15 20 25 30 40 60 100 1000 26 0.1 0.05 0.025 0.01 2.91 4.23 5.66 7.72 2.52 3.37 4.27 5.53 2.31 2.98 3.67 4.64 2.17 2.74 3.33 4.14 2.08 2.59 3.10 3.82 2.01 2.47 2.94 3.59 1.96 2.39 2.82 3.42 1.92 2.32 2.73 3.29 1.88 2.27 2.65 3.18 1.86 2.22 2.59 3.09 1.76 2.07 2.39 2.81 1.71 1.99 2.28 2.66 1.67 1.94 2.21 2.57 1.65 1.90 2.16 2.50 1.61 1.85 2.09 2.42 1.58 1.80 2.03 2.33 1.55 1.76 1.97 2.25 1.51 1.70 1.89 2.14 27 0.1 0.05 0.025 0.01 2.90 4.21 5.63 7.68 2.51 3.35 4.24 5.49 2.30 2.96 3.65 4.60 2.17 2.73 3.31 4.11 2.07 2.57 3.08 3.78 2.00 2.46 2.92 3.56 1.95 2.37 2.80 3.39 1.91 2.31 2.71 3.26 1.87 2.25 2.63 3.15 1.85 2.20 2.57 3.06 1.75 2.06 2.36 2.78 1.70 1.97 2.25 2.63 1.66 1.92 2.18 2.54 1.64 1.88 2.13 2.47 1.60 1.84 2.07 2.38 1.57 1.79 2.00 2.29 1.54 1.74 1.94 2.22 1.50 1.68 1.86 2.11 28 0.1 0.05 0.025 0.01 2.89 4.20 5.61 7.64 2.50 3.34 4.22 5.45 
2.29 2.95 3.63 4.57 2.16 2.71 3.29 4.07 2.06 2.56 3.06 3.75 2.00 2.45 2.90 3.53 1.94 2.36 2.78 3.36 1.90 2.29 2.69 3.23 1.87 2.24 2.61 3.12 1.84 2.19 2.55 3.03 1.74 2.04 2.34 2.75 1.69 1.96 2.23 2.60 1.65 1.91 2.16 2.51 1.63 1.87 2.11 2.44 1.59 1.82 2.05 2.35 1.56 1.77 1.98 2.26 1.53 1.73 1.92 2.19 1.48 1.66 1.84 2.08 29 0.1 0.05 0.025 0.01 2.89 4.18 5.59 7.60 2.50 3.33 4.20 5.42 2.28 2.93 3.61 4.54 2.15 2.70 3.27 4.04 2.06 2.55 3.04 3.73 1.99 2.43 2.88 3.50 1.93 2.35 2.76 3.33 1.89 2.28 2.67 3.20 1.86 2.22 2.59 3.09 1.83 2.18 2.53 3.00 1.73 2.03 2.32 2.73 1.68 1.94 2.21 2.57 1.64 1.89 2.14 2.48 1.62 1.85 2.09 2.41 1.58 1.81 2.03 2.33 1.55 1.75 1.96 2.23 1.52 1.71 1.90 2.16 1.47 1.65 1.82 2.05 30 0.1 0.05 0.025 0.01 2.88 4.17 5.57 7.56 2.49 3.32 4.18 5.39 2.28 2.92 3.59 4.51 2.14 2.69 3.25 4.02 2.05 2.53 3.03 3.70 1.98 2.42 2.87 3.47 1.93 2.33 2.75 3.30 1.88 2.27 2.65 3.17 1.85 2.21 2.57 3.07 1.82 2.16 2.51 2.98 1.72 2.01 2.31 2.70 1.67 1.93 2.20 2.55 1.63 1.88 2.12 2.45 1.61 1.84 2.07 2.39 1.57 1.79 2.01 2.30 1.54 1.74 1.94 2.21 1.51 1.70 1.88 2.13 1.46 1.63 1.80 2.02 Copyright Reserved 96 40 0.1 0.05 0.025 0.01 2.84 4.08 5.42 7.31 2.44 3.23 4.05 5.18 2.23 2.84 3.46 4.31 2.09 2.61 3.13 3.83 2.00 2.45 2.90 3.51 1.93 2.34 2.74 3.29 1.87 2.25 2.62 3.12 1.83 2.18 2.53 2.99 1.79 2.12 2.45 2.89 1.76 2.08 2.39 2.80 1.66 1.92 2.18 2.52 1.61 1.84 2.07 2.37 1.57 1.78 1.99 2.27 1.54 1.74 1.94 2.20 1.51 1.69 1.88 2.11 1.47 1.64 1.80 2.02 1.43 1.59 1.74 1.94 1.38 1.52 1.65 1.82 60 0.1 0.05 0.025 0.01 2.79 4.00 5.29 7.08 2.39 3.15 3.93 4.98 2.18 2.76 3.34 4.13 2.04 2.53 3.01 3.65 1.95 2.37 2.79 3.34 1.87 2.25 2.63 3.12 1.82 2.17 2.51 2.95 1.77 2.10 2.41 2.82 1.74 2.04 2.33 2.72 1.71 1.99 2.27 2.63 1.60 1.84 2.06 2.35 1.54 1.75 1.94 2.20 1.50 1.69 1.87 2.10 1.48 1.65 1.82 2.03 1.44 1.59 1.74 1.94 1.40 1.53 1.67 1.84 1.36 1.48 1.60 1.75 1.30 1.40 1.49 1.62 100 0.1 0.05 0.025 0.01 2.76 3.94 5.18 6.90 2.36 3.09 3.83 4.82 2.14 2.70 3.25 3.98 2.00 2.46 2.92 3.51 1.91 2.31 2.70 3.21 
1.83 2.19 2.54 2.99 1.78 2.10 2.42 2.82 1.73 2.03 2.32 2.69 1.69 1.97 2.24 2.59 1.66 1.93 2.18 2.50 1.56 1.77 1.97 2.22 1.49 1.68 1.85 2.07 1.45 1.62 1.77 1.97 1.42 1.57 1.71 1.89 1.38 1.52 1.64 1.80 1.34 1.45 1.56 1.69 1.29 1.39 1.48 1.60 1.22 1.30 1.36 1.45 1000 0.1 0.05 0.025 0.01 2.71 3.85 5.04 6.66 2.31 3.00 3.70 4.63 2.09 2.61 3.13 3.80 1.95 2.38 2.80 3.34 1.85 2.22 2.58 3.04 1.78 2.11 2.42 2.82 1.72 2.02 2.30 2.66 1.68 1.95 2.20 2.53 1.64 1.89 2.13 2.43 1.61 1.84 2.06 2.34 1.49 1.68 1.85 2.06 1.43 1.58 1.72 1.90 1.38 1.52 1.64 1.79 1.35 1.47 1.58 1.72 1.30 1.41 1.50 1.61 1.25 1.33 1.41 1.50 1.20 1.26 1.32 1.38 1.08 1.11 1.13 1.16

WST143 Formula list

$$f(x) = \begin{cases} \dfrac{1}{b - a}, & a \le x \le b \\ 0, & \text{elsewhere} \end{cases}$$

$E(X) = \dfrac{a + b}{2}$    $var(X) = \dfrac{(b - a)^2}{12}$

$\bar{x} = \dfrac{1}{n} \sum_{i=1}^{n} x_i$

$E(\bar{X}) = \mu$    $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$    $Z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}}$

$E(\bar{p}) = p$    $\sigma_{\bar{p}} = \sqrt{\dfrac{p(1 - p)}{n}}$    $Z = \dfrac{\bar{p} - p}{\sigma_{\bar{p}}}$

Area of triangle = 0.5(base)(perpendicular height)
Area of rectangle = (base)(height)

$E(aX + b) = aE(X) + b$    $var(aX + b) = a^2\, var(X)$

$\bar{x} \pm z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$    $\bar{x} \pm t_{\alpha/2} \dfrac{s}{\sqrt{n}}$

$\bar{p} = \dfrac{x}{n}$    $\bar{p} \pm z_{\alpha/2} \sqrt{\dfrac{\bar{p}(1 - \bar{p})}{n}}$

$\bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2} \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$

$\bar{p}_1 - \bar{p}_2 \pm z_{\alpha/2} \sqrt{\dfrac{\bar{p}_1(1 - \bar{p}_1)}{n_1} + \dfrac{\bar{p}_2(1 - \bar{p}_2)}{n_2}}$

$n = \left(\dfrac{z_{\alpha/2}\, \sigma}{E}\right)^2$

$Z = \dfrac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$    $T = \dfrac{\bar{X} - \mu_0}{S / \sqrt{n}}$    $Z = \dfrac{\bar{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}}$

$Z = \dfrac{\bar{X}_1 - \bar{X}_2 - D_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$

$T = \dfrac{\bar{X}_1 - \bar{X}_2 - D_0}{S_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$ where $S_p^2 = \dfrac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$

$Z = \dfrac{\bar{p}_1 - \bar{p}_2}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$ where $\bar{p} = \dfrac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2}$

$T = \dfrac{\bar{D} - \mu_D}{S_D / \sqrt{n}}$

Optimisation Techniques
Supplemental Material

Reference: Swanepoel A, Vivier F, Millard SM and Ehlers R, Quantitative Statistical Techniques (Van Schaiks, 3rd Edition, 2009)
Please note that the notes supplied to you for this section of your module are compiled from the above-mentioned source.

Chapter 2: Differentiation

2.1 – 2.3 Functions, Limits & Continuity
Class discussion only

2.4 Rates of change

Rate of change (RC)
$$RC = \frac{\Delta y}{\Delta x}$$

Two types
• Average rate of change (ARC) over
the interval $[x_1, x_2]$ → Slope of the line segment
• Instantaneous rate of change (IRC) at the point $x$ → Slope of the tangent

Example of ARC
The supply of a certain product (in thousands) is given by the following function:
$$y = f(x) = 6x + x^2$$
Calculate the ARC over the interval $[5, 10]$.

$ARC = \dfrac{f(10) - f(5)}{10 - 5} = $ ______ OR $ARC = \dfrac{f(5) - f(10)}{5 - 10} = $ ______

where $f(10) = 6(10) + 10^2 = 160$ and $f(5) = 6(5) + 5^2 = 55$

2.5 The derivative of a function

The derivative of a function can be found by differentiation.

Example of IRC
If $f(x) = x^2$ then
$$
\begin{aligned}
f'(x) &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \\
&= \lim_{h \to 0} \frac{(x + h)^2 - x^2}{h} \\
&= \lim_{h \to 0} \frac{x^2 + 2xh + h^2 - x^2}{h} \\
&= \lim_{h \to 0} \frac{2xh + h^2}{h} \\
&= \lim_{h \to 0} \frac{h(2x + h)}{h} \\
&= \lim_{h \to 0} (2x + h) \\
&= 2x + 0
\end{aligned}
$$
$\therefore f'(x) = 2x$

Notation:
• $f'(x)$
• $\dfrac{dy}{dx}$
• $\dfrac{d}{dx} f(x)$

2.6 Rules of differentiation

Rule 1
If $f(x) = k$, where $k$ is a constant, then $f'(x) = 0$.
1. $f(x) = 5$ then $f'(x) = $ ______
2. $f(x) = x^0 = $ ______ then $f'(x) = $ ______

Rule 2
If $f(x) = x^n$, where $n$ is a real number and $n \ne 0$, then $f'(x) = n x^{n-1}$.
1. $f(x) = x^6$ then $f'(x) = $ ______
2. $g(x) = x$ then $g'(x) = $ ______
3. $f(x) = \dfrac{1}{x} = $ ______ then $f'(x) = $ ______
4. $y = \sqrt[3]{x^2} = $ ______ then $\dfrac{dy}{dx} = $ ______
5. $f(x) = \dfrac{1}{\sqrt{x}} = $ ______ then $f'(x) = $ ______

Rule 3
If $f(x) = k\,g(x)$, where $k$ is a constant, then $f'(x) = k\,g'(x)$.
1. $f(x) = 2x^2$ then $f'(x) = $ ______
2. $f(x) = \dfrac{1}{2x} = $ ______ then $f'(x) = $ ______
3. $h(x) = -7x^2\,[\ldots]/t$ then $h'(x) = $ ______

Rule 4
If $f(x) = g(x) + h(x)$ then $f'(x) = g'(x) + h'(x)$.
If $f(x) = g(x) - h(x)$ then $f'(x) = g'(x) - h'(x)$.
1. $f(x) = 3 + x^2$ then $f'(x) = $ ______
2. $r = \sqrt{2t} - \dfrac{1}{t} + 2s$. Calculate $\dfrac{dr}{ds}$ and $\dfrac{dr}{dt}$.
First we calculate $\dfrac{dr}{ds} = 0 - 0 + 2 = 2$.
In order to calculate $\dfrac{dr}{dt}$ we first rewrite $r$ as:
$r = \sqrt{2}\, t^{1/2} - t^{-1} + 2s$ then $\dfrac{dr}{dt} = $ ______

Example of Rules 1 to 4:
The price of a product (in Rand) depends on the quantity of the product sold and is given by
$$p = 350 - 0.08q - 0.002q^2$$
(a) Calculate the sales price if 40 items are sold.
(b) Calculate the marginal income of the 40 items.

Answers:
(a) For $q = 40$ we find $p = 350 - 0.08(40) - 0.002(40)^2 = $ ______
(b) The income function is
$I(q) = pq = (350 - 0.08q - 0.002q^2)q = 350q - 0.08q^2 - 0.002q^3$
The marginal income function is $I'(q) = $ ______
The marginal income for the 40 items is $I'(40) = $ ______

Rule 5 (Product Rule)
If $f(x) = g(x) \cdot h(x)$, then $f'(x) = h(x) \cdot g'(x) + g(x) \cdot h'(x)$.
1. $f(x) = x^2(x - 6)$ then $f'(x) = $ ______
2. $g(x) = (x^2 + 3x - 4)(5x^3 + 2x)$ then $g'(x) = $ ______
3. $h(x) = \dfrac{5}{\sqrt{x}}\left(x^2 + \dfrac{1}{x^2}\right) = $ ______ then $h'(x) = $ ______

Rule 6 (Quotient Rule)
If $f(x) = \dfrac{g(x)}{h(x)}$, then it follows that $f'(x) = \dfrac{h(x)g'(x) - g(x)h'(x)}{[h(x)]^2}$.
1. $f(x) = \dfrac{3x}{x^2 + 1}$ then $f'(x) = $ ______
2. $g(x) = \dfrac{6x^2 - 1}{x^4 + 5x + 1}$ then $g'(x) = $ ______
3. Let $r = \dfrac{1}{x^3}$. We could calculate $\dfrac{dr}{dx}$ by rewriting $r$ as $r = x^{-3}$, and we get $\dfrac{dr}{dx} = (-3)x^{-4} = \dfrac{-3}{x^4}$.
This same question could also be answered using the quotient rule: $\dfrac{dr}{dx} = $ ______

Example of the quotient rule
The profit of tea produced is given by $P(x) = \dfrac{500x}{x + 20} - 2x$, with $x$ the amount in 100 kg and $P(x)$ the profit in R1000.

1. Calculate the profit if 1500 kg of tea is produced.
$x = 15$ (Note the unit!)
$P(15) = \dfrac{500(15)}{15 + 20} - 2(15) = 184.28571$, and $184.28571 \times 1000$ (Note the unit!) $= R184\,285.71$

2. Calculate the marginal profit function.
$$P'(x) = \frac{(x + 20)(500) - (500x)(1)}{(x + 20)^2} - 2 = \frac{500x + 10\,000 - 500x}{(x + 20)^2} - 2 = \frac{10\,000}{(x + 20)^2} - 2$$

3. Calculate the marginal profit if:
(a) 1500 kg of tea is produced, $x = 15$:
$P'(15) = \dfrac{10\,000}{(15 + 20)^2} - 2 = 6.16327$, and $6.16327 \times 1000 = R6\,163.27$
(b) 15000 kg of tea is produced, $x = 150$:
$P'(150) = \dfrac{10\,000}{(150 + 20)^2} - 2 = -1.65398$, and $-1.65398 \times 1000 = -R1\,653.98$

Rule 7 (Chain Rule)
If $y = f\{g(x)\}$ then $\dfrac{dy}{dx} = f'\{g(x)\} \cdot g'(x)$.
1. $f(x) = (4x^2 - 5x + 6)^3$ then $f'(x) = $ ______
2. $y = (2x + 5)^5 (3x^2 + 7)^{1/2}$ then $\dfrac{dy}{dx} = $ ______
3. $k = \dfrac{1}{x^4 + x^2 + 1}$. In this form we can find $\dfrac{dk}{dx}$ using the quotient rule.
If we rewrite $k$ we can find $\dfrac{dk}{dx}$ using the chain rule.
Rewriting $k$ we find $k = \dfrac{1}{x^4 + x^2 + 1} = (x^4 + x^2 + 1)^{-1}$
$\dfrac{dk}{dx} = $ ______
4. $y = \sqrt{x^3 + 5} = $ ______
$\dfrac{dy}{dx} = $ ______

Example
Calculate the equation of the tangent line of $f(x) = \dfrac{1}{x - 1}$ at $x = 3$.

Answer
The equation of a line can be obtained using the equation $y = y_1 + b(x - x_1)$.
Firstly, we need one co-ordinate $(x_1, y_1)$.
Clearly $x_1 = 3$. To calculate $y_1$ we use $y_1 = f(x_1) = f(3) = \dfrac{1}{3 - 1} = \dfrac{1}{2}$.
Therefore, $(x_1, y_1) = (3, 0.5)$.
To obtain the value of $b$, we need the derivative:
$f(x) = \dfrac{1}{x - 1} = (x - 1)^{-1}$
$f'(x) = $ ______
$f'(3) = $ ______ $= b$
Now we substitute the values of $x_1$, $y_1$ and $b$ into the point-slope formula:
$y = y_1 + b(x - x_1)$

Note:
• If the equation of the tangent line is asked, the answer is $y = 1.25 - 0.25x$.
• If only the intercept is asked, the answer is 1.25.
• If only the slope is asked, the answer is -0.25.

2.7 Inverse functions and their derivatives

Consider the following function:
$y = 3x + 2$
The inverse function of $y = f(x)$ is found as follows:
$y = 3x + 2$
$3x = y - 2$
$x = \dfrac{1}{3}y - \dfrac{2}{3}$ (Inverse function)
$x = g(y) = f^{-1}(y)$ (Inverse function)

Note: $\dfrac{dy}{dx} = $ ______ and $\dfrac{dx}{dy} = $ ______

Rule 8
$$\frac{dy}{dx} \cdot \frac{dx}{dy} = 1 \quad \text{or} \quad \frac{dx}{dy} = \frac{1}{\dfrac{dy}{dx}}$$

NOTE: Rule 8 only holds for one-to-one functions. For example, $y = x^2 + 5$ would not work.

[Graph of $y = x^2 + 5$, with $x$ from $-4$ to $4$ on the horizontal axis and $y$ on the vertical axis]

Important:
Although one can use Rule 8 to find the derivative of an inverse function, there is an easier way to find the derivative of an inverse function. If you want to find $\dfrac{dx}{dy}$, you can rewrite the equation so that $x$ is on the left-hand side and the other variables and constants are on the right-hand side.
For example, $y = (x - 2)^3$:
Derivative of the function: $y = (x - 2)^3$, so $\frac{dy}{dx} = 3(x - 2)^2(1) = 3(x - 2)^2$.
Derivative of the inverse: $x - 2 = y^{1/3}$, so $x = y^{1/3} + 2$ (inverse function) and $\frac{dx}{dy} = \frac{1}{3}y^{-2/3}$.

2.8 The derivatives of special functions
Not included

2.9 Higher derivatives
Calculate the fourth order derivative of $f(x) = x^4 - 3x^3 + x^2 + 7x - 19$
$f'(x) = 4x^3 - 9x^2 + 2x + 7$
$f''(x) = 12x^2 - 18x + 2$
$f'''(x) = 24x - 18$
$f^{iv}(x) = 24$
$f^{v}(x) = 0$
Calculate the second order derivative of $f(x) = e^{3x^2 - 7}$:
$f'(x) = $
$f''(x) = $

2.10 Optimization problems
Need to find max/min values.
Figure 2.10.1
• Absolute max: A
• Relative max: C and E
• Absolute min: D
• Relative min: B
• All max / min values are extreme values
• Critical values: $x$-values that might indicate extreme values
• Consider a value $x = k$ with $f'(k) = 0$. How do we test whether $k$ will lead to a relative min or a relative max?
Calculate $f''(k)$:
$f''(k) < 0$: then $k$ leads to a relative max value
$f''(k) > 0$: then $k$ leads to a relative min value
$f''(k) = 0$: examine the values of the function around $x = k$ ⇒ possible inflection point
Figure 2.10.3: Rel max in the point k
Figure 2.10.4: Rel min in the point k

Example of optimization with one critical value
Calculate the extreme and critical value(s) for: $f(x) = 16x - x^2$
$f'(x) = 16 - 2x$
$f''(x) = -2$
Critical value(s): Set $f'(x) = 0$
$16 - 2x = 0$
$x = 8$
Therefore, $x = 8$ is a critical value.
Type of extreme value: $f''(8) = -2 < 0$ ∴ $x = 8$ leads to a relative maximum.
Extreme value: $f(8) = 16(8) - (8)^2 = 64$

Example of optimization with two critical values
Calculate the extreme and critical values for: $f(x) = 3x^4 - 4x^3$
$f'(x) = 12x^3 - 12x^2$
$f''(x) = 36x^2 - 24x$
Critical values: Set $f'(x) = 0$
$12x^3 - 12x^2 = 0$
$12x^2(x - 1) = 0$
$12x^2 = 0$ or $x - 1 = 0$
Therefore, $x = 0$ and $x = 1$ are the critical values.
Type of extreme values:
→ $f''(1) = 36(1)^2 - 24(1) = 12 > 0$. ∴ $x = 1$ leads to a relative min.
→ $f''(0) = 36(0)^2 - 24(0) = 0$. ∴ $x = 0$ leads to an inflection point.
How do we know that this leads to an inflection point? We examine the values of the function around $x = 0$.

$x$       $f(x) = 3x^4 - 4x^3$
-0.1      $3(-0.1)^4 - 4(-0.1)^3 = 0.0043$
0         $3(0)^4 - 4(0)^3 = 0$
0.1       $3(0.1)^4 - 4(0.1)^3 = -0.0037$

Graphical representation of an inflection point: [figure]
Extreme value: $f(1) = 3(1)^4 - 4(1)^3 = -1$.

Homework (work through this example on your own)
The cost (in Rand) to manufacture $x$ products: $C(x) = 0.01x^2 + 20x + 1\,500$
The income (in Rand) if $x$ products are sold: $I(x) = 70x - 0.04x^2$
How many products should be sold if we want to maximize the profit?
$P(x) = I(x) - C(x) = 70x - 0.04x^2 - (0.01x^2 + 20x + 1\,500) = -0.05x^2 + 50x - 1\,500$
$P'(x) = -0.1x + 50$
$P''(x) = -0.1$
Critical value(s): Set $P'(x) = 0$
$-0.1x + 50 = 0$
Therefore, $x = 500$ is a critical value.
Type of extreme value: $P''(500) = -0.1 < 0$ ∴ $x = 500$ leads to a relative maximum.
Hence, to earn the max profit we need to sell 500 products.
Calculate the maximum profit.
$P(500) = -0.05(500)^2 + 50(500) - 1\,500 = R\,11\,000$

Example of optimization with one critical value
A manufacturer produces garden chairs at a cost of R20 a chair, and his overhead cost is R3 000 a week. From previous experience he knows that he will sell $2000 - 40x$ chairs a week if he charges R$x$ a chair. What must the price be, and how many chairs must he sell a week, to maximize his profit?
Given:
• Cost per chair:
• Overhead cost:
• Number of chairs:
• Sales price:
Answer:
Profit per chair:
Total profit $P(x) = $
$P'(x) = $
$P''(x) = $
Critical value(s): Set $P'(x) = 0$
$-80x + 2800 = 0$
$x = 35$
Therefore, $x = 35$ is a critical value.
Type of extreme value: $P''(35) = -80 < 0$ ∴ $x = 35$ leads to a relative maximum.
The profit is maximized if the chair is sold for R35.
Number of chairs to be sold: $2000 - 40(35) = 600$.
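The garden-chair answer above can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes the total profit is (price minus cost per chair) times chairs sold, minus the weekly overhead, which is consistent with the printed condition $-80x + 2800 = 0$. The function names `chairs_sold` and `profit` are illustrative only.

```python
# Sketch: numerical check of the garden-chair example.
# Assumption: P(x) = (x - 20)(2000 - 40x) - 3000, reconstructed from the
# problem statement (R20 cost per chair, R3000 overhead per week).

def chairs_sold(price):
    """Weekly demand when each chair is priced at `price` rand."""
    return 2000 - 40 * price


def profit(price):
    """Weekly profit: margin per chair times chairs sold, minus overhead."""
    return (price - 20) * chairs_sold(price) - 3000


# Expanding gives P(x) = -40x^2 + 2800x - 43000, so P'(x) = -80x + 2800.
a, b = -40, 2800
critical_price = -b / (2 * a)   # solves P'(x) = 0
second_derivative = 2 * a       # P''(x) is constant for a quadratic

print("critical price:", critical_price)
print("P'' at critical price:", second_derivative)
print("chairs sold:", chairs_sold(critical_price))
print("weekly profit:", profit(critical_price))
```

Because the second derivative is negative everywhere, the single critical price is the maximum; evaluating `profit` at neighbouring prices such as 34 and 36 gives a quick sanity check.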
Check that you understand the meaning of the terms:
• Gross profit
• Nett profit

Questions 1 to 3 are based on the following information:
An analysis of the financial statements of a coal mine indicates that when $x$ tons of coal are extracted per day, the income and cost (in Rands) of the mine are, respectively:
$I(x) = 1210x - 2x^2$ and $C(x) = x^2 - 2x + 1000$
The mine is taxed at a rate of 40% on its gross profit.

Question 1: Determine the value of $x$ that maximises the income.
Answer 1:
$I(x) = 1210x - 2x^2$
$I'(x) = 1210 - 4x$
$I''(x) = -4$
Critical value(s): Set $I'(x) = 0$
$1210 - 4x = 0$
$x = 302.5$
Therefore, $x = 302.5$ is a critical value.
Type of extreme value: $I''(302.5) = -4 < 0$ ∴ $x = 302.5$ leads to a relative maximum.

Question 2: Calculate the gross profit and the value of $x$ that maximises it:
Answer 2:
$GP(x) = I(x) - C(x) = 1210x - 2x^2 - (x^2 - 2x + 1000) = 1212x - 3x^2 - 1000$
$GP'(x) = 1212 - 6x$
$GP''(x) = -6$
Critical value(s): Set $GP'(x) = 0$
$1212 - 6x = 0$
$x = 202$
Therefore, $x = 202$ is a critical value.
Type of extreme value: $GP''(202) = -6 < 0$ ∴ $x = 202$ leads to a relative maximum.

Question 3: Calculate the nett profit and the value of $x$ that maximises it:
Answer 3:
$NP(x) = GP(x) - 0.4\,GP(x)$ or $NP(x) = 0.6\,GP(x)$
$= 1212x - 3x^2 - 1000 - 0.4(1212x - 3x^2 - 1000)$
$= 727.2x - 1.8x^2 - 600$
$NP'(x) = 727.2 - 3.6x$
$NP''(x) = -3.6$
Critical value(s): Set $NP'(x) = 0$
$727.2 - 3.6x = 0$
$x = 202$
Therefore, $x = 202$ is a critical value.
Type of extreme value: $NP''(202) = -3.6 < 0$ ∴ $x = 202$ leads to a relative maximum.

Extra Question 1
We need to enclose a field with a fence. We have 150 meters of fencing material and a building is on one side of the field and so won't need any fencing. Determine the dimensions of the field that will enclose the largest area.
Extra Question 1 Solution:
In this problem we have two functions: the first being the function that we are actually trying to optimise (this can also be referred to as a goal function) and a second function called a constraint function. Consider a sketch of the situation: [sketch]
In this problem we want to maximize the area of a field and we know that it will use 150 m of fencing material. So, the area will be the function we are trying to optimise and the amount of fencing is the constraint. The two equations for these are:
Maximise: $A = xy$ and Constraint: $150 = x + 2y$.
We can rewrite $A$ as a function of $y$ only. From the constraint function it follows that $x = 150 - 2y$, so it then follows that:
$A = A(y) = (150 - 2y)y = 150y - 2y^2$.
We need to find the value of $y$ in the interval $[0, 75]$ such that the area function will be maximised. Note that the interval is obtained by setting $y = 0$ (i.e. assuming the fence has no sides) and $y = 75$ (i.e. two sides and no width; if there are only two sides, each must be 75 m to use the whole 150 m).
Next, we calculate $A'(y) = 150 - 4y$, set $A'(y) = 0$ and solve for $y$. It follows that
$150 - 4y = 0$
∴ $y = \frac{150}{4} = 37.5$.
To verify that $A$ is maximised when $y = 37.5$, consider the second derivative of $A$: $A''(y) = -4$. Since $A''(37.5) = -4 < 0$ it follows that $A$ has a relative maximum where $y = 37.5$.
From the constraint function it follows that when $y = 37.5$, then $x = 150 - 2(37.5) = 150 - 75 = 75$.
The maximum area we can obtain using the 150 m of fencing is $A = 75\,\text{m} \times 37.5\,\text{m} = 2812.5\,\text{m}^2$.
(Try repeating the above example by rewriting $A$ as a function of $x$ only, using similar principles, and see if you obtain the same answer.)

Extra Question 2
After playing paintball with his friends and winning the match, Evert decides to shoot some paintballs in the air to celebrate his tremendous accomplishment.
Suppose that the height of the paintball in the air (in meters) at any given moment in time, $t$, is given by the function $D(t) = 60 + 6t - t^2$. What is the maximum height the paintball reaches and at what point in time is this achieved?

Chapter 3: Integration
3.2 Indefinite integrals
$F(x)$: anti derivative
$\int$: integral sign
$f(x)$: integrand
$dx$: operation
$c$: integral constant

Rule 1
$\int x^n \, dx = \frac{1}{n+1} x^{n+1} + c$ for $n \neq -1$
1. $\int x^5 \, dx = $
2. $\int \frac{1}{x^2} \, dx = \int x^{-2} \, dx = $
3. $\int \sqrt{x^3} \, dx = \int x^{3/2} \, dx = $
4. $\int x \, dx = \int x^1 \, dx = $

Rule 2
$\int k f(x) \, dx = k \int f(x) \, dx$
1. $\int \frac{800}{x^9} \, dx = 800 \int \frac{1}{x^9} \, dx = 800 \int x^{-9} \, dx = $
2. $\int 8 \, dx = $

Rule 6
$\int [f(x) + g(x)] \, dx = \int f(x) \, dx + \int g(x) \, dx$
$\int [f(x) - g(x)] \, dx = \int f(x) \, dx - \int g(x) \, dx$
1. $\int (x^2 + 2x - 1) \, dx = \int x^2 \, dx + \int 2x \, dx - \int dx$
$= $
$= $
$= $

3.3 Definite integrals
The area under a curve $f(x)$ between $a$ and $b$:
• Indefinite integral: $\int f(x) \, dx = F(x) + c$
• Definite integral: $\int_a^b f(x) \, dx = [F(x)]_a^b = F(b) - F(a)$

Property 1: The interchanging of the limits of integration changes the sign of the definite integral.
$\int_a^b f(x) \, dx = -\int_b^a f(x) \, dx$
$\int_1^5 x^2 \, dx = $
$\int_5^1 x^2 \, dx = \left. \frac{x^3}{3} \right|_5^1 = \frac{1^3}{3} - \frac{5^3}{3} = -\frac{124}{3}$

Property 2: A definite integral has a value of zero when the two limits are identical.
$\int_a^a f(x) \, dx = 0$
$\int_3^3 x^3 \, dx = $

Property 3:
$\int_a^b -f(x) \, dx = -\int_a^b f(x) \, dx$
$\int_1^3 -x^3 \, dx = \left. -\frac{x^4}{4} \right|_1^3 = \left(-\frac{3^4}{4}\right) - \left(-\frac{1^4}{4}\right) = (-20.25) - (-0.25) = -20$
$-\int_1^3 x^3 \, dx = -\left[ \left. \frac{x^4}{4} \right|_1^3 \right] = -\left[ \frac{3^4}{4} - \frac{1^4}{4} \right] = -20$

Property 4:
$\int_a^b k f(x) \, dx = k \int_a^b f(x) \, dx$
$\int_1^3 4x^3 \, dx = \left. 4\frac{x^4}{4} \right|_1^3 = \left. x^4 \right|_1^3 = 3^4 - 1^4 = 80$
$4 \int_1^3 x^3 \, dx = 4 \left[ \left. \frac{x^4}{4} \right|_1^3 \right] = 4 \left[ \frac{3^4}{4} - \frac{1^4}{4} \right] = 4 \times 20 = 80$

Property 5:
$\int_a^b [f(x) \pm g(x)] \, dx = \int_a^b f(x) \, dx \pm \int_a^b g(x) \, dx$
$\int_1^3 (x^3 + 1) \, dx = \int_1^3 x^3 \, dx + \int_1^3 dx$
$= \left. \frac{x^4}{4} \right|_1^3 + \left. x \right|_1^3$
$= \left( \frac{3^4}{4} - \frac{1^4}{4} \right) + (3 - 1)$
$= 20 + 2 = 22$

Property 6:
$\int_a^d f(x) \, dx = \int_a^b f(x) \, dx + \int_b^c f(x) \, dx + \int_c^d f(x) \, dx$ $(a < b < c < d)$
$\int_1^3 x^3 \, dx = \left. \frac{x^4}{4} \right|_1^3 = \frac{3^4}{4} - \frac{1^4}{4} = 20$
$\int_3^5 x^3 \, dx = \left. \frac{x^4}{4} \right|_3^5 = \frac{5^4}{4} - \frac{3^4}{4} = 136$
$\int_1^5 x^3 \, dx = \left. \frac{x^4}{4} \right|_1^5 = \frac{5^4}{4} - \frac{1^4}{4} = 156$
Therefore, $\int_1^5 f(x) \, dx = \int_1^3 f(x) \, dx + \int_3^5 f(x) \, dx$

Example
Calculate the area between the function $f(x) = x^3$, the x-axis, the line $x = -3$ and the line $x = 5$.
Incorrect method:
$\int_{-3}^5 x^3 \, dx = \left. \frac{x^4}{4} \right|_{-3}^5 = \frac{(5)^4}{4} - \frac{(-3)^4}{4} = 156.25 - 20.25 = 136$
Correct method: Hint: Use Property number 6
$\left| \int_{-3}^0 x^3 \, dx \right| + \int_0^5 x^3 \, dx = \left| \left. \frac{x^4}{4} \right|_{-3}^0 \right| + \left. \frac{x^4}{4} \right|_0^5 = |-20.25| + 156.25 = 176.5$
Note: We took the absolute value of the first term ($\int_{-3}^0 x^3 \, dx$), since this area is below the x-axis.

Example
Calculate the area of the region which is bounded by the function $f(x)$, the x-axis, the line $x = -3$ and the line $x = 4.5$. [figure]
Answer
$\left| \int_{-3}^{-2} f(x) \, dx \right| + \int_{-2}^{0.5} f(x) \, dx + \left| \int_{0.5}^{4} f(x) \, dx \right| + \int_{4}^{4.5} f(x) \, dx$
or
$-\int_{-3}^{-2} f(x) \, dx + \int_{-2}^{0.5} f(x) \, dx - \int_{0.5}^{4} f(x) \, dx + \int_{4}^{4.5} f(x) \, dx$

3.4 Some economic applications of integrals
Definite integrals
Example: The demand and supply of light bulbs (in 1000):
$D(p) = 16 - p^2$
$S(p) = 4p + p^2$
$p$ = price in Rand
Question: Calculate the consumers' surplus and producers' surplus when the market is in equilibrium.
Equilibrium price: $D(p) = S(p)$
$16 - p^2 = 4p + p^2$
Therefore, the equilibrium price is …..

Consumers' surplus:
Set $D(p) = 0$: $16 - p^2 = 0$
To obtain the consumers' surplus, we integrate over the demand function
$\int_2^4 D(p) \, dp$
where R2 is the equilibrium price and R4 is found by setting $D(p) = 0$.
$\int_2^4 D(p) \, dp = \int_2^4 (16 - p^2) \, dp = \left. \left[ 16p - \frac{p^3}{3} \right] \right|_2^4 = 13.\overline{3} \times 1\,000 = R\,13\,333.33$

Producers' surplus:
Set $S(p) = 0$: $4p + p^2 = 0$
To obtain the producers' surplus, we integrate over the supply function
$\int_0^2 S(p) \, dp$
where R0 is found by setting $S(p) = 0$ and R2 is the equilibrium price.
$\int_0^2 S(p) \, dp = \int_0^2 (4p + p^2) \, dp = \left. \left[ 2p^2 + \frac{p^3}{3} \right] \right|_0^2 = 10.\overline{6} \times 1\,000 = R\,10\,666.67$

More examples (work through these examples on your own)
Example 1:
Given:
• Marginal cost function (in R100) for the production of $q$ units: $C'(q) = q^2 - 2q + 10$
• Fixed cost is R2 500.
Question 1: Calculate the total cost function:
Answer 1:
$C(q) = \int C'(q) \, dq = \int (q^2 - 2q + 10) \, dq = \frac{1}{3}q^3 - q^2 + 10q + c$
But $C(0) = 25$. Therefore,
$C(0) = \frac{1}{3}(0)^3 - (0)^2 + 10(0) + c = 25$
$c = 25$
Therefore, the total cost function is given by:
$C(q) = \frac{1}{3}q^3 - q^2 + 10q + 25$
Question 2: Calculate the change in cost when production increases from 5 to 10 units.
Answer 2:
$\int_5^{10} C'(q) \, dq = \left. C(q) \right|_5^{10} = C(10) - C(5)$. From this it can be seen that we have to calculate the definite integral $\int_5^{10} C'(q) \, dq$.
$\int_5^{10} C'(q) \, dq = \int_5^{10} (q^2 - 2q + 10) \, dq = \left. \left( \frac{1}{3}q^3 - q^2 + 10q \right) \right|_5^{10} = 333.\overline{3} - 66.\overline{6} = 266.\overline{6} \times 100 = R\,26\,666.67$

Example 2:
Given: The marginal income from the sale of the $q$-th book is given by $I'(q) = 45 - 0.21\sqrt{q} - 0.01q$ with $0 \le q \le 1000$
Question: Calculate the additional income earned when sales increase from 400 to 900 books.
Answer:
$\int_{400}^{900} I'(q) \, dq = \int_{400}^{900} (45 - 0.21\sqrt{q} - 0.01q) \, dq$
$= \left[ 45q - 0.14q^{3/2} - 0.005q^2 \right]_{400}^{900}$
$= \left[ 45(900) - 0.14(900)^{3/2} - 0.005(900)^2 \right] - \left[ 45(400) - 0.14(400)^{3/2} - 0.005(400)^2 \right]$
$= 32\,670 - 16\,080 = 16\,590$.

Example 3:
Given: A sales representative sells motor polish. When $q$ bottles of polish are sold, the marginal income of the $q$-th bottle will be equal to $I'(q) = 34 - 0.06q - 0.0003q^2$ with $0 \le q \le 400$. Motor polish costs R10 per bottle and the sales representative must pay a once-off registration fee of R50.
Question 1: Calculate the total cost and total income functions when $q$ bottles of polish are sold.
Answer 1:
Income function:
$I(q) = \int I'(q) \, dq = \int (34 - 0.06q - 0.0003q^2) \, dq = 34q - 0.03q^2 - 0.0001q^3 + c$
But $I(0) = 0$ (when 0 bottles of polish are sold, the income will equal R0)
$I(0) = 34(0) - 0.03(0)^2 - 0.0001(0)^3 + c = 0$
$c = 0$
Therefore, the income function is given by $I(q) = 34q - 0.03q^2 - 0.0001q^3$
Cost function: $C(q) = 10q + 50$
Question 2: Calculate the value of $q$ which will maximise profit.
Answer 2:
The profit function is:
$P(q) = I(q) - C(q) = 34q - 0.03q^2 - 0.0001q^3 - (10q + 50) = -0.0001q^3 - 0.03q^2 + 24q - 50$
To obtain the critical value(s), set $P'(q) = 0$:
$P'(q) = -0.0003q^2 - 0.06q + 24 = 0$
$q = \frac{0.06 \pm \sqrt{(-0.06)^2 - 4(-0.0003)(24)}}{2(-0.0003)} = \frac{0.06 \pm \sqrt{0.0324}}{-0.0006}$
$q = 200$ and $q = -400$ ⇒ economically unacceptable.
Therefore, the critical value is $q = 200$.
$P''(q) = -0.0006q - 0.06$
$P''(200) = -0.0006(200) - 0.06 = -0.18 < 0$ ∴ $q = 200$ leads to a relative max.
Question 3: Calculate the maximum profit.
Answer 3:
$P(200) = -0.0001(200)^3 - 0.03(200)^2 + 24(200) - 50 = 2\,750$ in Rand.
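The arithmetic in Example 3 can be reproduced with a short script. This is a sketch only: it uses the income and cost functions derived above and the quadratic formula for $P'(q) = 0$; the helper names `income`, `cost` and `profit` are illustrative and not part of the notes.

```python
import math

# Sketch: re-computing Example 3 (motor polish) numerically.


def income(q):
    """Total income I(q) in Rand, from integrating the marginal income."""
    return 34 * q - 0.03 * q**2 - 0.0001 * q**3


def cost(q):
    """Total cost C(q) in Rand: R10 per bottle plus the R50 registration fee."""
    return 10 * q + 50


def profit(q):
    """Profit P(q) = I(q) - C(q)."""
    return income(q) - cost(q)


# P'(q) = a*q^2 + b*q + c with the coefficients from the worked answer.
a, b, c = -0.0003, -0.06, 24.0
discriminant = b**2 - 4 * a * c
roots = [(-b + sign * math.sqrt(discriminant)) / (2 * a) for sign in (1, -1)]

# Only roots inside the stated domain 0 <= q <= 400 make economic sense.
feasible = [q for q in roots if 0 <= q <= 400]
q_star = feasible[0]
second_derivative = 2 * a * q_star + b   # P''(q) = 2aq + b

print("roots of P'(q) = 0:", roots)
print("feasible critical value:", q_star)
print("P'' at critical value:", second_derivative)
print("maximum profit (Rand):", profit(q_star))
```

Filtering the roots by the domain mirrors the step in the notes where the negative root is discarded as economically unacceptable.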
3.5 Statistical applications of integrals
Calculating probabilities
Let $X$ be a continuous random variable with p.d.f. given by $f(x)$. From Section 1 we know that $f(x)$ is a valid probability density function when:
1. $f(x) \ge 0$ for all $x$ and
2. $\int_{-\infty}^{\infty} f(x) \, dx = 1$ (i.e. the area below the entire function equals exactly 1).
We can use integration to determine whether a function is a valid p.d.f. or not.
Example: Let
$f(x) = \begin{cases} \frac{x}{2}, & 0 \le x \le 1 \\ \frac{1}{2}, & 1 < x < 2 \\ -\frac{x}{2} + \frac{3}{2}, & 2 \le x \le 3 \\ 0, & \text{elsewhere} \end{cases}$
From the definition of the function it can be seen that $f(x) \ge 0$ for all $x$ (verify that this is true). It follows that
$\int_{-\infty}^{\infty} f(x) \, dx = \int_0^3 f(x) \, dx$ (since the function is 0 when $x$ is not in $[0, 3]$)
$= \int_0^1 \frac{x}{2} \, dx + \int_1^2 \frac{1}{2} \, dx + \int_2^3 \left( -\frac{x}{2} + \frac{3}{2} \right) dx$ (Property 6)

Question:
$f(x) = \begin{cases} x^2, & 0 \le x \le 1 \\ \frac{-4(x - 2)}{3}, & 1 < x \le 2 \\ 0, & \text{elsewhere} \end{cases}$
• Draw a graph of $f$
• Show that $f$ is a valid p.d.f.
• Calculate $P(X > 0.5)$
Hint: $P(a < X < b) = P(a \le X \le b) = \int_a^b f(x) \, dx$ for $a < b$.

Challenging Question: Find $C$ such that
$f(x) = \begin{cases} Cx^2 + C, & -2 \le x \le 2 \\ 0, & \text{elsewhere} \end{cases}$
is a valid p.d.f.

Expected values
If $X$ is a continuous random variable from a distribution with a p.d.f. given by $f(x)$ then $E[X] = \int_{-\infty}^{\infty} x f(x) \, dx$.
Example: Let $X$ be a continuous random variable with p.d.f. given by
$h(x) = \begin{cases} 0.375x^2, & 0 \le x \le 2 \\ 0, & \text{elsewhere} \end{cases}$
Find $E[X]$:
$E[X] = \int_{-\infty}^{\infty} x h(x) \, dx$
$= \int_0^2 x(0.375x^2) \, dx$
$= \int_0^2 0.375x^3 \, dx$
$= \left. \frac{0.375}{4} x^4 \right|_0^2$
$= \frac{0.375(2^4)}{4} - 0$
$= \frac{3}{2} = 1.5$
We can calculate the expected value of a function of the continuous random variable $X$ too, say $k(X)$. The expected value of $k(X)$ is given by:
$E[k(X)] = \int_{-\infty}^{\infty} k(x) f(x) \, dx$
Example: Let $X$ have the same p.d.f. as in the previous example. To calculate $Var(X)$ we make use of the fact that $Var(X) = E(X^2) - [E(X)]^2$.
It follows that:
$E(X^2) = \int_0^2 x^2 (0.375x^2) \, dx = \left. \frac{0.375x^5}{5} \right|_0^2 = 2.4$
and
$Var(X) = E(X^2) - [E(X)]^2 = 2.4 - 1.5^2 = 0.15$.

Theoretical Example: In WST 133 we showed that when $X$ is a discrete random variable with probability function $f(x)$ and we let $Y = aX \pm b$, then $E[Y] = aE[X] \pm b$ and $Var(Y) = a^2 Var(X)$ (where $a$ and $b$ are constants). This result is also true for continuous random variables.
Proof: Let $X$ be a continuous random variable with p.d.f. $f(x)$. When $Y = aX \pm b$ it follows that
$E[Y] = \int_{-\infty}^{\infty} (ax \pm b) f(x) \, dx$
$= \int_{-\infty}^{\infty} ax f(x) \, dx \pm \int_{-\infty}^{\infty} b f(x) \, dx$ (Property 5)
$= a \int_{-\infty}^{\infty} x f(x) \, dx \pm b \int_{-\infty}^{\infty} f(x) \, dx$ (Property 4)
$= aE[X] \pm b(1)$ (Definition of an expected value, and the entire area of a p.d.f. is 1)
To show that $Var(Y) = a^2 Var(X)$ use the fact that $Var(Y) = E[(Y - E[Y])^2]$.

Exercise: Let $X$ follow a continuous uniform distribution with parameters $a$ and $b$, with $a < b$. Prove that:
1. $E[X] = \frac{a + b}{2}$ and
2. $Var[X] = \frac{(b - a)^2}{12}$.
Hint: Recall that $Var(X) = E(X^2) - [E(X)]^2$.

Moment Generating Functions
Definition: Suppose that $X$ is a discrete random variable with probability mass function given by $p_X(x) = P(X = x)$. The Moment Generating Function (MGF) of $X$ is defined as
$M_X(t) = E[e^{tX}] = \sum_{\forall x} e^{tx} p_X(x)$.
Provided that the $r$-th derivative of $M_X(t)$ exists at the point $t = 0$, it follows that
$M_X^{(r)}(0) = E[X^r]$

The discrete uniform case:
Example 1
Let the random variable $X$ be the outcome of rolling a 6-sided die. Find the MGF for $X$.
Solution: The mass function for $X$ is given by
$f(x) = \begin{cases} \frac{1}{6}, & x = 1, 2, 3, 4, 5, 6 \\ 0, & \text{elsewhere} \end{cases}$
It follows that:
$M_X(t) = E[e^{tX}]$
$= \sum_{x=1}^{6} e^{tx} \left( \frac{1}{6} \right)$
$= \frac{1}{6}(e^t + e^{2t} + \cdots + e^{6t})$

Example 2
Let the random variable $X$ be the outcome of rolling a 6-sided die. Find $E(X)$ and $var(X)$ using the MGF of $X$.
Solution: We know that
$E(X) = M_X'(0)$, $E(X^2) = M_X''(0)$ and $var(X) = E(X^2) - (E(X))^2$
It follows that
$M_X'(t) = \frac{d}{dt} M_X(t)$
$= \frac{d}{dt} \left( \frac{1}{6}(e^t + e^{2t} + e^{3t} + e^{4t} + e^{5t} + e^{6t}) \right)$
$= \frac{1}{6}(e^t + 2e^{2t} + 3e^{3t} + 4e^{4t} + 5e^{5t} + 6e^{6t})$
In addition,
$M_X''(t) = \frac{d}{dt} M_X'(t)$
$= \frac{d}{dt} \left( \frac{1}{6}(e^t + 2e^{2t} + 3e^{3t} + 4e^{4t} + 5e^{5t} + 6e^{6t}) \right)$
$= \frac{1}{6}(e^t + 2^2 e^{2t} + 3^2 e^{3t} + 4^2 e^{4t} + 5^2 e^{5t} + 6^2 e^{6t})$
Therefore,
$E(X) = M_X'(0)$
$= \frac{1}{6}(e^0 + 2e^{2(0)} + 3e^{3(0)} + 4e^{4(0)} + 5e^{5(0)} + 6e^{6(0)})$
$= \frac{1}{6}(1 + 2 + 3 + 4 + 5 + 6)$
$= 3.5$
and
$E(X^2) = M_X''(0)$
$= \frac{1}{6}(e^0 + 2^2 e^{2(0)} + 3^2 e^{3(0)} + 4^2 e^{4(0)} + 5^2 e^{5(0)} + 6^2 e^{6(0)})$
$= \frac{1}{6}(1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2)$
$= \frac{91}{6}$
which leads to
$var(X) = E(X^2) - (E(X))^2$
$= \frac{91}{6} - \left( \frac{21}{6} \right)^2$
$= \frac{105}{36}$
$\approx 2.91666667$

The Binomial case:
In order to derive the expected value and variance of $X$, where $X \sim Bin(n, p)$, we require the help of the binomial theorem.
Definition: Binomial theorem
For any positive integer $n$, it follows that
$(x + y)^n = \sum_{t=0}^{n} \binom{n}{t} x^t y^{n-t}$
The MGF of $X$ can be derived as follows.
$M_X(t) = E[e^{tX}]$
$= \sum_{x=0}^{n} e^{tx} p(x)$
$= \sum_{x=0}^{n} e^{tx} \binom{n}{x} p^x (1 - p)^{n-x}$
$= \sum_{x=0}^{n} \binom{n}{x} (pe^t)^x (1 - p)^{n-x}$
$= [pe^t + (1 - p)]^n$ ⋯ using the binomial theorem
It follows that
$M_X'(t) = \frac{d}{dt} M_X(t)$
$= \frac{d}{dt} [pe^t + (1 - p)]^n$
$= n(pe^t + (1 - p))^{n-1} \times pe^t$ ⋯ using the chain rule
Therefore,
$E(X) = M_X'(0)$
$= n(pe^0 + (1 - p))^{n-1} \times pe^0$
$= n(p + (1 - p))^{n-1} \times p$
$= n(1)^{n-1} \times p$
$= np$, since $1^k = 1 \ \forall k$
Exercise: Use the MGF of $X$ to prove that $var(X) = np(1 - p)$.
Hints:
1. Find $M_X''(t)$ using both the product rule and the chain rule.
2. Find the value of $E(X^2) = M_X''(0)$.
3. Recall that $var(X) = E(X^2) - (E(X))^2$.

Definition: Suppose that $X$ is a continuous random variable with probability density function given by $f_X(x)$.
The Moment Generating Function (MGF) of $X$ is defined as
$M_X(t) = E[e^{tX}] = \int_{-\infty}^{\infty} e^{tx} f_X(x) \, dx$
Provided that the $r$-th derivative of $M_X(t)$ exists at the point $t = 0$, it follows that
$M_X^{(r)}(0) = E[X^r]$

The Continuous Uniform case:
Let $X \sim Unif(a, b)$. Then the MGF of $X$ can be derived as follows.
$M_X(t) = E[e^{tX}]$
$= \int_{-\infty}^{\infty} e^{tx} f_X(x) \, dx$
$= \int_a^b e^{tx} \frac{1}{b - a} \, dx$
$= \frac{1}{b - a} \int_a^b e^{tx} \, dx$
$= \frac{1}{b - a} \left[ \frac{1}{t} e^{tx} \right]_a^b$
$= \frac{1}{b - a} \left( \frac{1}{t} e^{tb} - \frac{1}{t} e^{ta} \right)$
$= \frac{e^{bt} - e^{at}}{t(b - a)}$

Solutions to Self Evaluation Questions
Chapter 6
1. Correct Option: a.
$f(x) = \begin{cases} \frac{1}{40 - 25}, & \text{where } 25 \le x \le 40 \\ 0, & \text{elsewhere} \end{cases} = \begin{cases} \frac{1}{15}, & \text{where } 25 \le x \le 40 \\ 0, & \text{elsewhere} \end{cases}$
2. $P(25 < X < 36.25) = \Delta x \cdot f(x) = (36.25 - 25)\left(\frac{1}{15}\right) = 0.75$
How did we know $P_{75} = 36.25$? By taking $(x - 25)\left(\frac{1}{15}\right) = 0.75$. Therefore, $x - 25 = (0.75)(15)$ and consequently, $x = 11.25 + 25 = 36.25$.
3. $Var(X) = \frac{(b - a)^2}{12} = \frac{(40 - 25)^2}{12} = 18.75$
4. $P(X > 22) = \Delta x \cdot f(x) = (40 - 25)\left(\frac{1}{15}\right) = 1$.
5. $P(27 < X < 36) = \Delta x \cdot f(x) = (36 - 27)\left(\frac{1}{15}\right) = 0.6$.
6. The 5th percentile of the standard normal distribution. Answer = b.
7. A TYPICAL MISTAKE THAT STUDENTS MAKE
Most of you will probably want to use the formula that was given in Chapter 3: $i = \left(\frac{p}{100}\right) n$. This is wrong! You can only use that formula when the original raw data set is given, because the index $i$ indicates which position in the ordered original data set you need to go to. Since we did not give you the original data set, this is a dead end.
The correct answer:
Given: [figure]
Due to symmetry we have the following graph: [figure]
The value of NORM.S.INV(0.1) in Excel is -1.282. Therefore, [figure]
And: The 90th percentile means that 90% of the values are to the left of that point. Therefore,
∴ $z = \frac{x - \mu}{\sigma}$, ∴ $1.282 = \frac{x - 100}{15}$, ∴ $x = (1.282)(15) + 100 = 119.23$.
Therefore, the 90th percentile is equal to 119.23 ($P_{90} = 119.23$).

Chapter 7
1. $z = \frac{60 - 50}{12} = 0.8\overline{3}$ and $z = \frac{40 - 50}{12} = -0.8\overline{3}$.
$P(40 < X < 60) = P(-0.8\overline{3} < Z < 0.8\overline{3}) = P(Z < 0.8\overline{3}) - P(Z < -0.8\overline{3}) = 0.7967 - 0.2033 = 0.5934$.
2. $z = \frac{x - \mu}{\sigma}$ ∴ $1.04 = \frac{x - 50}{12}$ ∴ $x = (1.04)(12) + 50 = 62.48$.
3. $z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
∴ $-0.67 = \frac{\bar{x} - 50}{12 / \sqrt{25}}$
∴ $\bar{x} = (-0.67)\left(\frac{12}{\sqrt{25}}\right) + 50 = 48.39$
4. $f(x) = \begin{cases} \frac{1}{50 - 10} = \frac{1}{40}, & \text{for } 10 \le x \le 50 \\ 0, & \text{elsewhere} \end{cases}$
$P(20 < X < 60) = \Delta x \cdot f(x) = (60 - 50)(0) + (50 - 20)\left(\frac{1}{40}\right) = 0.75$.
5. $Var(X) = \sigma^2 = \frac{(b - a)^2}{12} = \frac{(50 - 10)^2}{12} = 133.\overline{3}$. Therefore, $stdev(X) = \sigma = \sqrt{133.\overline{3}} = 11.547$. And so $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{11.547}{\sqrt{40}} = 1.83$.
6. $P(\bar{X} > 28) = P\left(Z > \frac{28 - 30}{1.83}\right) = P(Z > -1.09) = 1 - P(Z < -1.09) = 1 - 0.1379 = 0.8621$
7. $\sigma_{\bar{p}}^2 = \frac{p(1 - p)}{n} = \frac{(0.2)(0.8)}{32} = 0.005$
8. $\sigma_{\bar{p}} = \sqrt{\frac{p(1 - p)}{n}} = 0.0707$ and
$P(|\bar{p} - p| < 0.05) = P(-0.05 < \bar{p} - p < 0.05) = P\left(\frac{-0.05}{0.0707} < \frac{\bar{p} - p}{\sigma_{\bar{p}}} < \frac{0.05}{0.0707}\right) = P(-0.71 < Z < 0.71) = 0.7611 - 0.2389 = 0.5222$
9. $z = \frac{\bar{p} - p}{\sigma_{\bar{p}}}$
$1.28 = \frac{\bar{p} - 0.2}{0.0707}$
$\bar{p} = 1.28(0.0707) + 0.2 = 0.29$
10. $P(17 < X < 26) = P(X \le 25) - P(X \le 17) = 0.9828 - 0.4215 = 0.5613$
Note: Both $P(X \le 25)$ and $P(X \le 17)$ are obtained using the Excel sheets.
11. $P(X > 18) = 1 - P(X \le 18) = 1 - 0.5689 = 0.4311$
Note: $P(X \le 18)$ is obtained using the Excel sheets.
12. Let $X$ be the number of customers who don't prefer name brand clothing. $E(X) = n(1 - p) = (30)(0.4) = 12$
13. $|\bar{p} - p| = |0.73 - 0.6| = 0.13$ where $\bar{p} = \frac{22}{30} = 0.7\overline{3}$.
14. The sampling distribution of $\bar{p}$ can be approximated by a normal probability distribution whenever $np \ge 5$ and $n(1 - p) \ge 5$.
15. It is given that $\sigma_{\bar{p}} = 0.0894$.
$z = \frac{\bar{p} - p}{\sigma_{\bar{p}}}$
$0.25 = \frac{\bar{p} - 0.6}{0.0894}$
$\bar{p} = (0.25)(0.0894) + 0.6 = 0.62235$
Therefore, $P_{60} = 0.62235$.

Chapter 8
1. $\bar{p} - z_{\alpha/2}\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} = 0.2 - (1.96)\sqrt{\frac{(0.2)(0.8)}{50}} = 0.089$ with $\bar{p} = \frac{10}{50} = 0.2$.
2. If the confidence coefficient of a confidence interval decreases from 0.95 to 0.90, the $z_{\alpha/2}$ value decreases from 1.96 to 1.645 and, consequently, the interval is narrower. Answer = b.
3. $\bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 7.5 + (2.576)\frac{1.05}{\sqrt{30}} = 7.99$
4. $\bar{x} = \frac{\sum x_i}{n} = \frac{1\,517.39}{40} = 37.93$
5.
In cell D3 of Excel the =COUNTIF(B2:B41, B3) function counts the number of payments made using a credit card, i.e. 30 out of 40 payments were made using a credit card. Therefore, 40 – 30 = 10 payments were made using cash.
$\bar{p} = \frac{10}{40} = 0.25$
6. $t$-distribution
7. To obtain the value of $t_{\alpha/2}$ we use the T.INV.2T function of Excel: =T.INV.2T($\alpha$, $df$) = T.INV.2T(0.05, 39) = 2.023 (this is given in cell D6 of Excel) with $df = n - 1 = 40 - 1 = 39$. To obtain the sample standard deviation, we take the square root of the variance (this is given in cell D4 of Excel). The margin of error is equal to
$t_{\alpha/2}\frac{s}{\sqrt{n}} = (2.023)\left(\frac{7}{\sqrt{40}}\right) = 2.2391$.
8. $\bar{p} - z_{\alpha/2}\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} = 0.25 - (2.576)\sqrt{\frac{(0.25)(0.75)}{40}} = 0.0736$
9. $z_{\alpha/2}\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} = (1.96)\sqrt{\frac{(0.25)(0.75)}{40}} = 0.1342$
10. The margin of error decreases, which implies a narrower interval.

Chapter 9
1. $H_0: \mu \ge 20$
$H_a: \mu < 20$
2. $\bar{x} = \frac{\sum x_i}{n} = \frac{216}{12} = 18$
3. $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = 2.80$
4. $df = n - 1 = 12 - 1 = 11$ and $t = -2.472$. On the t-table go to the correct degrees of freedom = 11. We look for the absolute value of the test statistic = 2.472. We find that this is between 2.201 and 2.718 and, consequently, the p-value is between 0.01 and 0.025.
5. $df = n - 1 = 12 - 1 = 11$ and $\alpha = 0.01$, therefore, $-t_{\alpha} = -2.718$. The null hypothesis is not rejected, since $t$ (= -2.472) > -2.718. Therefore, the average baggage weight is not significantly less than 20 kg. Answer = b.
6. Type I error
7. $H_0: \mu \ge 100$
$H_a: \mu < 100$
8. $z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{90 - 100}{25 / \sqrt{50}} = -2.83$
9. $H_0$ can be rejected at a 0.5% level of significance, since p-value (0.0023) < $\alpha$ (0.005). Answer = e.
10. $\sigma_{\bar{p}} = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{(0.4)(0.6)}{80}} = 0.0548$
11. The area to the left of $z = -2.74$ is 0.0031. Therefore, p-value = (2)(0.0031) = 0.0062.
12. $\alpha = 0.01$, ∴ $\alpha/2 = 0.01/2 = 0.005$. Therefore, $z_{\alpha/2} = 2.576$. Reject $H_0$ if $z < -2.576$ or $z > 2.576$. Since $z (= -2.74) < -2.576$ the null hypothesis is rejected at a 1% level of significance. Therefore, $p \neq 0.4$.
Answer = e.
13. The area to the left of $z = -2.74$ is 0.0031. Therefore, p-value = 0.0031.

Chapter 10
1. Correct option: a
2. $H_0: \mu_1 \le \mu_2$
$H_a: \mu_1 > \mu_2$
3. $t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{9 - 7}{\sqrt{\frac{5.2}{6} + \frac{3.2}{6}}} = 1.69$
4. $df = 9$ and $\alpha = 0.05$. Therefore, $t_{\alpha} = 1.833$.
5. $H_0: \mu_1 = \mu_2$, $H_a: \mu_1 \neq \mu_2$
or
$H_0: \mu_1 - \mu_2 = 0$, $H_a: \mu_1 - \mu_2 \neq 0$
6. $t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{9.25 - 6.6}{\sqrt{\frac{2.87^2}{4} + \frac{1.95^2}{5}}} = 1.58$
7. On the t-table, go to the correct $df = 5$. We find that the test statistic $t = 1.58$ lies between 1.476 and 2.015. Consequently, the area in the upper tail is between 0.05 and 0.1. Since we are working with a two-tailed test, we need to multiply the area in the upper tail by 2 and we obtain
0.05 < area in the upper tail < 0.1
0.1 < p-value < 0.2
8. $\alpha = 0.1$, ∴ $\alpha/2 = 0.1/2 = 0.05$ and $df = 5$. Therefore, $t_{\alpha/2} = 2.015$. The null hypothesis is rejected if $t \le -2.015$ or $t \ge 2.015$.
9. The null hypothesis is not rejected at a 10% level of significance, since the test statistic, $t = 1.58$, is not smaller than -2.015 or greater than 2.015. Therefore, $\mu_1 = \mu_2$. Answer = b.

Revision Exercise – Chapter 5
According to a BusinessWeek/Harris poll of 1035 adults, 40% of those surveyed agreed strongly with the proposition that business has too much power over American life (BusinessWeek, Sept 11, 2000). Assume this percentage is representative of the American population. Consider a sample of 20 individuals taken from a cross-section of the American population.
(a) What is the probability that at least five of these individuals will feel that business has too much power over American life?
Use Excel to answer the following questions:
(b) Calculate the cumulative probability distribution of $X$ for $0 \le X \le 20$.
(c) What is the probability that exactly five of these individuals will feel that business has too much power over American life?
(d) What is the probability that at least five of these individuals will feel that business has too much power over American life? Compare your answer with (a).
(e) What is the probability that at most two of these individuals will feel that business has too much power over American life?
(f) What is the probability that more than one of these individuals will feel that business has too much power over American life?
(g) What is the probability that at least fourteen of these individuals will feel that business has too much power over American life?
(h) What is the probability that less than ten of these individuals will feel that business has too much power over American life?
(i) What is the probability that less than two of these individuals will feel that business has no power over American life?
(j) What is the expected number of individuals that will feel that business has too much power over American life?
(k) What are the variance and standard deviation of individuals that will feel that business has too much power over American life?

Revision Exercise – Chapter 5 – Solution
(a) $X$ = the number of Americans who believe that business has too much power over American life.
$P(X \ge 5) = 1 - P(X < 5)$
$= 1 - f(0) - f(1) - f(2) - f(3) - f(4)$
$= 1 - .0000 - .0005 - .0031 - .0123 - .0350$
$= .9491$
Formula worksheet: [Excel screenshot]
Value worksheet: [Excel screenshot]

Additional Exercises
Chapter 6
Question 1
Which one of the following is a valid discrete probability distribution?
(A)  x      -2     -1     0     1     2
     f(x)   0.2    0.1    0     0.1   0.2
(B)  x      1      2      3     4     5
     f(x)   0.1    0.2    0.3   0.4   0.5
(C)  x      2      4      6     8     10
     f(x)   -0.2   -0.1   0     0.1   0.2
(D)  x      1      2      3     4     5
     f(x)   -0.2   -0.1   1     0.1   0.2
(E)  x      -2     -1     0     1     2
     f(x)   0.4    0.1    0     0.1   0.4

Questions 2 and 3 are based on the following information:
The time (in hours) that it takes a bus to travel from Johannesburg to Bloemfontein has the following uniform density function:
$f(x) = \begin{cases} \frac{1}{2}, & \text{for } 4 \le x \le 6 \\ 0, & \text{elsewhere} \end{cases}$

Question 2
The probability that the bus takes longer than 5.5 hours to travel from Johannesburg to Bloemfontein is:
(A) 0.25   (B) 0.45   (C) 0.50   (D) 0.55   (E) 0.75

Question 3
The probability that the bus takes 5 hours to travel from Johannesburg to Bloemfontein is:
(A) 0.1   (B) 0   (C) 0.2   (D) 0.25   (E) 0.5

Questions 4 and 5 are based on the following information:
The probability distribution of the number of home loans that are approved weekly by the local branch office of a bank is represented in the following Excel spreadsheet:
Excel: Formula sheet [Excel screenshot]
Excel: Value sheet [Excel screenshot]

Question 4
The variance for the distribution of the number of home loans that are approved weekly is:
(A) 2.60   (B) 4.67   (C) 9.00   (D) 11.60   (E) 0

Question 5
The probability that less than 3 home loans are approved per week is:
(A) 0.20   (B) 0.25   (C) 0.35   (D) 0.45   (E) 0.55

Questions 6 and 7 are based on the following information:
The random variable $Z$ is normally distributed with average 0 and standard deviation 1.
Question 6
$P(-1.62 < Z < -0.5) = $
(A) 0.2559   (B) 0.3612   (C) 0.4474   (D) 0.6388   (E) 0.7441

Question 7
If the area to the right of $z$ is equal to 0.95, then $z$ is equal to:
(A) -1.960   (B) -1.645   (C) -0.8289   (D) 0.8289   (E) 1.645

Question 8
Consider the following probabilities of a binomial distribution:
Excel: Formula sheet [Excel screenshot]
Excel: Value sheet [Excel screenshot]
$P(X > 10) = $
(A) 0.8829   (B) 0.8403   (C) 0.8725   (D) 0.7553   (E) 0.9290

Memo
Q1 - E   Q2 - A   Q3 - B   Q4 - A   Q5 - C   Q6 - A   Q7 - B   Q8 - D

Chapter 7
Questions 1 to 4 are based on the following information:
Suppose 60% of the students at university XYZ own a cell phone. For a random sample of 200 students from this population, it was found that 130 students owned a cell phone. Let $\bar{p}$ denote the point estimator of the proportion of students owning a cell phone.

Question 1
The point estimate for the proportion of students who own a cell phone is:
(A) 0.13   (B) 0.6   (C) 0.65   (D) 120   (E) 130

Question 2
The sampling error of $\bar{p}$ is:
(A) 0.035   (B) 0.050   (C) 0.240   (D) 0.480   (E) 6.928

Question 3
The standard deviation of $\bar{p}$ is:
(A) 0.035   (B) 0.050   (C) 0.240   (D) 0.480   (E) 6.928

Question 4
The sampling distribution of $\bar{p}$ can be approximated by a:
(A) binomial distribution whenever $n \ge 30$
(B) binomial distribution whenever $np \ge 5$ and $n(1 - p) \ge 5$
(C) binomial distribution whenever $np \ge 5$ and $n \ge 30$
(D) normal distribution whenever $n \ge 30$
(E) normal distribution whenever $np \ge 5$ and $n(1 - p) \ge 5$

Memorandum:
1. Option C: $\bar{p} = \frac{130}{200} = 0.65$
2. Option B: $\bar{p} - p = 0.65 - 0.6 = 0.05$
3. Option A: $\sigma_{\bar{p}} = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{0.6(0.4)}{200}} = 0.035$
4. Option E (theory on page 301 in textbook)

Chapter 8
Question 1
The fuel consumption (in l/100km) of 10 motors that conducted a 500 km test is as follows:
8.93  7.75  7.90  8.20  8.41  8.50  8.05  7.93  8.60  8.33
Given: $\bar{x} = 8.26$, $s = 0.3645$
Assume: The fuel consumption is normally distributed.
The upper limit of the 99% confidence interval of the population mean $\mu$ is:
(A) 8.528   (B) 8.557   (C) 8.587   (D) 8.626   (E) 8.635

Memorandum:
Question 1
$\bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}} = 8.26 + 3.250\,\frac{0.3645}{\sqrt{10}} = 8.6346$
Answer: (E)
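The memorandum calculation can be reproduced from the raw fuel-consumption data. This is a minimal sketch using only Python's standard library; the critical value 3.250 is taken from the memorandum (9 degrees of freedom) rather than computed, and the variable names are illustrative.

```python
import math
import statistics

# Sketch: 99% confidence interval for the mean fuel consumption (l/100km).
consumption = [8.93, 7.75, 7.90, 8.20, 8.41, 8.50, 8.05, 7.93, 8.60, 8.33]

n = len(consumption)
x_bar = statistics.mean(consumption)   # sample mean
s = statistics.stdev(consumption)      # sample standard deviation (n - 1 divisor)

# Assumption: t critical value copied from the memorandum, df = n - 1 = 9.
t_crit = 3.250

margin = t_crit * s / math.sqrt(n)
lower_limit = x_bar - margin
upper_limit = x_bar + margin

print(f"mean = {x_bar:.2f}, s = {s:.4f}")
print(f"99% CI: ({lower_limit:.4f}, {upper_limit:.4f})")
```

Computing the mean and standard deviation from the data, instead of typing in the given values, also confirms the "Given" line of the question.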