Statistics 115
BASIC STATISTICAL METHODS

Course Objectives
The major objective of this course is to introduce the basic principles of Inferential Statistics. To this end, selected elementary tools in Inferential Statistics will be presented. By the end of the semester, the student of this course must be able to:
- Commit to memory the basic terms, concepts, and notations used in Applied Statistics, especially those that are useful in Inferential Statistics;
- Solve simple probability problems using the a priori and a posteriori approaches to assign probabilities;
- Use the normal, t, χ², and F tables to compute probabilities;
- Understand the concept of sampling distributions and know the sampling distributions of common statistics computed from a random sample from a normal distribution;
- Comprehend the basic concepts used in estimating population parameters and testing statistical hypotheses;
- Compute the interval estimates of the usual parameters that describe a single population or are used to compare two or more populations;
- Conduct tests of hypotheses on these parameters and interpret the results of these tests;
- Be acquainted with simple tools used to study the relationship between two variables;
- Perform the chi-squared test for independence;
- Estimate the regression coefficients of the simple linear regression model using the method of least squares;
- Be aware of the basic principles of experimental design;
- Know the description and principal advantages and disadvantages of some basic experimental designs;
- Know the classical smoothing methods in time series analysis; and,
- Interpret particular computer outputs of Microsoft Excel.

Textbook
Elementary Statistics by Almeda, Capistrano, and Sarte

Assignment
1. Fill out the attendance card.
Follow the format below:
Second Semester, AY 2013-2014
Stat 115 (Section)
Name: LAST NAME, FIRST NAME
Student No: _________________
Grades: Math 17 ________ Stat 114 ________
1.5x1.5 to 2x2 Photo
Nickname (in ALL CAPS)
2. Read the course syllabus posted at http://www.stat.upd.edu.ph/fcapistrano.htm. The site also contains presentation materials and sample exams.
3. Make sure you belong to a group with 3 to 4 members.

Chapter 1 PRELIMINARIES

Population vs Sample
Population data = {X_1, X_2, …, X_N}
Parameter: a summary measure describing a particular characteristic of the population that is computed using population data
Examples:
population mean: $\mu = \frac{\sum_{i=1}^{N} X_i}{N}$
population variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$

Sample data = {X_1, X_2, …, X_n}
Statistic: a summary measure describing a particular characteristic of the sample that is computed using sample data
Examples:
sample mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
sample variance: $S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$

Notes About the Parameter and Statistic
- Both the parameter and the statistic are summary measures that are computed using data. If you have population data, then the computed summary measure is a parameter. If you only have sample data, then the computed summary measure is a statistic.
- In a statistical inquiry, the answer to the research problem is based on the value of the parameter that describes the characteristic of interest of the population under study. However, the value of this parameter can only be computed using population data. If you only have sample data, you cannot compute the value of the parameter.
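The difference between the two sets of formulas can be illustrated with a short Python sketch. The data values and function names below are hypothetical and chosen only for illustration. The one substantive difference is the divisor: N for the population variance and n - 1 for the sample variance.

```python
# Illustrative sketch: the data values below are made up, not taken from the course.

def mean(values):
    """Arithmetic mean; serves as mu for population data or X-bar for sample data."""
    return sum(values) / len(values)

def population_variance(population):
    """sigma^2: sum of squared deviations from mu, divided by N."""
    mu = mean(population)
    return sum((x - mu) ** 2 for x in population) / len(population)

def sample_variance(sample):
    """S^2: sum of squared deviations from X-bar, divided by n - 1."""
    x_bar = mean(sample)
    return sum((x - x_bar) ** 2 for x in sample) / (len(sample) - 1)

population = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical population data, N = 8
sample = [4, 5, 9]                     # hypothetical sample data, n = 3

print("parameter mu      =", mean(population))
print("parameter sigma^2 =", population_variance(population))
print("statistic X-bar   =", mean(sample))
print("statistic S^2     =", sample_variance(sample))
```

Here the statistics computed from the sample differ from the parameters computed from the full population, which is why inference about the parameter carries some error.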
Descriptive Statistics vs Inferential Statistics
Descriptive Statistics comprises those methods concerned with collecting, describing, and analyzing a set of data without drawing conclusions or inferences about a larger group.
Inferential Statistics comprises those methods concerned with the analysis of sample data leading to predictions or inferences about the population.

Notes About Inferential Statistics
- Although we cannot compute the value of the parameter using sample data, we can use the methods of Inferential Statistics to make inferences about the value of this parameter.
- In Inferential Statistics, we compute the value of the statistic using sample data not for the purpose of describing the sample but so that we can make inferences about the value of the parameter of interest.
- It should be clear that we base our inferences on partial information about the population. Thus, whatever inferences we make will always be subject to some error. A background on probability theory and distribution theory will help us understand the errors that we commit in Inferential Statistics.

Random Experiment (page 284)
Definition 10.4 A random experiment is a process that can be repeated under similar conditions whose outcome cannot be predicted with certainty beforehand.
Examples: tossing a coin, tossing a die, drawing cards from a standard deck of cards, selecting a sample from the population using probability sampling methods

Sample Space (pages 285-286)
Definition 10.5 The sample space, denoted by Ω (Greek letter omega), is the collection of all possible outcomes of a random experiment. An element of the sample space is called a sample point.
Examples: Examples 10.1 and 10.2

Example (pages 86-87)
Recall: Simple random sampling is a probability sampling method wherein all possible subsets consisting of n elements selected from the N elements of the population have the same chances of selection.
In simple random sampling without replacement (SRSWOR), all the n elements in the sample must be distinct from each other.
In simple random sampling with replacement (SRSWR), the n elements in the sample need not be distinct, that is, an element can be selected more than once to be a part of the sample.

Example cont'd
Example 3.8: a) Suppose the population consists of N=5 children: a=Janine, b=Josiel, c=Jan, d=Eryl, and e=Eariel. Suppose a sample of size n=2 will be selected using SRSWOR. Specify the sample space.
We will denote a sample of size 2 by the set {x_1, x_2}, where x_1 and x_2 are the two distinct elements included in the sample.
Ω = {{a,b}, {a,c}, {a,d}, {a,e}, {b,c}, {b,d}, {b,e}, {c,d}, {c,e}, {d,e}}
By definition of SRSWOR, all the 10 sample points (samples) will be given equal chances of selection.
In general, a sample of size n selected using SRSWOR will be denoted by a set containing n distinct elements, {x_1, x_2, …, x_n}, where the x_i's are the elements selected in the sample. When the sample of size n is selected from a population of size N using SRSWOR, the sample space will contain
$$\frac{N(N-1)(N-2)\cdots(N-n+1)}{n(n-1)(n-2)\cdots(2)(1)}$$
sets containing n elements, and by definition, all of them will be given equal chances of selection.

Example cont'd
Example 3.8: b) Suppose the population consists of N=5 children: a=Janine, b=Josiel, c=Jan, d=Eryl, and e=Eariel. Suppose a sample of size n=2 will be selected using SRSWR. Specify the sample space.
We will denote a sample of size 2 by an ordered pair, (x_1, x_2), where x_1 is the element selected on the first draw while x_2 is the element selected on the second draw.
Ω = {(a,a), (a,b), (a,c), (a,d), (a,e),
     (b,a), (b,b), (b,c), (b,d), (b,e),
     (c,a), (c,b), (c,c), (c,d), (c,e),
     (d,a), (d,b), (d,c), (d,d), (d,e),
     (e,a), (e,b), (e,c), (e,d), (e,e)}
By definition of SRSWR, all the 25 sample points (samples) will be given equal chances of selection.
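The two sample spaces in Example 3.8 can also be listed mechanically. The following Python sketch is an illustration added here, not part of the textbook example. It uses the standard library's itertools to enumerate unordered subsets for SRSWOR and ordered pairs for SRSWR; the variable names are hypothetical.

```python
from itertools import combinations, product
from math import comb

population = ["a", "b", "c", "d", "e"]  # N = 5 children, labeled as in Example 3.8
n = 2                                   # sample size

# SRSWOR: each sample point is a set of n distinct elements (order ignored).
srswor_space = [set(s) for s in combinations(population, n)]

# SRSWR: each sample point is an ordered n-tuple; repeats are allowed.
srswr_space = list(product(population, repeat=n))

print("SRSWOR sample points:", len(srswor_space))
print("Count from the formula N(N-1)...(N-n+1)/n!:", comb(len(population), n))
print("SRSWR sample points:", len(srswr_space))
```

Changing `population` and `n` gives the roster for other population and sample sizes, although the listing grows quickly as N and n increase.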
In general, a sample of size n selected using SRSWR will be denoted by an ordered n-tuple (x_1, x_2, …, x_n), where x_i is the element selected on the ith draw. When the sample of size n is selected from a population of size N using SRSWR, the sample space will contain $N^n$ ordered n-tuples and by definition, all of them will be given equal chances of selection.

Assignment 1
Suppose the population consists of N=6 elements: a, b, c, d, e, and f. Suppose a sample of size n=3 will be selected. Specify the sample space by roster method for the following sampling schemes:
1. SRSWOR
2. SRSWR
3. Systematic sampling (Recall: Since n is a divisor of N, k=N/n. Select the starting point at random from 1 to k, then take every kth element thereafter.)
Note: Under systematic sampling where n is a divisor of N, the sample space will contain k=N/n equally likely sample points.

Event (pages 287-289)
Definition 10.6 An event is a subset of the sample space whose probability is defined. We say that an event occurred if the outcome of the random experiment is one of the sample points belonging to the event; otherwise, the event did not occur.
We will denote events by capital Latin letters.
There are two special events:
Sure event: Ω
Impossible event: ∅

Examples
Example 10.3 (page 287) The population consists of N=5 children: a=Janine, b=Josiel, c=Jan, d=Eryl, and e=Eariel. Suppose a sample of size n=2 will be selected using SRSWR.
Ω = {(a,a), (a,b), (a,c), (a,d), (a,e),
     (b,a), (b,b), (b,c), (b,d), (b,e),
     (c,a), (c,b), (c,c), (c,d), (c,e),
     (d,a), (d,b), (d,c), (d,d), (d,e),
     (e,a), (e,b), (e,c), (e,d), (e,e)}
A = event that Janine is included in the sample = {(a,a), (a,b), (a,c), (a,d), (a,e), (b,a), (c,a), (d,a), (e,a)}
B = event that Janine and Jan are both included in the sample = {(a,c), (c,a)}
Suppose the sample selected was (a,d). Did event A occur? Did event B occur?

Probability of an Event (pages 292-293)
Definition 10.9.
The probability of an event A, denoted by P(A), is a function that assigns a measure of chance that event A will occur and must satisfy the following properties:
i. 0 ≤ P(A) ≤ 1
ii. P(Ω) = 1 and P(∅) = 0
iii. Finite Additivity. If event A can be expressed as the union of n non-overlapping events, A_1, A_2, …, A_n, then P(A) = P(A_1) + P(A_2) + … + P(A_n)
Interpretation: A probability measure that is close to 1 means that the event has a very large chance of occurrence. On the other hand, if the probability measure is close to 0, then the event has a very small chance of occurrence. A probability of 0.5, the midpoint of the interval [0,1], means that the chance that the event will occur is the same as the chance that the event will not occur.

A Priori Probability or Classical Definition of Probability (page 297)
Definition 10.10 If a random experiment can result in any one of N different equally likely outcomes and if exactly n of these outcomes belong in event A, then
$$P(A) = \frac{\text{no. of elements in } A}{\text{no. of elements in } \Omega} = \frac{n}{N}$$

Examples
Example 10.9 (page 298) The population consists of N=5 children: a=Janine, b=Josiel, c=Jan, d=Eryl, and e=Eariel. Suppose a sample of size n=2 will be selected using SRSWR.
Ω = {(a,a), (a,b), (a,c), (a,d), (a,e),
     (b,a), (b,b), (b,c), (b,d), (b,e),
     (c,a), (c,b), (c,c), (c,d), (c,e),
     (d,a), (d,b), (d,c), (d,d), (d,e),
     (e,a), (e,b), (e,c), (e,d), (e,e)}
By definition of SRSWR, the sample space contains equally likely outcomes.
A = event that Janine is included in the sample = {(a,a), (a,b), (a,c), (a,d), (a,e), (b,a), (c,a), (d,a), (e,a)}
P(A) = 9/25
B = event that Janine and Jan are both included in the sample = {(a,c), (c,a)}
P(B) = 2/25

Probabilities and Proportions (page 298)
The classical definition of probability allows us to view proportions in terms of probabilities and vice versa. Consider the experiment wherein we randomly select one element from a population under study and observe if the selected element possesses the characteristic of interest.
We can define the sample space as the collection of all elements in the population. Let A = event that the selected element possesses the characteristic of interest. Then by the classical definition of probability:
$$P(A) = \frac{\text{no. of elements in } A}{\text{no. of elements in } \Omega} = \text{proportion of elements in the population that possess the characteristic}$$

Example (page 298)
Example 10.10. According to a study conducted by the Food and Nutrition Research Institute (FNRI) in 1998, 30% of children in Central Luzon aged 6 months to 5 years old are afflicted with iron deficiency anemia while 50% have low to deficient Vitamin A levels.
Define
Ω = set of children in Central Luzon aged 6 months to 5 years
A = event of selecting a child who has iron deficiency anemia
B = event of selecting a child with low to deficient Vitamin A levels
Then based on the given data, we can write P(A) = 0.3 and P(B) = 0.5

A Posteriori Probability (page 299)
Definition 10.11 If a random experiment is repeated many times under uniform conditions, use the empirical probability of event A to assign its probability as follows:
$$\text{empirical } P(A) = \frac{\text{no. of times event } A \text{ occurred}}{\text{no. of times experiment was repeated}}$$
The a posteriori definition of the probability of event A is the limiting value of its empirical probability if we repeat the process endlessly.
Note: The observed relative frequency of occurrence of event A is just the empirical probability of event A. This empirical probability will be a good approximation of the actual probability if we perform the process a large number of times under uniform conditions.

Using Excel to Simulate the Experiment
Step 1: In Column A, list all possible numeric outcomes in a single stage of the random experiment.
Step 2: In Column B, write the corresponding probabilities.
Step 3: Select Data then click Data Analysis. Choose Random Number Generation.
Step 4: Fill out the dialog box.
Number of variables: no. of stages of the random experiment or no.
of times an outcome is selected in a single trial
Number of random nos: total number of trials
Distribution: Discrete
Value and Probability: cells containing outcomes and probabilities
Random Seed: any positive integer less than $2^{15}$

Exercise for Section 10.3 (page 302)
1. Consider the experiment of tossing a fair die twice. Specify the following events using the roster method and compute their a priori probabilities:
a. A = event where the sum of the number of dots is less than 5
b. B = event of observing 2 dots on the first toss
c. C = event of observing the same number of dots on both tosses
2. Perform the experiment of tossing a fair die twice 1,000 times and under uniform conditions. Approximate the probabilities of the events described in Problem no. 1 by computing the empirical probabilities.

Assignment 2
Answer Exercise 1b and 1c.
Use Microsoft Excel and the seed number assigned to your group to answer Exercise 2.

Random Variable (page 327)
Definition 10.18 A function whose value is a real number that is determined by each sample point in the sample space is called a random variable.
An uppercase letter, say X, will be used to denote a random variable and its corresponding lowercase letter, x in this case, will be used to denote one of its values.
Note: The use of the term variable is consistent with the way we use this word in mathematics and the way we defined it in Chapter 1 as the characteristic of interest whose value varies. The addition of the term “random” emphasizes the requirement that the realized or actual value of the random variable depends on the outcome of a random experiment. Consequently, it is impossible to predict with certainty what the realized value of the random variable X will be.

Example 10.34 (page 328)
Filipinos are fascinated with elections and the polls conducted to predict the outcomes of these elections. For illustration purposes, let us imagine a very small barangay consisting of 6 qualified voters.
Let's label these voters as A1, A2, A3, A4, A5, and A6. There are two candidates vying for the position, say Renzo and Sandro. What we do not know is that voters A1, A2, A3, and A4 have already decided to elect Renzo while voters A5 and A6 will elect Sandro. We only have enough resources to get a sample of size 3. We will then use the information from this sample to predict the outcome of the election.
Suppose we use SRSWOR to select our sample of size 3. Our sample space will contain all the 20 possible subsets of size 3. The sample points in our sample space are:
{A1,A2,A3} {A1,A2,A4} {A1,A2,A5} {A1,A2,A6} {A1,A3,A4}
{A1,A3,A5} {A1,A3,A6} {A1,A4,A5} {A1,A4,A6} {A1,A5,A6}
{A2,A3,A4} {A2,A3,A5} {A2,A3,A6} {A2,A4,A5} {A2,A4,A6}
{A2,A5,A6} {A3,A4,A5} {A3,A4,A6} {A3,A5,A6} {A4,A5,A6}
Define X = number of voters who will elect Renzo. X is a random variable. Its realized value depends on the outcome of the random experiment.

Sample        X    Sample        X
{A1,A2,A3}    3    {A2,A3,A4}    3
{A1,A2,A4}    3    {A2,A3,A5}    2
{A1,A2,A5}    2    {A2,A3,A6}    2
{A1,A2,A6}    2    {A2,A4,A5}    2
{A1,A3,A4}    3    {A2,A4,A6}    2
{A1,A3,A5}    2    {A2,A5,A6}    1
{A1,A3,A6}    2    {A3,A4,A5}    2
{A1,A4,A5}    2    {A3,A4,A6}    2
{A1,A4,A6}    2    {A3,A5,A6}    1
{A1,A5,A6}    1    {A4,A5,A6}    1

Using the Random Variable to Express the Event of Interest (page 329)
- We will use the notation, X ≤ x, to express the event containing all sample points whose associated value for the random variable X is less than or equal to x, where x is a specified real number.
- We will use the notation, X > x, to express the event containing all sample points whose associated value for X is greater than x.
- We will use the notation, a < X < b, to express the event containing all sample points whose associated value for X is in between a and b, where a and b are specified real numbers.
- And so on.
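The idea that a random variable is a function on the sample space, and that a statement such as X > 2 picks out a set of sample points, can be sketched in Python. This is an illustration added here under the setup of Example 10.34; the names `renzo_voters`, `X`, and the event variables are hypothetical.

```python
from itertools import combinations

voters = ["A1", "A2", "A3", "A4", "A5", "A6"]
renzo_voters = {"A1", "A2", "A3", "A4"}  # voters who will elect Renzo (Example 10.34)

# Sample space under SRSWOR with n = 3: all subsets of size 3.
sample_space = list(combinations(voters, 3))

def X(sample):
    """Random variable: number of voters in the sample who will elect Renzo."""
    return sum(1 for voter in sample if voter in renzo_voters)

# An event written in terms of X is the set of sample points satisfying the condition.
event_X_gt_2 = [s for s in sample_space if X(s) > 2]   # the event X > 2
event_X_le_1 = [s for s in sample_space if X(s) <= 1]  # the event X <= 1

for s in sample_space:
    print(s, "->", X(s))
print("X > 2 :", event_X_gt_2)
print("X <= 1:", event_X_le_1)
```

Each line of the printed listing corresponds to one row of the table above, pairing a sample point with its realized value of X.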
Example 10.35 (page 329)
Define X = number of voters who will elect Renzo, with the value of X for each of the 20 samples as tabulated in Example 10.34.
A = event of selecting a sample with 1 voter electing Renzo = {{A1,A5,A6}, {A2,A5,A6}, {A3,A5,A6}, {A4,A5,A6}}
Event A can be expressed as X = 1.
B = event of selecting a sample with more than 2 voters electing Renzo = {{A1,A2,A3}, {A1,A2,A4}, {A1,A3,A4}, {A2,A3,A4}}
Event B can be expressed as X > 2.
C = event of selecting a sample with at least 1 voter electing Renzo = Ω = sure event
Event C can be expressed as X ≥ 1.
D = event of selecting a sample with 5 voters electing Renzo = ∅ = impossible event
Event D can be expressed as X = 5.

Discrete Random Variable and Its PMF (pages 330 & 332)
Definition 10.20 If a sample space contains a finite number of sample points or has as many sample points as there are counting/natural numbers, then it is called a discrete sample space.
Definition 10.21 A random variable defined over a discrete sample space is called a discrete random variable.
Definition 10.22 The probability mass function (PMF) of a discrete random variable, denoted by f(.), is a function defined for any real number x as: f(x) = P(X = x).
The values of the discrete random variable X for which f(x) > 0 are called its mass points.

Example 10.38 (pages 332-333)
Define X = number of voters who will elect Renzo, with the value of X for each of the 20 samples as tabulated in Example 10.34.
X is a discrete random variable. The range of X = {1, 2, 3}. The elements in the range of X are the mass points of the discrete random variable X.
To derive the PMF of X, we need to compute P(X=x) for all x that are mass points of X. Since the sample space contains equally likely outcomes, we can use the classical definition to compute these probabilities, that is, P(A) = no. of sample points in A / no. of sample points in Ω.

x = 1
Event associated with X=1: {{A1,A5,A6}, {A2,A5,A6}, {A3,A5,A6}, {A4,A5,A6}}
P(X=1) = 4/20 = 1/5
x = 2
Event associated with X=2: {{A1,A2,A5}, {A1,A2,A6}, {A1,A3,A5}, {A1,A3,A6}, {A1,A4,A5}, {A1,A4,A6}, {A2,A3,A5}, {A2,A3,A6}, {A2,A4,A5}, {A2,A4,A6}, {A3,A4,A5}, {A3,A4,A6}}
P(X=2) = 12/20 = 3/5
x = 3
Event associated with X=3: {{A1,A2,A3}, {A1,A2,A4}, {A1,A3,A4}, {A2,A3,A4}}
P(X=3) = 4/20 = 1/5

The PMF of X can be presented in tabular form as follows:
x       1      2      3
f(x)    1/5    3/5    1/5

Continuous Random Variable and Its PDF (pages 335-336)
For a continuous random variable X, P(X=x) will always be 0 for any real number x.
This property may sound strange, but it is satisfied by variables that are measured on some standard scale of real numbers or nonnegative real numbers, such as inches, centimeters, degrees Fahrenheit, degrees Celsius, pints, liters, grams, and ounces. Any particular measure taken on such scales could be recorded to as many decimal places as one might care to take it. So if you take any interval containing the point x on such scales, even if this interval is very, very short, this interval will always contain infinitely many other points on the scale and the point x is just one of them.
Definition 10.23 The probability density function (PDF) of a continuous random variable X, denoted by f(.), is a function that is defined for any real number x and satisfies the following properties:
a) f(x) ≥ 0 for all x;
b) the area below the whole curve, f(x), and above the x-axis is always equal to 1; and,
c) P(a ≤ X ≤ b) is the area bounded by the curve f(x), the x-axis, and the lines x=a and x=b.

Graph of the PDF (page 336)
The graph of the PDF is always above the x-axis because the function cannot take on negative values.
If we remove the lines x=a and x=b and measure the whole area below f(x) and above the x-axis, this area is always exactly equal to 1.
The shaded area, which is bounded by the curve f(x), the x-axis, and the lines x=a and x=b, represents P(a ≤ X ≤ b).
We can also see from this illustration the reason why we stated earlier that for a continuous random variable X, P(X=x) = 0 for any real number x. P(X=a) is just the same as P(a ≤ X ≤ a). In this case, we let b=a. Then the area representing P(X=a) will be 0 because we will only be left with a single line.

Notes About the PMF and PDF
- The PMF of the discrete random variable and the PDF of the continuous random variable are what we refer to as the distribution of the random variable.
- The distribution of the random variable X provides us with complete information about the behavior of the random variable X. Although we cannot predict with certainty what the realized value of the random variable X will be, we can use its distribution to compute the probability of any event expressed in terms of the random variable X. We will learn how to do this in Stat 121. In Stat 115, we will use the CDF of a continuous random variable to compute probabilities.

Cumulative Distribution Function (page 330)
Definition 10.19 The cumulative distribution function (CDF) of a random variable X, denoted by F(.), is a function defined for any real number x as F(x) = P(X ≤ x).

Notes About the CDF
- The CDF of the random variable X is also referred to as its distribution. Just like the PMF of a discrete random variable and the PDF of a continuous random variable, the CDF provides us with complete information about the behavior of the random variable. We can use it to compute the probability of any event expressed in terms of the random variable X.
- (page 339) When X is a continuous random variable, we can express the probability of the event in terms of the CDF as follows:
P(X ≤ a) = P(X < a) = F(a).
P(X > a) = P(X ≥ a) = 1 – F(a).
P(a < X < b) = P(a ≤ X ≤ b) = P(a ≤ X < b) = P(a < X ≤ b) = F(b) – F(a).

Examples
Example 10.41 (page 339)
Exercise 4. (page 340) Given the CDF of a continuous random variable X, find the following probabilities using the CDF:
a. P(X > 0.25)
b. P(0.3 < X < 0.7)
c. P(0.4 ≤ X < 1.25)
$$F(x) = \begin{cases} 1 & \text{when } x \ge 1 \\ x^3 & \text{when } 0 \le x < 1 \\ 0 & \text{when } x < 0 \end{cases}$$

Mean and Variance of the Discrete Random Variable X (pages 341 and 344)
Suppose X is a discrete random variable with probability mass function:
x                  x_1       x_2       x_3       …    x_n
f(x) = P(X=x)      f(x_1)    f(x_2)    f(x_3)    …    f(x_n)

Definition 10.24 The mean of the discrete random variable X, also called the expected value of X, is
$$\mu_X = E(X) = x_1 f(x_1) + x_2 f(x_2) + \dots + x_n f(x_n).$$
The mean tells us where the center of mass is located. In other words, it tells us the average value of X if we repeat the random experiment endlessly.
Definition 10.25 The variance of the discrete random variable X is
$$\sigma_X^2 = Var(X) = (x_1 - \mu_X)^2 f(x_1) + (x_2 - \mu_X)^2 f(x_2) + \dots + (x_n - \mu_X)^2 f(x_n).$$
The variance measures how closely the values of X cluster around the mean.

Examples
Examples 10.42 and 10.45: Let X = number of voters who will elect Renzo, as defined in Example 10.34. The PMF of this random variable as derived earlier is as follows:
x       1      2      3
f(x)    1/5    3/5    1/5
Use this PMF to find the mean and variance of X.
Solution:
$\mu_X = E(X) = (1)(1/5) + (2)(3/5) + (3)(1/5) = 2$.
Suppose we keep on repeating the process of selecting samples of size 3 and each time observe how many will vote for Renzo. The average of these values, or the average number of voters who will elect Renzo, is 2.
$\sigma_X^2 = Var(X) = (1 - 2)^2(1/5) + (2 - 2)^2(3/5) + (3 - 2)^2(1/5) = 0.4$

Assignment 3
1. Exercise 2 (page 348)
2. Given the CDF of a continuous random variable X, compute the following probabilities:
a) P(X ≤ 0.4)
b) P(X > 0.8)
c) P(X ≥ -0.6)
d) P(-0.5 < X < 0.2)
e) P(-0.1 < X ≤ 2.5)
Always show your solution. Write the formula in terms of the CDF first, then show how you plugged in the appropriate values to compute the probability.
$$F(x) = \begin{cases} 0, & \text{if } x < -1 \\ \dfrac{x^2 + 2x + 1}{2}, & \text{if } -1 \le x < 0 \\ \dfrac{-x^2 + 2x + 1}{2}, & \text{if } 0 \le x < 1 \\ 1, & \text{if } x \ge 1 \end{cases}$$

Normal Distribution (page 349)
Definition 10.27 (page 349) A continuous random variable X is said to be normally distributed if its probability density function is given by:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$$
for any real number x. The constants, µ and σ², are such that -∞ < µ < ∞ and σ² > 0. The values, e and π, are mathematical constants, wherein e ≈ 2.71828 and π ≈ 3.14159.
The normal distribution has 2 parameters, namely, µ and σ².
As stated in Definition 10.28, a parameter in distribution theory is a constant that determines the specific form of the probability distribution.
We can view the parameter as a numerical descriptive measure of the probability distribution. It carries vital information about the probability distribution. The shape of the distribution, the location of its center, the value of its variance, and other characterizations of the distribution all depend on the value of the parameter.
We will notice that this notion is consistent with our previous concept of a parameter as a summary measure describing a specific characteristic of the population in Stat 114. This time though, our population is specifically the set of all realized values of X if we were to repeat the random experiment endlessly.

Graph of the Normal PDF (page 350)
[Figure: normal curve centered at µ, with areas 0.68, 0.95, and >0.99 marked within µ ± 1σ, µ ± 2σ, and µ ± 3σ]
- Bell-shaped curve that is symmetric about µ.
- The area bounded by the curve and the x-axis is 1.
- The curve will approach the x-axis as we proceed in either direction away from µ, but will never touch the x-axis.
- In Stat 114, if X~Normal(µ, σ²) then
a) P(µ - 1σ < X < µ + 1σ) ≈ 0.68
b) P(µ - 2σ < X < µ + 2σ) ≈ 0.95
c) P(µ - 3σ < X < µ + 3σ) > 0.99

Standard Normal Random Variable (page 351)
Definition 10.29 If the normal random variable has mean 0 and variance 1, it is called a standard normal random variable and is denoted by Z.

Standard Normal Table, Table B.1 (pages 603-604)
Table B.1 presents the values of the CDF of a standard normal random variable or P(Z ≤ z).
We can compute the probability of any event expressed in terms of the standard normal random variable using these formulas:
P(Z ≤ a) = P(Z < a) = F(a)
P(Z > a) = P(Z ≥ a) = 1 – F(a)
P(a < Z < b) = P(a ≤ Z ≤ b) = P(a ≤ Z < b) = P(a < Z ≤ b) = F(b) – F(a)
where Z is the standard normal random variable.
See Examples 10.49 and 10.50 (page 352)

Property of the Normal Distribution
Any random variable X that follows a normal distribution with mean µ and variance σ² can be transformed into a standard normal random variable Z with mean 0 and variance 1.
The transformation is the familiar formula that we use to compute the z-score in Stat 114:
$$Z = \frac{X - \mu}{\sigma}$$

Computing for Probabilities (page 353)
If X~N(µ, σ²) then
$$P(X \le a) = P\left(\frac{X - \mu}{\sigma} \le \frac{a - \mu}{\sigma}\right) = P\left(Z \le \frac{a - \mu}{\sigma}\right),$$
where Z is a standard normal random variable.
Example 10.51. Suppose X~Normal(µ=5, σ²=4).
a) $P(X \le 6) = P\left(\frac{X - 5}{2} \le \frac{6 - 5}{2}\right) = P(Z \le 0.5) = 0.6915$.
b) $P(4.5 < X < 6) = P\left(\frac{4.5 - 5}{2} < \frac{X - 5}{2} < \frac{6 - 5}{2}\right) = P(-0.25 < Z < 0.5) = F(0.5) - F(-0.25) = 0.6915 - 0.4013 = 0.2902$.
c) $P(X > 4.5) = P\left(\frac{X - 5}{2} > \frac{4.5 - 5}{2}\right) = P(Z > -0.25) = 1 - F(-0.25) = 1 - 0.4013 = 0.5987$.

Checking the 68-95-99 Rule
Suppose X~Normal(µ, σ²).
$P(\mu - 1\sigma < X < \mu + 1\sigma) = P\left(-1 < \frac{X - \mu}{\sigma} < 1\right) = P(-1 < Z < 1) = F(1) - F(-1) = 0.8413 - 0.1587 = 0.6826$
$P(\mu - 2\sigma < X < \mu + 2\sigma) = P\left(-2 < \frac{X - \mu}{\sigma} < 2\right) = P(-2 < Z < 2) = F(2) - F(-2) = 0.9772 - 0.0228 = 0.9544$
$P(\mu - 3\sigma < X < \mu + 3\sigma) = P\left(-3 < \frac{X - \mu}{\sigma} < 3\right) = P(-3 < Z < 3) = F(3) - F(-3) = 0.9987 - 0.0013 = 0.9974$

z_α Value of the Standard Normal Random Variable (page 354)
The value z_α (read as “z sub alpha”) satisfies the condition that P(Z > z_α) = α. This is equivalent to saying that P(Z ≤ z_α) = 1 - α.
[Figure: standard normal curve with z_α marked on the horizontal axis]
z_α Values: Bottom of Table B.1 (page 604)
α      .10     .05     .025    .01     .005    .001    .0005   .00005
z_α    1.282   1.645   1.960   2.326   2.576   3.090   3.291   3.891
Example: z_.05 = 1.645
P(Z > 1.645) = 0.05, P(Z < 1.645) = 0.95, P(Z < -1.645) = 0.05, P(-1.645 < Z < 1.645) = 0.90.
[Figure: standard normal curve with central area 0.90 between -1.645 and 1.645 and area 0.05 in each tail]

Importance of the Normal Distribution (page 354)
Normal distributions, or at least approximately normal distributions, occur in many situations. Many physical and mental traits tend to be at least approximately normally distributed.
If it is not X that is normally distributed, it is some transformation of X that is normal.
Furthermore, as a consequence of the Central Limit Theorem, the normal distribution is also used to model characteristics of interest that are believed to be the result of summing up a large number of small effects that are independently generated by a process.

Examples
Examples 10.53 and 10.54 (page 355)
Exercise 1 (page 372) A wine's distinctive taste is a result of aging it in wooden casks. Some of the wine evaporates while it is aging in the porous wooden casks. Define X = percentage of wine in the cask that is lost due to evaporation. Suppose X is normally distributed with mean 5% and a standard deviation of 1%. What is the probability of losing more than 7.5% of the wine due to evaporation?
Always define the random variable: X = percentage of wine in the cask that is lost due to evaporation
Identify the distribution of X: Given: X~Normal(µ=5, σ²=1²)
Express the problem in terms of the defined random variable: Find P(X > 7.5).
$$P(X > 7.5) = P\left(\frac{X - 5}{1} > \frac{7.5 - 5}{1}\right) = P(Z > 2.5) = 1 - F(2.5) = 1 - 0.9938 = 0.0062.$$

More Examples
Exercise 3 (page 373) Suppose that the IQs of applicants of a certain science high school follow a normal distribution with mean of 120 and a standard deviation of 9.
a) One of the requirements of the school in accepting a student is that the student's IQ must be at least 115. What proportion of the applicants will be rejected on the basis of their IQ?
X = IQ of selected applicant
Given: X~Normal(µ=120, σ²=9²)
a) Find P(X < 115).
$$P(X < 115) = P\left(\frac{X - 120}{9} < \frac{115 - 120}{9}\right) = P(Z < -5/9) \approx P(Z < -0.56) = F(-0.56) = 0.2877$$

Assignment 4
1. If Z is a standard normal random variable, find the value of z_0 that satisfies each of the following probability statements:
a. P(Z < z_0) = 0.99995
b. P(Z > z_0) = 0.99
c. P(-z_0 < Z < z_0) = 0.998
2. The existing machine setting of a factory produces bearings with a diameter that is normally distributed with mean and standard deviation equal to 3.0005 cm and 0.0010 cm, respectively.
Customer specifications require the bearing diameters to lie in the interval [2.998, 3.002]. Those outside the interval are considered scrap and must be remachined. With the existing machine setting, what fraction of total production will not be considered as scrap?
3. The length of time for a college applicant to complete the college achievement test is normally distributed with mean equal to 70 minutes and variance 144 min². What is the probability of selecting an applicant at random who will take more than 85 minutes to complete the test?
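
The table-based normal computations worked out in this chapter can be cross-checked numerically. The Python sketch below is not part of the course materials, which rely on Table B.1. It evaluates the standard normal CDF through the error function in the standard library, using the identity F(z) = (1/2)[1 + erf(z/√2)], and reproduces Example 10.51 and the wine-evaporation exercise. The function names are illustrative.

```python
from math import erf, sqrt

def standard_normal_cdf(z):
    """F(z) = P(Z <= z) for a standard normal Z, computed via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ Normal(mu, sigma^2): standardize first, then apply F."""
    return standard_normal_cdf((x - mu) / sigma)

# Example 10.51: X ~ Normal(mu = 5, sigma^2 = 4), so sigma = 2.
p_a = normal_cdf(6, 5, 2)                          # P(X <= 6)
p_b = normal_cdf(6, 5, 2) - normal_cdf(4.5, 5, 2)  # P(4.5 < X < 6) = F(b) - F(a)
p_c = 1 - normal_cdf(4.5, 5, 2)                    # P(X > 4.5) = 1 - F(a)

# Wine exercise: X ~ Normal(mu = 5, sigma = 1); find P(X > 7.5).
p_wine = 1 - normal_cdf(7.5, 5, 1)

for label, p in [("P(X <= 6)", p_a), ("P(4.5 < X < 6)", p_b),
                 ("P(X > 4.5)", p_c), ("P(X > 7.5), wine", p_wine)]:
    print(f"{label}: {p:.4f}")
```

Small differences from the table answers can arise in other problems because Table B.1 requires rounding z to two decimal places, whereas the code uses the unrounded value.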