Applied Statistics I
Liang Zhang
Department of Mathematics, University of Utah
June 10, 2008

Measures of Variability

Sample I:   30, 35, 40, 45, 50, 55, 60, 65, 70
Sample II:  30, 41, 48, 49, 50, 51, 52, 59, 70
Sample III: 41, 45, 48, 49, 50, 51, 52, 55, 59

Sample Range: the difference between the largest and the smallest sample values.
e.g. for Sample I: 30, 35, 40, 45, 50, 55, 60, 65, 70, the sample range is 70 − 30 = 40.

Deviation from the Sample Mean: the difference between an individual sample value and the sample mean.
e.g. for Sample I: 30, 35, 40, 45, 50, 55, 60, 65, 70, the sample mean is 50, and thus the deviations from the sample mean are -20, -15, -10, -5, 0, 5, 10, 15, 20.
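The two definitions above can be tried out on the three samples. The following is a minimal sketch, not part of the original slides; the helper names `sample_range` and `deviations` are made up for illustration.

```python
from statistics import mean

# The three samples from the slides.
samples = {
    "I": [30, 35, 40, 45, 50, 55, 60, 65, 70],
    "II": [30, 41, 48, 49, 50, 51, 52, 59, 70],
    "III": [41, 45, 48, 49, 50, 51, 52, 55, 59],
}


def sample_range(xs):
    """Largest sample value minus smallest sample value."""
    return max(xs) - min(xs)


def deviations(xs):
    """Deviation of each observation from the sample mean."""
    x_bar = mean(xs)
    return [x - x_bar for x in xs]


for name, xs in samples.items():
    print(f"Sample {name}: range = {sample_range(xs)}, deviations = {deviations(xs)}")
```

Samples I and II have the same range even though Sample II is more concentrated around 50, which is one reason to look at measures based on all the deviations rather than on the two extreme values only.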
Sample Variance: the sum of the squared deviations from the sample mean, divided by n − 1 (roughly, the average squared deviation).
If our sample size is n, and we use $\bar{x}$ to denote the sample mean, then the sample variance $s^2$ is given by:

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{S_{xx}}{n-1}$$

Sample Standard Deviation: the square root of the sample variance, $s = \sqrt{s^2}$.

e.g. for Sample I: 30, 35, 40, 45, 50, 55, 60, 65, 70, the mean is 50 and we have

$x_i$                 30   35   40   45   50   55   60   65   70
$x_i - \bar{x}$      -20  -15  -10   -5    0    5   10   15   20
$(x_i - \bar{x})^2$  400  225  100   25    0   25  100  225  400

Therefore the sample variance is (400 + 225 + 100 + 25 + 0 + 25 + 100 + 225 + 400)/(9 − 1) = 1500/8 = 187.5 and the standard deviation is $\sqrt{187.5} \approx 13.7$.

e.g. for Sample II: 30, 41, 48, 49, 50, 51, 52, 59, 70, the mean is also 50 and we have

$x_i$                 30   41   48   49   50   51   52   59   70
$x_i - \bar{x}$      -20   -9   -2   -1    0    1    2    9   20
$(x_i - \bar{x})^2$  400   81    4    1    0    1    4   81  400

Therefore the sample variance is (400 + 81 + 4 + 1 + 0 + 1 + 4 + 81 + 400)/(9 − 1) = 972/8 = 121.5 and the standard deviation is $\sqrt{121.5} \approx 11.0$.

e.g. for Sample III: 41, 45, 48, 49, 50, 51, 52, 55, 59, the mean is also 50 and we have

$x_i$                 41   45   48   49   50   51   52   55   59
$x_i - \bar{x}$       -9   -5   -2   -1    0    1    2    5    9
$(x_i - \bar{x})^2$   81   25    4    1    0    1    4   25   81

Therefore the sample variance is (81 + 25 + 4 + 1 + 0 + 1 + 4 + 25 + 81)/(9 − 1) = 222/8 = 27.75 and the standard deviation is $\sqrt{27.75} \approx 5.3$.

The sample variance is 187.5 for Sample I, 121.5 for Sample II and 27.75 for Sample III.

Remark:
1. Why use the sum of squares of the deviations? Why not sum the deviations?
Because the sum of the deviations from the sample mean is always equal to 0:

$$\sum_{i=1}^{n} (x_i - \bar{x}) = \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} \bar{x} = \sum_{i=1}^{n} x_i - n\bar{x} = \sum_{i=1}^{n} x_i - n\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = 0$$

Remark:
2. Why do we use divisor n − 1 in the calculation of the sample variance, while we use divisor N in the calculation of the population variance?
The variance measures deviation from the "center". However, the "center" is different for a sample and for a population: the sample mean versus the population mean.
If we could use $\mu$ instead of $\bar{x}$ in the definition of $s^2$, then we would take $s^2 = \sum (x_i - \mu)^2 / n$.
But generally, the population mean is unavailable to us, so our choice is the sample mean. In that case, the observations $x_i$ tend to be closer to their average $\bar{x}$ than to the population average $\mu$, so the squared deviations from $\bar{x}$ tend to be too small. To compensate, we use divisor n − 1.

Remark:
3. It is customary to refer to $s^2$ as being based on n − 1 degrees of freedom (df).
$s^2$ is an average of the n quantities $(x_1 - \bar{x})^2, (x_2 - \bar{x})^2, \ldots, (x_n - \bar{x})^2$. However, the sum of $x_1 - \bar{x}, x_2 - \bar{x}, \ldots, x_n - \bar{x}$ is 0.
Therefore, if we know any n − 1 of them, we know all of them.
e.g. {$x_1 = 4$, $x_2 = 7$, $x_3 = 1$, and $x_4 = 10$}. Then the mean is $\bar{x} = 5.5$ and $x_1 - \bar{x} = -1.5$, $x_2 - \bar{x} = 1.5$ and $x_3 - \bar{x} = -4.5$. From that, we know directly that $x_4 - \bar{x} = 4.5$ since their sum is 0.

Some mathematical results for $s^2$:
$s^2 = \frac{S_{xx}}{n-1}$ where $S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{(\sum x_i)^2}{n}$;
If $y_1 = x_1 + c, y_2 = x_2 + c, \ldots, y_n = x_n + c$, then $s_y^2 = s_x^2$;
If $y_1 = c x_1, y_2 = c x_2, \ldots, y_n = c x_n$, then $s_y^2 = c^2 s_x^2$ and $s_y = |c| \, s_x$.
Here $s_x^2$ is the sample variance of the x's and $s_y^2$ is the sample variance of the y's. c is any nonzero constant.

e.g. in the previous example, Sample III is {41, 45, 48, 49, 50, 51, 52, 55, 59}; then we can calculate the sample variance as follows:

$x_i$      41    45    48    49    50    51    52    55    59    $\sum x_i = 450$
$x_i^2$  1681  2025  2304  2401  2500  2601  2704  3025  3481    $\sum x_i^2 = 22722$

Therefore the sample variance is $(22722 - 450^2/9)/(9 - 1) = (22722 - 22500)/8 = 27.75$.

Boxplots
e.g. A recent article ("Indoor Radon and Childhood Cancer") presented the accompanying data on radon concentration (Bq/m³) in two different samples of houses. The first sample consisted of houses in which a child diagnosed with cancer had been residing. Houses in the second sample had no recorded cases of childhood cancer. The following is a stem-and-leaf display of the data.

           1. Cancer | Stem | 2. No cancer
             9683795 |  0   | 95768397678993
86071815066815233150 |  1   | 12271713114
            12302731 |  2   | 99494191
                8349 |  3   | 839
                   5 |  4   |
                   7 |  5   | 55
                     |  6   |
                     |  7   |
                     |  8   | 5
Stem: Tens digit
Leaf: Ones digit

The boxplot for the 1st data set is: [figure not reproduced]
The boxplot for the 2nd data set is: [figure not reproduced]
We can also make the boxplot for both data sets: [figure not reproduced]

Some terminology:
Lower Fourth: the median of the smallest half of the data
Upper Fourth: the median of the largest half of the data
Fourth spread: the difference between the upper fourth and the lower fourth, $f_s$ = upper fourth − lower fourth
Outlier: any observation farther than $1.5 f_s$ from the closest fourth
An outlier is extreme if it is more than $3 f_s$ from the nearest fourth, and it is mild otherwise.

The boxplot for the 2nd data set is: [figure not reproduced]
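As a rough numerical check of the variance formulas and the fourth-spread rules from the Measures of Variability slides, here is a minimal sketch. The function names are hypothetical, and the treatment of odd sample sizes in `fourths` (the median is counted in both halves) is an assumption; the slides do not say how the halves are formed when n is odd.

```python
from math import isclose, sqrt
from statistics import median, variance

sample_i = [30, 35, 40, 45, 50, 55, 60, 65, 70]
sample_iii = [41, 45, 48, 49, 50, 51, 52, 55, 59]


def s_xx(xs):
    """Shortcut formula: S_xx = sum(x_i^2) - (sum x_i)^2 / n."""
    return sum(x * x for x in xs) - sum(xs) ** 2 / len(xs)


def sample_variance(xs):
    """Sample variance with divisor n - 1."""
    return s_xx(xs) / (len(xs) - 1)


def fourths(xs):
    """Lower and upper fourths.

    ASSUMPTION: for odd n the median belongs to both halves.
    """
    xs = sorted(xs)
    half = (len(xs) + 1) // 2
    return median(xs[:half]), median(xs[-half:])


def outliers(xs):
    """Observations farther than 1.5 * f_s from the closest fourth."""
    lower, upper = fourths(xs)
    fs = upper - lower
    return [x for x in xs if x < lower - 1.5 * fs or x > upper + 1.5 * fs]


# The shortcut formula agrees with the library's n - 1 variance.
print(sample_variance(sample_iii), variance(sample_iii))

# Adding a constant leaves s^2 unchanged; multiplying by c scales s by |c|.
shifted = [x + 100 for x in sample_iii]
scaled = [-2 * x for x in sample_iii]
print(isclose(sample_variance(shifted), sample_variance(sample_iii)))
print(isclose(sqrt(sample_variance(scaled)), 2 * sqrt(sample_variance(sample_iii))))

# Fourths and outliers for Sample I.
print(fourths(sample_i), outliers(sample_i))
```

Under the stated assumption, Sample I has fourths 40 and 60, so $f_s = 20$ and no observation lies more than 30 beyond either fourth.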
Sample Spaces and Events

Basic Concepts in Probability:

Experiment: any action or process whose outcome is subject to uncertainty.
e.g. tossing a coin 3 times, testing the pH value of some reagent, counting the number of customers visiting a store in one day, etc.

Sample Space: the set of all possible outcomes of an experiment, usually denoted by S.
e.g. for the above 3 examples, the sample spaces are {TTT, TTH, THH, THT, HHH, HHT, HTH, HTT}, [0, 14] and {0, 1, 2, . . . , N, . . . }, respectively.

Event: any collection (subset) of outcomes contained in the sample space S. An event is simple if it consists of exactly one outcome and compound if it consists of more than one outcome.
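The definitions of sample space and event can be sketched for the three-toss experiment as follows. This is only an illustration; `classify` is a hypothetical helper, and outcomes are written as strings such as "HTH".

```python
from itertools import product

# Sample space for tossing a coin 3 times: all strings of length 3 over {H, T}.
sample_space = {"".join(tosses) for tosses in product("HT", repeat=3)}


def classify(event):
    """Label an event as simple (one outcome) or compound (more than one)."""
    if not event <= sample_space:
        raise ValueError("an event must be a subset of the sample space")
    if len(event) == 1:
        return "simple"
    if len(event) > 1:
        return "compound"
    return "neither"  # the empty event has no outcomes


first_is_head = {outcome for outcome in sample_space if outcome[0] == "H"}
three_heads = {"HHH"}

print(sorted(sample_space))
print(sorted(first_is_head), classify(first_is_head))
print(sorted(three_heads), classify(three_heads))
```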
e.g. for the coin tossing example:
{all the outcomes such that the first result is Head}, i.e. {HHT, HTH, HTT, HHH}, is an event, and it is a compound event;
{all the outcomes which have 3 consecutive Heads}, i.e. {HHH}, is also an event, and it is a simple event.

Examples:
For the pH value testing example:
{pH value is less than 7.0}, i.e. [0, 7.0), is an event, and it is compound;
{pH value is between 2.0 and 3.0}, i.e. [2.0, 3.0], is another event, and it is also compound.
For the customer counting example:
{the number of customers visiting in one day is less than 100}, i.e. {0, 1, 2, . . . , 98, 99}, is an event, and it is compound;
{the number of customers visiting in one day is more than 200}, i.e. {201, 202, . . . }, is also an event, and it is compound.

Another Coin Tossing Example:
This time our experiment is tossing a coin repeatedly until we get our first Head. Then the sample space would be {1, 2, 3, 4, 5, . . . } (the number of tosses needed) and the corresponding outcomes are {H, TH, TTH, TTTH, TTTTH, · · · }.
Here are some events:
{we tossed at most 3 times}, i.e. {1, 2, 3} or {H, TH, TTH}.
{the total number of tosses is even}, i.e. {2, 4, 6, . . . } or {TH, TTTH, TTTTTH, · · · }.
Both events are compound.

Basic Set Theory

Complement: the complement of an event A, denoted by A', is the set of all outcomes in S that are not contained in A.
e.g. for our first coin tossing example, if A = {the first outcome is Head} = {HHH, HHT, HTH, HTT}, then A' = {the first outcome is not Head, i.e. Tail} = {TTT, TTH, THT, THH};
for the pH value testing example, if A = {the pH value of the reagent is below 7.0}, then A' = {the pH value of the reagent is at least 7.0}.

Union: the union of two events A and B, denoted by A ∪ B, is the event consisting of all outcomes that are either in A or in B or in both events, that is, all outcomes in at least one of the events.
e.g. for the coin tossing example, if A = {the first outcome is Head} = {HHH, HHT, HTH, HTT}, and B = {the last outcome is Head} = {HHH, TTH, HTH, THH}, then A ∪ B = {the first or the last outcome is Head} = {HHH, HHT, HTH, HTT, TTH, THH}.

Intersection: the intersection of two events A and B, denoted by A ∩ B, is the event consisting of all outcomes that are both in A and in B.
e.g. for the coin tossing example, if A = {the first outcome is Head} = {HHH, HHT, HTH, HTT}, and B = {the last outcome is Head} = {HHH, TTH, HTH, THH}, then A ∩ B = {the first and the last outcomes are Head} = {HHH, HTH}.

Null Event: the event consisting of no outcomes, denoted by ∅.
e.g. the event {the first outcome is neither Head nor Tail} for the coin tossing experiment is a null event.

Mutually Exclusive: if two events A and B satisfy A ∩ B = ∅, then A and B are said to be mutually exclusive or disjoint.
e.g. for the coin tossing example, if A = {the first outcome is Head} = {HHH, HHT, HTH, HTT}, and B = {the first outcome is Tail} = {THH, TTH, TTT, THT}, then A ∩ B = {the first outcome is both Head and Tail} = ∅. So A and B are disjoint.

Remark:
1. The union and intersection operations can be extended to more than two events.
e.g. for any three events A, B and C, the event A ∪ B ∪ C is the set of all outcomes contained in at least one of the three events; similarly, A ∩ B ∩ C is the set of all outcomes contained in all three events.
2. Given n events $A_1, A_2, \ldots, A_n$, they are said to be mutually disjoint or pairwise disjoint if any two of them are disjoint.

Venn Diagrams:
e.g. [figures not reproduced: A ∪ B, A ∩ B, mutually disjoint events, complement of A]
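The set operations on events from the Basic Set Theory slides map directly onto Python's built-in set type. The sketch below reproduces the coin tossing examples; the variable names are illustrative only.

```python
from itertools import product

# Sample space for tossing a coin 3 times.
S = {"".join(tosses) for tosses in product("HT", repeat=3)}

A = {o for o in S if o[0] == "H"}    # the first outcome is Head
B = {o for o in S if o[-1] == "H"}   # the last outcome is Head
C = {o for o in S if o[0] == "T"}    # the first outcome is Tail

complement_A = S - A   # A': outcomes in S that are not in A
union_AB = A | B       # A ∪ B: outcomes in at least one of the events
intersect_AB = A & B   # A ∩ B: outcomes in both events

print(sorted(complement_A))
print(sorted(union_AB))
print(sorted(intersect_AB))

# A and C are mutually exclusive: their intersection is the null event.
print(A & C == set(), A.isdisjoint(C))
```

Here the complement of A coincides with C, which is why A and C are disjoint and together cover the whole sample space.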
Axiomatic Probability

The objective of probability is to assign to each event A a number P(A), called the probability of the event A, which will give a precise measure of the chance that A will occur.

Probability Axioms:
AXIOM 1 For any event A, P(A) ≥ 0.
AXIOM 2 P(S) = 1.
AXIOM 3 If $A_1, A_2, A_3, \ldots$ is an infinite collection of disjoint events, then $P(A_1 \cup A_2 \cup A_3 \cup \cdots) = \sum_{i=1}^{\infty} P(A_i)$.

Proposition
P(∅) = 0 where ∅ is the null event. This in turn implies that the property contained in Axiom 3 is valid for a finite collection of events, i.e. if $A_1, A_2, \ldots, A_n$ is a finite collection of disjoint events, then $P(A_1 \cup A_2 \cup \cdots \cup A_n) = \sum_{i=1}^{n} P(A_i)$.

Examples:
1. Consider the coin tossing experiment in which we toss the coin only once. Then S = {H, T}.
Since P(S) = 1 (Axiom 2), and the events {H} and {T} are mutually disjoint, by Axiom 3 (in its finite form) we have
P({H}) + P({T}) = P({H} ∪ {T}) = P(S) = 1.
If the coin is fair, we should assign 0.5 to P({H}) and 0.5 to P({T}).
If the coin is more likely to give a Head, then 0.8 for P({H}) and 0.2 for P({T}) may be suitable.
In fact, if p is any fixed number between 0 and 1, then P({H}) = p and P({T}) = 1 − p is an assignment consistent with the axioms.

2. Consider again the coin tossing example. However, this time we are interested in getting a Head, i.e. we toss a coin repeatedly until we get a Head.
Then S = {H, TH, TTH, TTTH, TTTTH, ...}. If a single toss gives a Head with probability 0.4 and a Tail with probability 0.6, then P({H}) = 0.4, P({TH}) = (0.4)(0.6), $P(\{TTH\}) = (0.4)(0.6)^2$, $P(\{TTTH\}) = (0.4)(0.6)^3$, ....
Since {H}, {TH}, {TTH}, {TTTH}, {TTTTH}, ... are disjoint and S = {H} ∪ {TH} ∪ {TTH} ∪ {TTTH} ∪ {TTTTH} ∪ ..., we have
$1 = 0.4 + (0.4)(0.6) + (0.4)(0.6)^2 + (0.4)(0.6)^3 + \cdots$

More Probability Properties
Proposition
For any event A, P(A) + P(A′) = 1, from which P(A) = 1 − P(A′).
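The series in Example 2 can be checked numerically. The following is a minimal sketch, not part of the original slides; the function names `first_head_probability` and `partial_sum` are hypothetical, and exact fractions are used so that rounding does not obscure the pattern.

```python
from fractions import Fraction


def first_head_probability(k, p_head):
    """Probability that the first Head appears on toss k.

    Uses the assignment from Example 2: k - 1 Tails followed by one Head,
    i.e. (1 - p_head)**(k - 1) * p_head.
    """
    return (1 - p_head) ** (k - 1) * p_head


def partial_sum(n_terms, p_head):
    """Sum of the probabilities of the first n_terms outcomes H, TH, TTH, ..."""
    return sum(first_head_probability(k, p_head) for k in range(1, n_terms + 1))


p = Fraction(2, 5)  # 0.4, the value used in Example 2

# The partial sums increase toward 1, as Axioms 2 and 3 require.
for n in (1, 5, 20, 50):
    print(n, float(partial_sum(n, p)))
```

Each partial sum equals $1 - (0.6)^n$, so the total approaches 1 as more outcomes are included, which agrees with P(S) = 1.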
Example 2.13
Consider a system of five identical components connected in series, as illustrated below.
[Figure: five components connected in series]
Denote a component failure by F and success by S. Let A be the event that the system fails. For A to occur, at least one of the individual components must fail. If we know P({F}) = 0.1, then what is P(A)?

Proposition
For any event A, P(A) ≤ 1.

Proposition
For any two events A and B,
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
A Venn Diagram proof:
[Figure: Venn diagram]

Example 2.14
In a certain residential suburb, 60% of all households subscribe to the metropolitan newspaper published in a nearby city, 80% subscribe to the local paper, and 50% of all households subscribe to both papers. If a household is selected at random, what is the probability that it subscribes to (1) at least one of the two newspapers and (2) exactly one of the two newspapers?
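Examples 2.13 and 2.14 can be worked through with the complement rule and the addition rule. The sketch below is illustrative only and its function names are hypothetical. For Example 2.13 it additionally assumes that the five components fail independently of one another, which the slide does not state; without that assumption P(A′) cannot be computed from P({F}) = 0.1 alone.

```python
def series_failure_probability(p_fail, n_components):
    """P(system fails) for components in series, via P(A) = 1 - P(A').

    ASSUMPTION (not stated on the slide): components fail independently,
    so P(all work) = (1 - p_fail) ** n_components.
    """
    p_all_work = (1 - p_fail) ** n_components
    return 1 - p_all_work


def at_least_one(p_a, p_b, p_both):
    """P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both


def exactly_one(p_a, p_b, p_both):
    """P(exactly one of A, B) = P(A or B) - P(A and B)."""
    return at_least_one(p_a, p_b, p_both) - p_both


# Example 2.13: five components, each failing with probability 0.1.
print(round(series_failure_probability(0.1, 5), 5))

# Example 2.14: A = metropolitan paper (0.6), B = local paper (0.8), both (0.5).
print(round(at_least_one(0.6, 0.8, 0.5), 2))
print(round(exactly_one(0.6, 0.8, 0.5), 2))
```

Under the independence assumption, Example 2.13 gives $P(A) = 1 - (0.9)^5 \approx 0.41$. For Example 2.14 the addition rule gives 0.6 + 0.8 − 0.5 = 0.9 for at least one paper, and 0.9 − 0.5 = 0.4 for exactly one.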
Proposition
For any three events A, B, and C,
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(B ∩ C) − P(C ∩ A) + P(A ∩ B ∩ C)
A Venn Diagram interpretation:
[Figure: Venn diagram]
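In place of the diagram, the three-event formula can be checked by direct counting on a small sample space. This is a sketch under stated assumptions: the sample space {1, ..., 12} with equally likely outcomes and the events A, B, C below are hypothetical choices made for illustration and do not come from the slides.

```python
from fractions import Fraction


def prob(event, sample_space):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


# Hypothetical sample space: the integers 1..12, each equally likely.
S = set(range(1, 13))
A = {n for n in S if n % 2 == 0}  # even numbers
B = {n for n in S if n % 3 == 0}  # multiples of 3
C = {n for n in S if n <= 4}      # numbers at most 4

# Left-hand side: probability of the union, counted directly.
lhs = prob(A | B | C, S)

# Right-hand side: the three-event formula, term by term.
rhs = (
    prob(A, S) + prob(B, S) + prob(C, S)
    - prob(A & B, S) - prob(B & C, S) - prob(C & A, S)
    + prob(A & B & C, S)
)

print(lhs, rhs)
```

Here the union contains 9 of the 12 outcomes, and the right-hand side is (6 + 4 + 4 − 2 − 1 − 2 + 0)/12, so both sides equal 3/4.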