Random Variables: Definition, Distribution, Expectation, Variance

CHAPTER 1
RANDOM VARIABLE

INTRODUCTION
Engineering quantities whose variations contain elements of chance are called random variables. Examples of such engineering quantities are weight, force, resistance and length.

SOME EXAMPLES OF ENGINEERING RANDOM VARIABLES
1. The diameter of a motor shaft with a nominal size of 0.2 m.
2. The weight of a steel box used to contain engineering tools.
3. The number of components passing a point on a factory's production line in one minute.
4. The length of time a machine works without failing.
5. The nominal resistance value of resistors.
6. The length of a bridge.
7. The force required to stretch a specific length of metal.
8. The number of bits of a computer.

CLASSIFICATION OF RANDOM VARIABLES
Random variables can be classified into two types: discrete and continuous random variables. All the quantities listed above vary. In 1, 2 and 4, the values vary continuously; that is, they can assume any value within some range. For example, a motor shaft may have any diameter between 0.197 m and 0.203 m, and a steel box can have any weight between 0.345 kg and 0.352 kg. These are examples of continuous variables. The value can only be measured to a certain accuracy, which depends on the measuring device used; the shaft diameter may, for instance, be measured to the nearest tenth of a millimetre.

In 3 and 5, the variables can assume only a limited set of values. For example, the number of components passing a point will be a nonnegative integer 0, 1, 2, . . . The nominal resistance value of a resistor has a limited number of values, which are specified by the manufacturer in its catalogue. Variables such as these, which can assume only a limited (finite or countable) set of values, are called discrete variables.

Definition 1 (Random Variables)
Let S be the sample space associated with some experiment. A random variable X is a function that assigns a real number $X(s)$ to each sample point $s \in S$.

Consider an experiment of tossing two fair coins simultaneously.
Let the number of tails that show up be defined by the random variable X. Each sample point is then assigned a value of the random variable as below:

Sample point   HH   HT   TH   TT
x              0    1    1    2

The random variable X, which counts the number of tails in the experiment, takes the values 0, 1 and 2.

PROBABILITY DISTRIBUTIONS
Discrete Random Variables
A probability distribution of a discrete random variable X is a sequence of the values $x_i$ (i = 1, 2, . . .) of X, together with a probability assigned to each point $x_i$. The number of values $x_i$ may be finite or countably infinite, and they may be listed in any order, though for convenience they are usually listed in increasing order of magnitude.

The probability distribution of a discrete random variable is more often called a probability function (pf) or a probability mass function (pmf), and is denoted by $p(x_i)$ or $P(X = x_i)$. It is the probability that the random variable X assumes the value $x_i$.

Representation of the probability distribution of X:

x_i      x_1      x_2      x_3      ...   x_n
p(x_i)   p(x_1)   p(x_2)   p(x_3)   ...   p(x_n)

For a discrete random variable, the sum of the probabilities is 1, i.e. $\sum_{i} P(X = x_i) = 1$.

Properties of the Probability Mass Function
1. $p(x_i) \ge 0$ for all $x_i$.
2. $\sum_{i} p(x_i) = 1$, where the sum is over all values $x_i$.

Example 1
A discrete function is given by
$p(x) = \begin{cases} \frac{1}{21}(2x + 3), & x = 1, 2, 3 \\ 0, & \text{otherwise} \end{cases}$
Verify that it is a probability function of some variable.

Solution
Clearly $p(x_i) \ge 0$, and
$\sum_{x=1}^{3} \frac{1}{21}(2x + 3) = \frac{1}{21}[(2 + 3) + (4 + 3) + (6 + 3)] = \frac{1}{21}(5 + 7 + 9) = 1$.
Hence the function is a pmf.

Example 2
A discrete random variable has a pmf
$p(x) = \begin{cases} k(x - 1), & x = 3, 4, 5 \\ 0, & \text{otherwise} \end{cases}$
Find the constant k for which p(x) is a probability function.
Solution
Since the function is a pmf,
$\sum_{x=3}^{5} k(x - 1) = k[(3 - 1) + (4 - 1) + (5 - 1)] = 9k = 1$,
from which $k = \frac{1}{9}$.

Continuous Random Variables
The probability distribution of a continuous random variable is more often called a probability density function (pdf), or simply a density function, and is denoted by f(x).

Properties of the Probability Density Function
1. $f(x) \ge 0$ for all x.
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$.

Example 3
Let X be a continuous random variable such that
$f(x) = \begin{cases} \frac{1}{8}x, & 0 \le x \le 4 \\ 0, & \text{elsewhere} \end{cases}$
(a) Show that f(x) is a pdf. (b) Sketch the graph.

Solution
(a) It is clear that $f(x) \ge 0$, so property 1 is satisfied. For f(x) to be a pdf, it must also satisfy property 2. That is,
$\int_{0}^{4} \frac{1}{8}x\,dx = \left[\frac{1}{16}x^2\right]_0^4 = \frac{1}{16}(16) = 1$.
Hence f(x) is a pdf.
(b) The graph of f(x) is the straight line from (0, 0) to (4, 1/2), with f(x) = 0 elsewhere.

Example 4
A random variable X has the pdf
$f(x) = \begin{cases} kx, & 0 \le x \le 4 \\ 0, & \text{elsewhere} \end{cases}$
where k is a constant. (a) Find the constant k. (b) Compute P(2 < X < 3).

Solution
(a) Since the function is a pdf,
$\int_{0}^{4} kx\,dx = \left[\frac{k}{2}x^2\right]_0^4 = \frac{k}{2}(16) = 1$, so $k = \frac{1}{8}$.
(b) $P(2 < X < 3) = \int_{2}^{3} \frac{1}{8}x\,dx = \left[\frac{1}{16}x^2\right]_2^3 = \frac{1}{16}(9) - \frac{1}{16}(4) = \frac{5}{16}$.

CUMULATIVE DISTRIBUTION FUNCTIONS
If X is a random variable and x is any real number, the cumulative distribution function (cdf) of X is the function F defined as the probability that X takes a value less than or equal to x, i.e.
$F(x) = P(X \le x)$ or $F(x) = P(-\infty < X \le x)$.

Discrete Random Variable
Let X be a discrete random variable with probability function p(x). The cumulative distribution function is defined as
$F(x) = \sum_{x_i \le x} p(x_i)$,
i.e.
$F(x) = \begin{cases} 0, & x < x_1 \\ p(x_1), & x_1 \le x < x_2 \\ p(x_1) + p(x_2), & x_2 \le x < x_3 \\ \;\vdots \\ p(x_1) + p(x_2) + \dots + p(x_n) = 1, & x_n \le x \end{cases}$

Continuous Random Variable
Let X be a continuous random variable with probability density function f(x). The cumulative distribution function is defined as
$F(x) = \int_{-\infty}^{x} f(t)\,dt$,
i.e. if f(x) is defined over $a \le x \le b$, then
$F(x) = \begin{cases} 0, & x < a \\ \int_{a}^{x} f(t)\,dt, & a \le x \le b \\ 1, & x > b \end{cases}$

EXPECTATION OF X
Discrete Random Variables
The expectation of X (expected value or mean), written E(X), is given by
$E(X) = \sum_{i} x_i P(X = x_i)$ or $E(X) = \sum_{i} x_i p(x_i)$.
This can simply be expressed in the form
$E(X) = \sum_{i} x_i p_i$, i = 1, 2, …, n, where $p_i = P(X = x_i)$.

Continuous Random Variables
The expectation of X (expected value or mean), written E(X), is given by
$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$.

VARIANCE OF X
For a discrete or continuous random variable X with $\mu = E(X)$, the variance of X, written Var(X), is defined as
$Var(X) = E[(X - \mu)^2]$
$= E(X^2 - 2\mu X + \mu^2)$
$= E(X^2) - 2\mu E(X) + \mu^2$
$= E(X^2) - 2\mu^2 + \mu^2$
$= E(X^2) - \mu^2$.
We may therefore write $Var(X) = E(X^2) - [E(X)]^2 = \sigma^2$.

Example 5
X is the random variable 'the number on a biased die', and the probability distribution of X is as shown.

x          1     2     3     4     5     6
P(X = x)   1/6   1/6   1/5   y     1/5   1/6

Find (a) the value of y, (b) E(X), (c) E(X^2), (d) Var(X), (e) P(X = 1), (f) P(X > 2), (g) P(X ≥ 3), (h) P(1 < X < 5).

Solution
a. $\sum_{i} p(x_i) = 1$, i = 1, 2, …, 6:
$\frac{1}{6} + \frac{1}{6} + \frac{1}{5} + y + \frac{1}{5} + \frac{1}{6} = 1$, so $y = \frac{1}{10}$.

x          1     2     3     4      5     6
P(X = x)   1/6   1/6   1/5   1/10   1/5   1/6

b. $E(X) = \sum x\,P(X = x) = 1\left(\frac{1}{6}\right) + 2\left(\frac{1}{6}\right) + 3\left(\frac{1}{5}\right) + 4\left(\frac{1}{10}\right) + 5\left(\frac{1}{5}\right) + 6\left(\frac{1}{6}\right) = 3.5$
c. $E(X^2) = 1^2\left(\frac{1}{6}\right) + 2^2\left(\frac{1}{6}\right) + 3^2\left(\frac{1}{5}\right) + 4^2\left(\frac{1}{10}\right) + 5^2\left(\frac{1}{5}\right) + 6^2\left(\frac{1}{6}\right) = 15\frac{7}{30}$
d. $Var(X) = E(X^2) - [E(X)]^2 = 15\frac{7}{30} - 3.5^2 = 2\frac{59}{60} \approx 2.98$
e. $P(X = 1) = \frac{1}{6}$
f. $P(X > 2) = P(X = 3) + P(X = 4) + P(X = 5) + P(X = 6) = \frac{1}{5} + \frac{1}{10} + \frac{1}{5} + \frac{1}{6} = \frac{2}{3}$
g. $P(X \ge 3) = P(X = 3) + P(X = 4) + P(X = 5) + P(X = 6) = \frac{1}{5} + \frac{1}{10} + \frac{1}{5} + \frac{1}{6} = \frac{2}{3}$
h. $P(1 < X < 5) = P(X = 2) + P(X = 3) + P(X = 4) = \frac{1}{6} + \frac{1}{5} + \frac{1}{10} = \frac{7}{15}$

Example 6
A discrete engineering random variable has a pmf
$P(X = x) = \begin{cases} \frac{1}{3}(x - 2), & x = 2, 3, 4 \\ 0, & \text{otherwise} \end{cases}$
Calculate the expectation and variance of X.
Solution
$E(X) = \sum_{x=2}^{4} x P(X = x) = \sum_{x=2}^{4} x \cdot \frac{1}{3}(x - 2) = \frac{1}{3}\sum_{x=2}^{4}(x^2 - 2x)$
$= \frac{1}{3}[(2^2 - 2(2)) + (3^2 - 2(3)) + (4^2 - 2(4))] = \frac{11}{3} = 3\frac{2}{3}$
As an exercise, find the variance of X.

Example 7
Refer to Example 4. Find E(X) and Var(X).

Solution
$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_{0}^{4} x \cdot \frac{1}{8}x\,dx = \frac{1}{8}\int_{0}^{4} x^2\,dx = \frac{1}{8}\left[\frac{x^3}{3}\right]_0^4 = \frac{1}{8}\left(\frac{64}{3}\right) = \frac{8}{3}$
$Var(X) = E(X^2) - [E(X)]^2$, so we need to find $E(X^2)$:
$E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx = \int_{0}^{4} x^2 \cdot \frac{1}{8}x\,dx = \frac{1}{8}\int_{0}^{4} x^3\,dx = \frac{1}{8}\left[\frac{x^4}{4}\right]_0^4 = \frac{1}{8}\left(\frac{256}{4}\right) = 8$
Therefore $Var(X) = E(X^2) - [E(X)]^2 = 8 - \left(\frac{8}{3}\right)^2 = \frac{8}{9}$.

Properties of Expectation and Variance of independent random variables
For independent random variables X and Y and constants a and b:
1. $E(X \pm Y) = E(X) \pm E(Y)$
2. $E(aX \pm b) = aE(X) \pm b$
3. $Var(X \pm Y) = Var(X) + Var(Y)$
4. $Var(a) = 0$
5. $Var(aX + b) = a^2 Var(X)$
6. $E(XY) = E(X)E(Y)$

Example 8
Independent random variables X and Y are such that E(X) = 4, E(Y) = 5, Var(X) = 1, Var(Y) = 2. Find
a. E(4X + 2Y)
b. Var(3X + 2Y)

Solution
a. $E(4X + 2Y) = 4E(X) + 2E(Y) = 4(4) + 2(5) = 26$
b. $Var(3X + 2Y) = 3^2 Var(X) + 2^2 Var(Y) = 9(1) + 4(2) = 17$

SOME DISCRETE PROBABILITY DISTRIBUTIONS
Bernoulli process
$p(x) = p^x (1 - p)^{1 - x}$, x = 0 or 1, where $0 < p < 1$ and $q = 1 - p$.
Mean $= E(X) = p$
$Var(X) = p(1 - p) = pq$
The random variable of this experiment is a binary variable which assumes the values 0 and 1.
If x = 0 ⟹ $P(X = 0) = p^0 (1 - p)^1 = q$
If x = 1 ⟹ $P(X = 1) = p^1 (1 - p)^0 = p$

Binomial distribution
The binomial experiment is the generalization of the Bernoulli trial.
Conditions for a binomial distribution
1. A finite number, n, of trials are carried out.
2. The trials are independent and identical. Independent events are events in which the occurrence of one does not affect the other, e.g. successive tosses of a coin.
3. The outcome of each trial is either a success or a failure.
4. The probability of a successful outcome is the same for each trial.
A random variable X that counts the number of successes is then written $X \sim B(n, p)$ or $X \sim Bin(n, p)$.
Note: The number of trials, n, and the probability of success, p, are both needed to describe the distribution completely. They are known as the parameters of the binomial distribution.
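As a rough illustration of the conditions above, the following sketch builds the distribution of the number of successes in n independent, identical Bernoulli trials one trial at a time, using exact fractions. The helper name `successes_distribution` and the choice n = 2, p = 1/3 are assumptions made only for this illustration; they are not part of the notes.

```python
from fractions import Fraction

def successes_distribution(n, p):
    """Distribution of the number of successes in n independent
    Bernoulli(p) trials, built one trial at a time (hypothetical helper)."""
    q = 1 - p
    dist = {0: Fraction(1)}          # before any trial: zero successes
    for _ in range(n):
        new = {}
        for k, prob in dist.items():
            new[k] = new.get(k, 0) + prob * q          # this trial fails
            new[k + 1] = new.get(k + 1, 0) + prob * p  # this trial succeeds
        dist = new
    return dist

dist = successes_distribution(2, Fraction(1, 3))
mean = sum(k * pr for k, pr in dist.items())
print(dist)
print(mean)
```

With n = 2 and p = 1/3 the probabilities of 0, 1 and 2 successes come out as 4/9, 4/9 and 1/9, they sum to 1, and the mean is 2/3, which equals n times p.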
If $X \sim B(n, p)$, the probability of obtaining r successes in n trials is
$P(X = r) = \binom{n}{r} p^r q^{n-r}$, r = 0, 1, 2, 3, …, n,
where $\binom{n}{r} = \frac{n!}{r!(n - r)!}$, $n! = n(n - 1)(n - 2)(n - 3) \cdots 3 \cdot 2 \cdot 1$, and
n = number of trials,
p = probability of success on a single trial,
q = 1 − p = probability of failure on a single trial,
r = number of successes in n trials.
Mean: $\mu = np$
Variance: $\sigma^2 = npq$
Standard deviation: $\sigma = \sqrt{npq}$

Example 9
The random variable X is distributed B(7, 0.2). Find, correct to 3 decimal places,
a. P(X = 3)
b. P(1 < X ≤ 4)
c. P(X > 1)

Solution
p = 0.2, q = 0.8, n = 7
a. $P(X = 3) = \binom{7}{3}(0.2)^3(0.8)^4 = 0.115$
b. $P(1 < X \le 4) = P(X = 2) + P(X = 3) + P(X = 4) = \binom{7}{2}(0.2)^2(0.8)^5 + \binom{7}{3}(0.2)^3(0.8)^4 + \binom{7}{4}(0.2)^4(0.8)^3 = 0.419$
c. $P(X > 1) = P(X = 2) + P(X = 3) + \dots + P(X = 7) = 1 - P(X \le 1) = 1 - [P(X = 0) + P(X = 1)] = 1 - \left[(0.8)^7 + \binom{7}{1}(0.2)(0.8)^6\right] = 0.423$

Example 10
A box contains a large number of spare parts. The probability that a spare part is faulty is 0.1. How many of the parts would you need to select to be more than 95% certain of picking at least one faulty one?

Solution
p = 0.1, q = 0.9, n = ?
$P(X \ge 1) > 0.95$
$P(X \ge 1) = 1 - P(X < 1) = 1 - P(X = 0) = 1 - 0.9^n$
$1 - 0.9^n > 0.95$
$0.9^n < 0.05$
Take logs of both sides: $n \log 0.9 < \log 0.05$. Since log 0.9 is negative, dividing by it reverses the inequality:
$n > 28.4$, so n = 29.
You need to select at least 29 parts.

Example 11
Suppose that a consignment of 300 electrical fuses contains 5% defectives. If a random sample of ten fuses is selected and tested, find the probability of observing at least three defectives.

Solution
X ~ B(10, 0.05)
$P(X \ge 3) = 1 - P(X < 3) = 1 - P(X \le 2) = 1 - [P(X = 0) + P(X = 1) + P(X = 2)]$
$= 1 - \left[0.95^{10} + \binom{10}{1}(0.05)(0.95)^9 + \binom{10}{2}(0.05)^2(0.95)^8\right] = 0.0115$

Example 12
30% of pupils in a school travel to school by bus. From a sample of 10 pupils chosen at random, find the probability that
a. Only three travel by bus.
b. Fewer than half travel by bus.

Solution
p = 0.3, q = 0.7, n = 10, X ~ B(10, 0.3)
a. $P(X = 3) = \binom{10}{3}(0.3)^3(0.7)^7 = 0.267$
b. $P(X < 5) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4)$
$= (0.7)^{10} + \binom{10}{1}(0.3)(0.7)^9 + \binom{10}{2}(0.3)^2(0.7)^8 + \binom{10}{3}(0.3)^3(0.7)^7 + \binom{10}{4}(0.3)^4(0.7)^6 = 0.850$

Example 13
In a survey on washing powder, it is found that the probability that a shopper chooses Soapysuds is 0.25. Find the probability that in a random sample of 9 shoppers
a. Exactly 3 choose Soapysuds.
b.
More than 7 choose Soapysuds.

Solution
p = 0.25, q = 0.75, n = 9
a. $P(X = 3) = \binom{9}{3}(0.25)^3(0.75)^6 = 0.234$
b. $P(X > 7) = P(X = 8) + P(X = 9) = \binom{9}{8}(0.25)^8(0.75) + (0.25)^9 = 0.000107$

Example 14
A bag contains screwdrivers, of which 40% have red handles and the rest yellow. A screwdriver is taken from the bag, its handle colour noted, and then it is replaced. This is performed eight times in all. Calculate the probability that
a. exactly three will be red.
b. at least one will be red.
c. more than four will be yellow.

Solution
Let X be the number of red handles: n = 8, p = 0.4, q = 0.6.
a. $P(X = 3) = \binom{8}{3}(0.4)^3(0.6)^5 = 0.279$
b. $P(X \ge 1) = 1 - P(X < 1) = 1 - P(X = 0) = 1 - (0.6)^8 = 0.983$
c. Let Y be the number of yellow handles: n = 8, p = 0.6, q = 0.4.
$P(Y > 4) = 1 - P(Y \le 4) = P(Y = 5) + P(Y = 6) + P(Y = 7) + P(Y = 8)$
$= \binom{8}{5}(0.6)^5(0.4)^3 + \binom{8}{6}(0.6)^6(0.4)^2 + \binom{8}{7}(0.6)^7(0.4) + (0.6)^8 = 0.594$

Example 15
X ~ B(4, p) and P(X = 4) = 0.0256. Find P(X = 2).

Solution
$P(X = 4) = \binom{4}{4}p^4 = p^4 = 0.0256$, so $p = \sqrt[4]{0.0256} = 0.4$ and q = 0.6.
⇒ $P(X = 2) = \binom{4}{2}(0.4)^2(0.6)^2 = 0.3456$

Example 16
A multiple choice test contains 25 questions, each with four answers. Assume a student just guesses on each question.
a. What is the probability that the student answers more than 20 questions correctly?
b. What is the probability that the student answers fewer than 5 questions correctly?

Solution
n = 25, p = 0.25, q = 0.75
a. $P(X > 20) = P(X = 21) + P(X = 22) + P(X = 23) + P(X = 24) + P(X = 25)$
$= \binom{25}{21}(0.25)^{21}(0.75)^4 + \binom{25}{22}(0.25)^{22}(0.75)^3 + \binom{25}{23}(0.25)^{23}(0.75)^2 + \binom{25}{24}(0.25)^{24}(0.75) + (0.25)^{25} \approx 0$ (zero to four decimal places)
b. $P(X < 5) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) = \sum_{r=0}^{4}\binom{25}{r}(0.25)^r(0.75)^{25-r} \approx 0.214$

Example 17
Suppose an experiment has two outcomes. The probability of success is 1/3 and the probability of failure is 2/3. What is the probability of obtaining one success in two trials?

Solution
Outcome       SS    SF    FS    FF
Probability   1/9   2/9   2/9   4/9

$P[X = 1] = \frac{2}{9} + \frac{2}{9} = \frac{4}{9}$
Alternatively, using the binomial formula with n = 2, S = success, F = failure, $p = \frac{1}{3}$, $q = \frac{2}{3}$ and p + q = 1:
$P[X = 1] = \binom{2}{1}\left(\frac{1}{3}\right)\left(\frac{2}{3}\right) = 2 \times \frac{1}{3} \times \frac{2}{3} = \frac{4}{9}$

Example 18
In a game of chance, you play by rolling a fair die four times and you count the number of results which are 6s.
i. What is the probability that in one play of the game you obtain exactly three 6s?
ii. What is the probability that in one play of the game you obtain exactly two 5s?

Solution
For a fair die, the probability of any specified face on a single roll is 1/6, so n = 4, p = 1/6, q = 5/6 in both parts.
i. $P[X = 3] = \binom{4}{3}\left(\frac{1}{6}\right)^3\left(\frac{5}{6}\right) = \frac{20}{1296} \approx 0.0154$
ii. $P[X = 2] = \binom{4}{2}\left(\frac{1}{6}\right)^2\left(\frac{5}{6}\right)^2 = \frac{150}{1296} \approx 0.1157$

Example 19
Of the telephone calls received by an airline reservation agent, 60% are requests for information and 40% are to make reservations. Assume the calls can be viewed as Bernoulli trials, with success defined to be a call for a reservation. Six calls are received.
1. What is the probability that exactly 2 calls are for reservations?
2. What is the probability that at least 4 are for information?

Solution
1. Let X be the number of reservation calls: n = 6, p = 0.4, q = 0.6.
$P(X = 2) = \binom{6}{2}(0.4)^2(0.6)^4 = 0.3110$
2. Let Y be the number of information calls: n = 6, p = 0.6, q = 0.4.
$P(Y \ge 4) = P(Y = 4) + P(Y = 5) + P(Y = 6) = \binom{6}{4}(0.6)^4(0.4)^2 + \binom{6}{5}(0.6)^5(0.4) + (0.6)^6 = 0.5443$
(The value 0.1792 is the probability that at least 4 of the six calls are for reservations, i.e. the same sum with p = 0.4.)

Example 20
The probability that it will be a fine day is 0.4. Find the expected number of fine days in a week and also the standard deviation.

Solution
The expected number of fine days $= E(X) = np = 7 \times 0.4 = 2.8$
Standard deviation of X $= \sqrt{Var(X)} = \sqrt{npq} = \sqrt{7 \times 0.4 \times 0.6} = 1.3$ days

Question 1
A used car saleswoman estimates that each time she shows a customer a car, there is a probability of 0.1 that the customer will buy the car. The saleswoman would like to sell at least one car per week. If showing a car is a Bernoulli trial, how many cars should the saleswoman show per week so that the probability of at least one sale is 0.95? Ans: 29

Question 2
The random variable X is B(n, 0.3) and E(X) = 2.4. Find n and the standard deviation of X. Ans: n = 8, s.d. = 1.3

Question 3
In a group of people the expected number who wear glasses is 2 and the variance is 1.6. Find the probability that
a. A person chosen at random from the group wears glasses.
b. 6 people in the group wear glasses.

Poisson distribution
Conditions for a Poisson model
1. Events occur singly and at random in a given interval of time or space.
2. The parameter λ (λ > 0), the mean number of occurrences in the given interval, is known and finite (i.e. it is the occurrence rate per unit interval).
The variable X is the number of occurrences in the given interval. We write $X \sim Po(\lambda)$, with
$P(X = x) = \frac{e^{-\lambda}\lambda^x}{x!}$, x = 0, 1, 2, 3, …
$E(X) = Var(X) = \lambda$

Typical examples of random variables for which the Poisson probability distribution provides a good model are:
1. The number of traffic accidents per month at a busy intersection.
2. The number of death claims received per day by an insurance company.
3. The number of unscheduled admissions per day to a hospital.
The Poisson distribution is used to model the number of occurrences of a random event in a given period of time (or region of space).

Example 21
A student finds that the average number of amoebas in 10 ml of pond water from a particular pond is 4. Assuming that the number of amoebas follows a Poisson distribution, find the probability that in a 10 ml sample
a. There are exactly 5 amoebas.
b. There are no amoebas.
c. There are fewer than three amoebas.

Solution
X is the number of amoebas in 10 ml of pond water, where X ~ Po(4), λ = 4.
a. $P(X = 5) = \frac{e^{-4}4^5}{5!} = 0.156$
b. $P(X = 0) = \frac{e^{-4}4^0}{0!} = e^{-4} = 0.0183$
c. $P(X < 3) = P(X = 0) + P(X = 1) + P(X = 2) = \frac{e^{-4}4^0}{0!} + \frac{e^{-4}4^1}{1!} + \frac{e^{-4}4^2}{2!} = 0.238$

Note: Unit interval
For this example, the mean number of amoebas in 10 ml of pond water from a particular pond is four, so the number in 10 ml is distributed Po(4). Now suppose you want to find a probability relating to the number of amoebas in 5 ml of water from the same pond. The mean number of amoebas in 5 ml is two, so the number in 5 ml is distributed Po(2).

Example 22
On average the school photocopier breaks down eight times during the school week (Mon-Fri). Assuming that the number of breakdowns can be modelled by a Poisson distribution, find the probability that it breaks down
a. Five times in a given week.
b. Once on Monday.
c. Eight times in a fortnight.

Solution
a. X is the number of breakdowns in a week, where X ~ Po(8).
$P(X = 5) = \frac{e^{-8}8^5}{5!} = 0.0916$
b. Let B be the number of breakdowns in a day. The mean number of breakdowns in a day is 8/5 = 1.6, so B ~ Po(1.6).
$P(B = 1) = \frac{e^{-1.6}(1.6)^1}{1!} = 1.6e^{-1.6} = 0.323$
c. Let Y be the number of breakdowns in a fortnight. The mean number of breakdowns in a fortnight is 16, so Y ~ Po(16).
$P(Y = 8) = \frac{e^{-16}16^8}{8!} = 0.0120$
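The unit-interval idea used in Example 22 can be checked numerically. The sketch below is only an illustration: the helper name `poisson_pmf` and the variable names are assumptions, and the only inputs taken from the notes are the weekly mean of 8 breakdowns and the five-day school week.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam): e^(-lam) * lam^x / x!"""
    return math.exp(-lam) * lam ** x / math.factorial(x)

weekly_mean = 8                      # breakdowns per five-day week (Example 22)

p_week = poisson_pmf(5, weekly_mean)           # a. five times in a week
p_monday = poisson_pmf(1, weekly_mean / 5)     # b. once on a single day
p_fortnight = poisson_pmf(8, weekly_mean * 2)  # c. eight times in two weeks

print(round(p_week, 4), round(p_monday, 3), round(p_fortnight, 4))
```

The same function is reused for all three parts; only the mean is rescaled to match the length of the interval, which is the point of the unit-interval note.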
Example 23
X follows a Poisson distribution with standard deviation 1.5. Find P(X ≥ 3).

Solution
If X ~ Po(λ), then $Var(X) = \lambda = (1.5)^2 = 2.25$, so X ~ Po(2.25).
$P(X \ge 3) = 1 - P(X < 3) = 1 - [P(X = 0) + P(X = 1) + P(X = 2)]$
$= 1 - \left[\frac{e^{-2.25}(2.25)^0}{0!} + \frac{e^{-2.25}(2.25)^1}{1!} + \frac{e^{-2.25}(2.25)^2}{2!}\right] = 0.391$

Example 24
An insurance company receives on average two claims per week from a particular factory. Assuming that the number of claims can be modelled by a Poisson distribution, find the probability that it receives
a. 3 claims in a given week.
b. More than four claims in a given week.
c. Four claims in a given fortnight.
d. No claims on a given day, assuming that the factory operates on a five-day week.

Solution
Let X be the number of claims per week, X ~ Po(2).
a. $P(X = 3) = \frac{e^{-2}2^3}{3!} = 0.180$
b. $P(X > 4) = 1 - P(X \le 4) = 1 - [P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4)]$
$= 1 - e^{-2}\left[1 + 2 + \frac{2^2}{2!} + \frac{2^3}{3!} + \frac{2^4}{4!}\right] = 0.053$
c. Let Y be the number of claims in a fortnight. The mean number of claims in a fortnight is 4, so Y ~ Po(4).
$P(Y = 4) = \frac{e^{-4}4^4}{4!} = 0.195$
d. Let F be the number of claims on a given day. The mean number of claims in a day is 2/5 = 0.4, so F ~ Po(0.4).
$P(F = 0) = \frac{e^{-0.4}(0.4)^0}{0!} = e^{-0.4} = 0.670$

Example 25
A sales manager receives six telephone calls on average between 9:30am and 10:30am on a weekday. Find the probability that
a. She will receive two or more calls between 9:30am and 10:30am on Tuesday.
b. She will receive exactly two calls between 9:30am and 9:40am on Wednesday.

Solution
a. Let X be the number of telephone calls received by the manager between 9:30am and 10:30am, X ~ Po(6).
$P(X \ge 2) = 1 - P(X < 2) = 1 - [P(X = 0) + P(X = 1)] = 1 - \left[\frac{e^{-6}6^0}{0!} + \frac{e^{-6}6^1}{1!}\right] = 1 - e^{-6}[1 + 6] = 0.983$
b. Let Y be the number of telephone calls received by the manager between 9:30am and 9:40am on Wednesday. The mean number of calls received in this ten-minute interval is 6/6 = 1, so Y ~ Po(1).
$P(Y = 2) = \frac{e^{-1}1^2}{2!} = 0.1839$

Example 26
The number of bacterial colonies on a petri dish can be modelled by a Poisson distribution with average number 2.5 per cm². Find the probability that
a. In 1 cm² there are no bacterial colonies.
b.
In 2 cm² there are more than four bacterial colonies.
c. In 4 cm² there are six bacterial colonies.

Solution
a. Let X be the number of bacterial colonies on 1 cm² of the petri dish, X ~ Po(2.5).
$P(X = 0) = \frac{e^{-2.5}(2.5)^0}{0!} = e^{-2.5} = 0.082$
b. Let Y be the number of bacterial colonies on 2 cm² of the petri dish. The mean number is 5, so Y ~ Po(5).
$P(Y > 4) = 1 - P(Y \le 4) = 1 - [P(Y = 0) + P(Y = 1) + P(Y = 2) + P(Y = 3) + P(Y = 4)]$
$= 1 - e^{-5}\left[1 + 5 + \frac{5^2}{2!} + \frac{5^3}{3!} + \frac{5^4}{4!}\right] = 0.559$
c. Let B be the number of bacterial colonies on 4 cm² of the petri dish, B ~ Po(10).
$P(B = 6) = \frac{e^{-10}10^6}{6!} = 0.063$

Example 27
Customers walk into a store at an average rate of 20 per hour. Find the probability that
a. No customers have arrived at the store in 10 min.
b. No more than 4 customers have arrived at the store in 30 min.

Solution
a. Let X be the number of customers arriving at the store in 10 min, with mean 20/6 = 10/3, so X ~ Po(10/3).
$P(X = 0) = \frac{e^{-10/3}(10/3)^0}{0!} = e^{-10/3} = 0.0357$
b. Let Y be the number of customers arriving at the store in 30 min, with mean 10, so Y ~ Po(10).
$P(Y \le 4) = P(Y = 0) + P(Y = 1) + P(Y = 2) + P(Y = 3) + P(Y = 4) = e^{-10}\left[1 + 10 + \frac{10^2}{2!} + \frac{10^3}{3!} + \frac{10^4}{4!}\right] = 0.029$

Example 28
The average number of misprints on each page in the first draft of a novel is four. Find the probability that on a randomly selected double page
a. There are three misprints on each page.
b. There are six misprints in total.

Solution
a. For a single page λ = 4, and the two pages are independent, so
$[P(X = 3)]^2 = \left[\frac{e^{-4}4^3}{3!}\right]^2 = 0.0382$
b. For the double page λ = 4 × 2 = 8, so
$P(Y = 6) = \frac{e^{-8}8^6}{6!} = 0.122$

Using the Poisson distribution as an approximation to the binomial distribution
When n is large (n > 50) and p is small (p < 0.1), the distribution X ~ B(n, p) can be approximated using a Poisson distribution with the same mean, i.e. X ~ Po(np). The approximation gets better as n gets larger and p gets smaller.

Example 29
Eggs are packed into boxes of 500. On average 0.7% of the eggs are found to be broken when the eggs are unpacked. Find, correct to 2 significant figures, the probability that in a box of 500 eggs
a. exactly three are broken
b. at least two are broken

Solution
Given X ~ B(n, p) with n = 500 and p = 0.007, E(X) = np = 500 × 0.007 = 3.5.
Since n > 50 and p < 0.1, use a Poisson approximation: X ~ Po(3.5).
a. $P(X = 3) = \frac{e^{-3.5}(3.5)^3}{3!} = 0.22$
b. $P(X \ge 2) = 1 - [P(X = 0) + P(X = 1)] = 1 - (e^{-3.5} + 3.5e^{-3.5}) = 0.86$

Example 30
A Christmas draw aims to sell 5000 tickets, 50 of which will win a prize.
a. Let X be the number of winning tickets among 200 tickets bought. Calculate P(X ≤ 3).
b. Calculate how many tickets should be bought in order for there to be a 90% probability of winning at least one prize.

Solution
a. P(a ticket wins a prize) = 50/5000 = 0.01, so X ~ B(200, 0.01). Since n > 50 and p < 0.1, approximate with X ~ Po(2):
$P(X \le 3) = e^{-2}\left[1 + 2 + \frac{2^2}{2!} + \frac{2^3}{3!}\right] = 0.86$
b. If n tickets are bought, X ~ B(n, 0.01). Assuming n > 50, with p < 0.1, use X ~ Po(0.01n).
$P(X \ge 1) = 1 - P(X = 0) = 1 - e^{-0.01n} = 0.9$
$e^{-0.01n} = 0.1$
$-0.01n = \ln(0.1)$
$n = \frac{\ln(0.1)}{-0.01} = 230.26$
So the least integer value of n must be 231. Check:
n = 230: np = 230 × 0.01 = 2.3, $1 - e^{-2.3} = 0.8997 < 0.9$
n = 231: np = 231 × 0.01 = 2.31, $1 - e^{-2.31} = 0.9007 > 0.9$

Example 31
X is B(250, p). The value of p is such that it is valid to apply a Poisson approximation. When this is done, it is found that P(X = 0) = 0.0235. Find the value of p.

Solution
E(X) = np = 250p, so X ~ Po(250p) approximately.
$P(X = 0) = \frac{e^{-250p}(250p)^0}{0!} = e^{-250p} = 0.0235$
$-250p = \ln(0.0235)$
$p = 0.0150$

The sum of independent Poisson variables
For independent variables X and Y, if $X \sim Po(\lambda_1)$ and $Y \sim Po(\lambda_2)$, then $X + Y \sim Po(\lambda_1 + \lambda_2)$.

Example 32
Two identical racing cars are being tested on a circuit. For each, the number of mechanical breakdowns can be modelled by a Poisson distribution with a mean of one breakdown in 100 laps. If a car breaks down, it is attended to and continues on the circuit. The first car is tested for 20 laps and the second car for 40 laps. Find the probability that the service team is called out to attend to breakdowns
a. Once
b. More than twice

Solution
Let X and Y be the numbers of breakdowns of the first and second car: X ~ Po(0.2), Y ~ Po(0.4). Then T = X + Y ~ Po(0.6).
a. $P(T = 1) = 0.6e^{-0.6} = 0.329$
b. $P(T > 2) = 1 - [P(T = 0) + P(T = 1) + P(T = 2)] = 1 - e^{-0.6}\left[1 + 0.6 + \frac{(0.6)^2}{2!}\right] = 0.023$

SOME CONTINUOUS DISTRIBUTIONS
UNIFORM DISTRIBUTION
The continuous random variable X is said to have the uniform distribution over the interval [a, b] if its probability density function satisfies
$f(x) = \frac{1}{b - a}$, $a \le x \le b$ (and f(x) = 0 elsewhere).

Question 1
Suppose that buses arrive at a bus stop every 15 minutes and that the waiting time X for the bus to arrive has a uniform probability distribution on the interval from 0 to 15 minutes.
a.
What is the probability that the waiting time X will exceed 10 minutes?
b. What is the probability that X will be at most 12 minutes?

Question 2
The time X required to complete an assembly operation is uniformly distributed for 30 ≤ x ≤ 40 seconds. Determine the proportion of assemblies that require less than 35 seconds to complete.

EXPONENTIAL DISTRIBUTION
The exponential distribution, or more precisely the negative exponential distribution, is used to model the time required to observe the first occurrence of an event of a specified type when events of this type are occurring randomly at a mean rate λ per unit time. Examples of such phenomena are the time until a piece of equipment fails, the time it takes to complete a job, etc.

The continuous random variable X is said to have the exponential distribution with positive parameter λ if its probability density function is given by
$f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & \text{elsewhere} \end{cases}$

Example 33
Suppose that the time until first failure of an electric component is an exponential random variable with a rate of λ = 0.0625 per month. Calculate the probability that the device lasts
(a) not more than 45 months
(b) longer than 40 months

Solution
Let X denote the random variable that measures the time until first failure, with λ = 0.0625.
(a) $P(X \le 45) = \int_{0}^{45} 0.0625e^{-0.0625x}\,dx = 1 - e^{-(0.0625)(45)} = 0.9399$
(b) $P(X > 40) = 1 - P(X \le 40) = 1 - \int_{0}^{40} 0.0625e^{-0.0625x}\,dx = 1 - (1 - e^{-(0.0625)(40)}) = 0.0821$

NOTE:
1. The cumulative distribution function of the exponential distribution is given by $F(x) = P(X \le x) = 1 - e^{-\lambda x}$, $x \ge 0$.
2. If X has an exponential distribution with parameter λ, then $E(X) = \frac{1}{\lambda}$ and $Var(X) = \frac{1}{\lambda^2}$.
3. If X has the exponential distribution and $0 \le a \le b$, then $P(a \le X \le b) = e^{-\lambda a} - e^{-\lambda b}$.

Example 34
Suppose particles arrive independently at a counter at an average rate of three per second.
What is the probability that a particle will arrive
(a) within one second
(b) after two seconds

Solution
Let X be the time in seconds until the next particle arrives, with λ = 3.
(a) $P(X \le 1) = F(1) = 1 - e^{-3(1)} = 0.9502$
(b) $P(X > 2) = 1 - F(2) = 1 - (1 - e^{-3(2)}) = e^{-6} = 0.0025$

Question 1
Suppose the time in days between service calls on a photocopier machine follows an exponential distribution with a mean call rate of 0.02 per day.
a. What is the probability that the time until the machine again requires service exceeds 60 days? Ans: 0.3012
b. What is the probability that the time until the machine again requires service is less than 20 days? Ans: 0.3297

Question 2
The lifetime of a mechanical assembly in a vibration test is exponentially distributed with a mean of 400 hours. What is the probability that
a. An assembly on test fails in less than 100 hours.
b. An assembly operates for more than 500 hours before failure.
c. An assembly on test fails in at most 200 hours.

NORMAL DISTRIBUTION
1. Standardize a normal variable and use standard normal tables.
2. Use the normal distribution as a model to solve problems.
3. Use the normal distribution as an approximation to the binomial distribution and to the Poisson distribution.

The normal distribution is one of the most important distributions in statistics. Many measured quantities in the natural sciences follow a normal distribution, and under certain circumstances it is also a useful approximation to the binomial distribution and to the Poisson distribution.

The normal variable X is continuous. Its probability density function f(x) depends on its mean μ and standard deviation σ, where
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x - \mu)^2}{2\sigma^2}}$, $-\infty < x < \infty$.
It can be described as $X \sim N(\mu, \sigma^2)$.

Finding probabilities
The probability that X lies between a and b is written P(a < X < b). To find this probability, you need to find the area under the normal curve between a and b.
One way of finding areas is to integrate, but since the normal density function is complicated and very difficult to integrate, tables are used instead.

The standard normal variable, Z
In order to use the same set of tables for all possible values of μ and σ, the variable X is standardised so that the mean is 0 and the standard deviation is 1. Notice that since the variance is the square of the standard deviation, the variance is also 1. This standardised normal variable is called Z, and Z ~ N(0, 1).

In general, to standardise X, where $X \sim N(\mu, \sigma^2)$:
1. Subtract the mean μ.
2. Then divide by the standard deviation σ.
Therefore $Z = \frac{X - \mu}{\sigma}$, where Z ~ N(0, 1).

Finding probabilities for Z from the standard table
Example 35
Find the following probabilities using the standard normal tables.
i) P(Z < 0.85) ii) P(Z > 0.85) iii) P(Z < -1.38) iv) P(Z > -1.38)

Solution
i) 0.8023
ii) 1 – 0.8023 = 0.1977
iii) 0.0838
iv) 1 – 0.0838 = 0.9162

Example 36
Find the following:
a) P(0.35 < Z < 1.76) b) P(-2.70 < Z < 1.87) c) P(|Z| < 1.43) d) P(|Z| > 1.43)

Solution
a) $\Phi(1.76) - \Phi(0.35) = 0.9608 - 0.6368 = 0.3240$
b) $\Phi(1.87) - \Phi(-2.70) = 0.9693 - 0.0035 = 0.9658$
c) $P(|Z| < 1.43) = P(-1.43 < Z < 1.43) = \Phi(1.43) - \Phi(-1.43) = 0.9236 - 0.0764 = 0.8472$
OR $2\Phi(1.43) - 1 = 2(0.9236) - 1 = 0.8472$
d) $P(|Z| > 1.43) = P(Z < -1.43) + P(Z > 1.43) = 2[1 - \Phi(1.43)] = 2(1 - 0.9236) = 0.1528$

For Z ~ N(0, 1), two results worth remembering are
P(−1.96 < Z < 1.96) = 0.95
P(−2.58 < Z < 2.58) = 0.99

Using standard normal tables for any normal variable
Any normal variable X given as $X \sim N(\mu, \sigma^2)$ can be standardised as $Z = \frac{X - \mu}{\sigma}$, where Z ~ N(0, 1).

Example 37
Lengths of metal strips produced by a machine are normally distributed with a mean length of 150 cm and a standard deviation of 10 cm. Find the probability that the length of a randomly selected strip is
a) Shorter than 165 cm
b) Within 5 cm of the mean

Solution
a. X is the length; μ = 150; σ = 10; X ~ N(150, 10²).
To find the probability that the length is shorter than 165 cm, i.e. P(X < 165), standardise:
$Z = \frac{X - 150}{10} = \frac{165 - 150}{10} = 1.5$
So $P(X < 165) = P(Z < 1.5) = \Phi(1.5) = 0.9332$
b.
To find the probability that the length is within 5 cm of the mean, you need to find
$P(|X - 150| < 5) = P(-5 < X - 150 < 5) = P\left(\frac{-5}{10} < \frac{X - 150}{10} < \frac{5}{10}\right) = P(-0.5 < Z < 0.5)$
$P(|Z| < 0.5) = 2\Phi(0.5) - 1 = 2 \times 0.6915 - 1 = 0.383$
The probability that the length is within 5 cm of the mean is 0.383.

Example 38
The time taken by a milkman to deliver to the high street is normally distributed with a mean of 12 minutes and a standard deviation of 2 minutes. He delivers milk every day. Estimate the number of days during the year when he takes
a. Longer than 17 minutes
b. Less than ten minutes
c. Between nine and 13 minutes

Solution
X is the time, in minutes, taken to deliver milk to the high street, X ~ N(12, 2²). Standardise X using $Z = \frac{X - 12}{2}$.
a. $P(X > 17) = P\left(Z > \frac{17 - 12}{2}\right) = P(Z > 2.5) = 1 - \Phi(2.5) = 1 - 0.9938 = 0.0062$
To find the number of days, multiply by 365: 365 × 0.0062 = 2.263 ≈ 2.
On two days in the year he takes longer than 17 minutes.
b. $P(X < 10) = P\left(Z < \frac{10 - 12}{2}\right) = P(Z < -1) = 1 - 0.8413 = 0.1587$
Now 365 × 0.1587 = 57.9 ≈ 58.
On 58 days in the year he takes less than ten minutes.
c. $P(9 < X < 13) = P\left(\frac{9 - 12}{2} < Z < \frac{13 - 12}{2}\right) = P(-1.5 < Z < 0.5) = \Phi(0.5) - \Phi(-1.5) = 0.6915 - 0.0668 = 0.6247$
Now 365 × 0.6247 = 228.0 ≈ 228.
On 228 days in the year he takes between nine and 13 minutes.

NB: Since X is a continuous variable, the following are indistinguishable:
9 < X < 13, 9 ≤ X < 13, 9 ≤ X ≤ 13, 9 < X ≤ 13

Question 7
The masses of packages from a particular machine are normally distributed with a mean of 200 g and a standard deviation of 2 g. Find the probability that a randomly selected package from the machine weighs
a) Less than 197 g
b) More than 200.5 g
c) Between 198.5 g and 199.5 g
Ans: a. 0.0668 b. 0.4013 c. 0.1747

Question 8
The heights of boys at a particular age follow a normal distribution with mean 150.3 cm and variance 25 cm².
Find the probability that a boy picked at random from this age group has height
a) Less than 153 cm
b) More than 158 cm
c) Between 150 cm and 158 cm
d) More than 10 cm different from the mean height

Question 9
The masses of a certain type of cabbage are normally distributed with a mean of 1000 g and a standard deviation of 0.15 kg. In a batch of 800 cabbages, estimate how many have a mass between 750 g and 1290 g.

Example 39
Using the standard normal tables in reverse to find z when Φ(z) is known.
Z ~ N(0, 1). Find the value of a if
a) P(Z < a) = 0.9693
b) P(Z > a) = 0.3802
c) P(Z > a) = 0.7367
d) P(Z < a) = 0.0793
e) P(|Z| < a) = 0.9

Solution
a) P(Z < a) = 0.9693
$\Phi(a) = 0.9693$
$a = \Phi^{-1}(0.9693) = 1.87$
b) P(Z > a) = 0.3802
$1 - \Phi(a) = 0.3802$
$\Phi(a) = 1 - 0.3802 = 0.6198$
$a = \Phi^{-1}(0.6198)$, which lies between 0.30 and 0.31, so a = 0.305
Solve c) and d) as an exercise. Answers: c) a = -0.633, d) a = -1.41
e) P(|Z| < a) = 0.9
P(−a < Z < a) = 0.9
$2\Phi(a) - 1 = 0.9$
$2\Phi(a) = 1.9$
$\Phi(a) = 0.95$
$a = \Phi^{-1}(0.950)$, which lies between 1.64 and 1.65, so a ≈ 1.645

Using the table in reverse for any normal variable X
Example 40
Bags of flour packed by a particular machine have masses which are normally distributed with mean 500 g and standard deviation 20 g. 2% of the bags are rejected for being underweight and 1% of the bags are rejected for being overweight. Between what range of values should the mass of a bag of flour lie if it is to be accepted?

Solution
Given μ = 500 and σ = 20. Let $x_L$ and $x_U$ be the lower and upper acceptance limits.
Lower limit: $P(X < x_L) = 0.02$, so $\Phi(z) = 0.02$ and $z = \Phi^{-1}(0.02) = -2.054$.
$\frac{x_L - 500}{20} = -2.054$, so $x_L - 500 = 20(-2.054) = -41.1$ and $x_L = 458.9$.
Upper limit: $P(X > x_U) = 0.01$, so $1 - \Phi(z) = 0.01$, $\Phi(z) = 1 - 0.01 = 0.99$ and $z = 2.326$.
$\frac{x_U - 500}{20} = 2.326$, so $x_U - 500 = 2.326(20) = 46.5$ and $x_U = 546.5$.
A bag is accepted if its mass lies between about 458.9 g and 546.5 g.

For a random variable X ~ N(400, 8²):
a) Find the limits within which the central 95% of the distribution lies.
b) Find the interquartile range of the distribution.

Solution
a) X ~ N(400, 8²)
P(|Z| < z) = 0.95
$2\Phi(z) - 1 = 0.95$
$2\Phi(z) = 1 + 0.95$
$\Phi(z) = \frac{1.95}{2} = 0.975$
$z = \Phi^{-1}(0.975)$, giving z = ±1.96
$\frac{x - 400}{8} = \pm 1.96$
$x - 400 = (\pm 1.96)(8)$
The limits are (384.32, 415.68).
b) The interquartile range encloses the central 50% of the distribution, between the lower quartile $Q_1$ and the upper quartile $Q_3$.
Upper quartile: $\Phi(z) = 0.75$, so $z = 0.67$. Then $\frac{Q_3 - 400}{8} = 0.67$, so $Q_3 - 400 = (0.67)(8) = 5.36$ and $Q_3 = 405.36$.
Lower quartile: $\Phi(z) = 0.25$, so $z = \Phi^{-1}(0.25) = -0.67$. Then $\frac{Q_1 - 400}{8} = -0.67$, so $Q_1 - 400 = (-0.67)(8) = -5.36$ and $Q_1 = 394.64$.
The quartiles are (394.64, 405.36), so the interquartile range is $405.36 - 394.64 = 10.72$.

Question 9
A sample of 100 apples is taken from a load. The apples have the following distribution of sizes:

Diameter to nearest cm:   6    7    8    9    10
Frequency:               11   21   38   17   13

Assume that the distribution is approximately normal with this mean and this standard deviation. Find the range of sizes of apples for packing, if 5% are to be rejected as too small and 5% are to be rejected as too large.
(Ans: $\bar{x} = 8$, $s = 1.158 \approx 1.16$; range 6.10 to 9.90)

Question
The lengths of metal strips are normally distributed with a mean of 120 cm and a standard deviation of 10 cm. Find the probability that a strip selected at random has length
a) greater than 105 cm
b) within 5 cm of the mean.
Strips that are shorter than L cm are rejected. Estimate the value of L, correct to one decimal place, if 9% of all strips are rejected. In a sample of 500 strips, estimate the number having a length over 126 cm.
(Ans: b) 0.383; L = 106.6; 137)

Question 10
Batteries for a transistor radio have a mean life under normal usage of 160 hours, with a standard deviation of 30 hours. Assume the battery life follows a normal distribution.
a) Calculate the percentage of batteries which have a life between 150 hours and 180 hours. Ans: 37.8%
b) Calculate the range, symmetrical about the mean, within which 75% of the battery lives lie. Ans: 125.5 to 194.5
If a radio takes four of these batteries and requires all of them to be working, calculate
c) the probability that the radio will run for at least 135 hours. Ans: 0.405

Question 11
The numbers of shirts sold in a week by the world's largest menswear store are normally distributed with a mean of 2080 and a standard deviation of 50. Estimate
a) the probability that in a given week fewer than 2000 shirts are sold (Ans: 0.0548)
b) the number of weeks in a year in which between 2060 and 2130 shirts are sold
(Ans: 26)
c) the interquartile range of the distribution (Ans: 67.4)
d) the least number n of shirts such that the probability that more than n are sold in a given week is less than 0.02 (Ans: 2183)

Finding the Values of $\mu$ or $\sigma$ or Both

Example 41
The random variable X is distributed $N(\mu, \sigma^2)$ with $\sigma^2 = 25$. If $P(X < 27.5) = 0.3085$, find the value of $\mu$.

Solution
$P(X < 27.5) = P\left(Z < \frac{27.5 - \mu}{5}\right) = 0.3085$
$\Phi(z) = 0.3085$, so $z = \Phi^{-1}(0.3085) = -0.5$
$\frac{27.5 - \mu}{5} = -0.5$
$27.5 - \mu = 5(-0.5) = -2.5$
$\mu = 30$

Example 42
The random variable X is normally distributed with a mean of 45. The probability that X is greater than 51 is 0.288. Find the standard deviation of the distribution.

Solution
$P(X > 51) = P\left(Z > \frac{51 - 45}{\sigma}\right) = P\left(Z > \frac{6}{\sigma}\right) = 0.288$
$1 - \Phi\left(\frac{6}{\sigma}\right) = 0.288$, so $\Phi\left(\frac{6}{\sigma}\right) = 1 - 0.288 = 0.712$
$\frac{6}{\sigma} = 0.56$
$\sigma = \frac{6}{0.56} = 10.71$

Example 43
The random variable X is distributed $N(\mu, \sigma^2)$. $P(X > 80) = 0.0113$ and $P(X < 30) = 0.0287$. Find $\mu$ and $\sigma$.

Solution
$P(X > 80) = P\left(Z > \frac{80 - \mu}{\sigma}\right) = 0.0113$, so $1 - \Phi(z) = 0.0113$, $\Phi(z) = 0.9887$ and $z = \Phi^{-1}(0.9887) = 2.28$. Hence
$\frac{80 - \mu}{\sigma} = 2.28$, that is, $\mu + 2.28\sigma = 80$ ... (1)

$P(X < 30) = P\left(Z < \frac{30 - \mu}{\sigma}\right) = 0.0287$, so $\Phi(z) = 0.0287$ and $z = \Phi^{-1}(0.0287) = -1.90$. Hence
$\frac{30 - \mu}{\sigma} = -1.90$, that is, $\mu - 1.90\sigma = 30$ ... (2)

Subtracting (2) from (1) gives $4.18\sigma = 50$, so $\sigma = 11.96$ and $\mu = 52.73$.

Question 12
The masses of boxes of apples are normally distributed such that 20% of them are greater than 5.08 kg and 15% are greater than 5.62 kg. Estimate the mean and standard deviation of the masses. (Ans: 2.74, 2.78)

Question 13
A farmer cuts hazel twigs to make into bean poles to sell at the market. He says that a stick is 240 cm long. In fact the lengths of the sticks are normally distributed; 55% are over 240 cm long and 10% are over 250 cm long. Find the following:
a) …
b) the probability that a randomly selected stick is shorter than 235 cm. (Ans: 0.203)

Question 14
Tea is sold in packages marked 750 g. The masses of the packages are normally distributed with a mean of 760 g. It is known that less than 1% of the packages are underweight. What is the maximum value of the standard deviation of the distribution?
(Ans: 4.299 g)

Question 15
The random variable X is normally distributed. The probability that X is less than 53 is 0.04 and the probability that X is less than 65 is 0.97. Find the interquartile range of the distribution. (Ans: 4.46)

CHAPTER TWO
PROBABILITY CALCULUS

CONCEPT OF PROBABILITY
Probability is a measure of how likely it is that something will occur. To talk about probability we need an experiment. An experiment is any action whose outcomes are recorded as data. The sample space is the set of all possible outcomes of an experiment and is denoted by S. When a coin is tossed once the sample space is {H, T}; when it is tossed twice the sample space is {HH, HT, TH, TT}, so it contains four outcomes.

We have:
Finite sample space: a sample space whose outcomes take integer values or can be counted.
Infinite sample space: the ages of a class can range from 17 to 30, i.e. $17 \le x \le 30$. One person may count 17, 18, 19, ... while another may use 17.01, 17.02, 17.03, ... and so on. This makes the sample space infinite.

Event
An event A is an outcome, or a set of outcomes, that is of interest to the experimenter. The probability that an event A occurs is written $P(A)$ and is read "probability of A". It is a measure of the likelihood that A occurs, i.e.

$P(A) = \frac{n(A)}{n(S)}$, the number of outcomes in A divided by the number of outcomes in S.

Example 1
An ordinary die is thrown. Find the probability that the number obtained is
a. a multiple of 3
b. less than 7
c. a factor of six

Solution
Sample space when a die is thrown = {1, 2, 3, 4, 5, 6}
a. $P(\text{multiple of 3}) = \frac{2}{6} = \frac{1}{3}$
b. $P(\text{less than 7}) = \frac{6}{6} = 1$
c. $P(\text{factor of six}) = \frac{4}{6} = \frac{2}{3}$

Complement of an Event
The complement of an event A is denoted by $A'$. It is the set of all outcomes in the sample space S that do not correspond to the event A.
$P(A) + P(A') = 1$

Probability Rule
$P(A \text{ or } B) = P(A \cup B) = P(A) + P(B) - P(A \cap B)$ ... (1)
$P(A) = P(A \cap B) + P(A \cap B')$
$P(B) = P(A \cap B) + P(A' \cap B)$

Example 2
Given $P(A')$, $P(B)$ and $P(A \cap B)$, find $P(A \cup B)$. (The numerical values are lost in the source.)

Solution
$P(A \cup B) = [1 - P(A')] + P(B) - P(A \cap B)$

Exhaustive Events
If two events A and B are such that between them they make up the whole of the possibility space, then A and B are said to be exhaustive events and
$P(A \cup B) = 1 \implies 1 = P(A) + P(B) - P(A \cap B)$

Example
$S = \{1, 2, \ldots, 10\}$, $A = \{2, 4, 6, \ldots, 10\}$, $B = \{1, 3, 5, \ldots, 9\}$
$A \cup B = \{1, 2, 3, 4, \ldots, 10\} = S$

Exclusive or mutually exclusive events
A and B are said to be exclusive or mutually exclusive events if there is no intersection between them, i.e. if they cannot occur at the same time. Mathematically,
$P(A \cup B) = P(A) + P(B)$, i.e. $P(A \cap B) = 0$

Example 3
It is known that $P(X) = \frac{1}{2}$ and $P(Y) = \frac{1}{4}$. Given that X and Y are mutually exclusive, find
a. $P(X \cup Y)$
b. $P(X \cap Y)$
(Parts c and d, which involve complements, are not legible in the source.)

Solution
a. $P(X \cup Y) = P(X) + P(Y) = \frac{1}{2} + \frac{1}{4} = \frac{2 + 1}{4} = \frac{3}{4}$
b. $P(X \cap Y) = 0$

De Morgan's Rule
$P((A \cup B)') = 1 - P(A \cup B) = P(A' \cap B')$
$P((A \cap B)') = 1 - P(A \cap B) = P(A' \cup B')$

Example 4
Given that $P(A) = \frac{1}{3}$, $P(B) = \frac{1}{2}$ and $P(A \cap B) = \frac{1}{12}$, find $P(A' \cap B')$.

Solution
$P(A \cup B) = P(A) + P(B) - P(A \cap B) = \frac{1}{3} + \frac{1}{2} - \frac{1}{12} = \frac{4 + 6 - 1}{12} = \frac{9}{12} = \frac{3}{4}$
By De Morgan's rule, $P(A' \cap B') = 1 - P(A \cup B) = 1 - \frac{3}{4} = \frac{1}{4}$.

Question
Events A and B are mutually exclusive and exhaustive events. $P(A) = 0.4$. Find
a. $P(B)$
b. $P(A \cap B)$

Solution
a. $P(A) + P(B) = 1$, so $P(B) = 1 - 0.4 = 0.6$
b. $P(A \cap B) = 0$

Example 5
A and B are two events such that $P(A) = \frac{2}{3}$, $P(B) = \frac{8}{15}$ and $P(A \cap B) = \frac{1}{5}$. Are A and B exhaustive events?

Solution
For exhaustive events $P(A \cup B) = 1$.
$P(A \cup B) = \frac{2}{3} + \frac{8}{15} - \frac{1}{5} = \frac{10 + 8 - 3}{15} = \frac{15}{15} = 1$
Hence A and B are exhaustive.

Independent Events
If either of the events A and B can occur without being affected by the other, then the two events are independent. For independent events,
$P(A \cap B) = P(A) \cdot P(B)$, and hence $P(A \cup B) = P(A) + P(B) - P(A) \cdot P(B)$

Example 6
Events A and B are independent with $P(A) = 0.3$ and $P(B) = 0.5$. Find
a. $P(A \cap B)$
b. $P(A \cup B)$
c. Are A and B mutually exclusive?

Solution
a. $P(A \cap B) = P(A) \cdot P(B) = 0.3 \times 0.5 = 0.15$
b. $P(A \cup B) = P(A) + P(B) - P(A) \cdot P(B) = 0.3 + 0.5 - 0.15 = 0.8 - 0.15 = 0.65$
c.
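No, since $P(A \cap B) = 0.15 \ne 0$.

Example 7
The probability that an event A occurs is $P(A) = 0.4$. B is an event independent of A and the probability of the union of A and B is $P(A \cup B) = 0.7$. Find $P(B)$.

Solution
$P(A \cup B) = P(A) + P(B) - P(A \cap B) = P(A) + P(B) - [P(A) \cdot P(B)]$
$0.7 = 0.4 + P(B) - [0.4 \times P(B)]$
$0.3 = P(B) - 0.4P(B) = P(B)[1 - 0.4]$
$0.3 = 0.6 \times P(B)$
$P(B) = \frac{0.3}{0.6} = \frac{3}{6} = \frac{1}{2} = 0.5$

CONDITIONAL PROBABILITY
If A and B are two events, not necessarily from the same experiment, then the conditional probability that A occurs given that B has already occurred is written $P(A \mid B)$, where

$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$ and, similarly, $P(B \mid A) = \frac{P(A \cap B)}{P(A)}$

$\implies P(A \mid B) \cdot P(B) = P(A \cap B)$ and $P(B \mid A) \cdot P(A) = P(A \cap B)$, so
$P(A \mid B) \cdot P(B) = P(B \mid A) \cdot P(A)$

Example 8
A and B are two events such that $P(A \mid B) = 0.4$, $P(B) = 0.25$ and $P(A) = 0.2$. Find
a. $P(B \mid A)$
b. $P(A \cap B)$
c. $P(A \cup B)$

Solution
a. $P(A \mid B) \cdot P(B) = P(B \mid A) \cdot P(A)$
$0.4 \times 0.25 = P(B \mid A) \cdot 0.2$
$0.1 = P(B \mid A) \cdot 0.2$
$P(B \mid A) = \frac{0.1}{0.2} = 0.5$
b. $P(A \cap B) = P(B \mid A) \cdot P(A) = 0.5 \times 0.2 = 0.1$
c. $P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.25 + 0.2 - 0.1 = 0.35$

Example 9
If $P(B \mid A) = \frac{2}{5}$, $P(A) = \frac{1}{4}$ and $P(B) = \frac{1}{3}$, find
a. $P(A \mid B)$
b. $P(A \cap B)$

Solution
a. $P(A \mid B) \cdot P(B) = P(B \mid A) \cdot P(A)$
$P(A \mid B) \times \frac{1}{3} = \frac{2}{5} \times \frac{1}{4} = \frac{2}{20} = \frac{1}{10}$
$P(A \mid B) = \frac{1}{10} \times \frac{3}{1} = \frac{3}{10}$
b. $P(A \cap B) = P(B \mid A) \cdot P(A) = \frac{2}{5} \times \frac{1}{4} = \frac{2}{20} = \frac{1}{10}$

Independent (Conditional) Events
If A and B are independent, then
$P(A \mid B) = P(A)$, $P(A \mid B') = P(A)$, $P(B \mid A) = P(B)$, $P(B \mid A') = P(B)$

Example 10
A and B are two independent events such that $P(A) = 0.2$ and $P(B) = 0.15$. Evaluate the following probabilities:
a. $P(A \mid B)$
b. $P(A \cap B)$
c. $P(A \cup B)$

Solution
For independent events
a. $P(A \mid B) = P(A) = 0.2$
b. $P(A \cap B) = 0.2 \times 0.15 = 0.03$
c. $P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.2 + 0.15 - 0.03 = 0.32$

Example 11
A and B are exhaustive events and it is known that $P(A \mid B) = \frac{1}{4}$ and $P(B) = \frac{2}{3}$. Find $P(A)$.

Solution
$P(A \mid B) \cdot P(B) = P(A \cap B) = \frac{1}{4} \times \frac{2}{3} = \frac{1}{6}$
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
$1 = P(A) + \frac{2}{3} - \frac{1}{6}$
$P(A) = 1 + \frac{1}{6} - \frac{2}{3} = \frac{1}{2} = 0.5$

Example 12
A box contains 10 balls, of which 6 are red and 4 are blue. If 2 balls are randomly selected from the box without replacement, what is the probability that both are red?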
Solution
Let A be the event that the first ball drawn is red, and B the event that the second is red. We are required to calculate $P(A \cap B)$. The probability that the first ball drawn is red is
$P(A) = \frac{6}{10} = \frac{3}{5}$
and, given that the first ball was red, 5 of the remaining 9 balls are red, so
$P(B \mid A) = \frac{5}{9}$
$P(A \cap B) = P(A) P(B \mid A) = \frac{3}{5} \times \frac{5}{9} = \frac{1}{3}$

Example 13
In a consignment of 40 manufactured items, 8 are known to be defective. Suppose three items are drawn at random without replacement. What is the probability that all three in the sample are defective?

Solution
Letting $A_1$, $A_2$ and $A_3$ be the events of getting a defective on the 1st, 2nd and 3rd draw respectively, the desired probability becomes
$P(A_1 \cap A_2 \cap A_3) = P(A_1) P(A_2 \mid A_1) P(A_3 \mid A_1 \cap A_2) = \frac{8}{40} \times \frac{7}{39} \times \frac{6}{38} = \frac{7}{1235}$

Use of Combinatorial Analysis
The application of the multiplication rule in solving probability problems may sometimes be tedious or confusing. An easier approach is to work from first principles using combinatorial analysis.

Example 14
Refer to Example 13.
(a) Solve the question using the combinatorial method.
(b) Calculate the probability that the sample contains just one defective.

Solution
The sample space for this problem is the set of all possible 3-tuples of items that could be selected from the 40 items, so the sample space consists of $\binom{40}{3}$ equally likely simple events.
(a) There are 8 defective items, so 3-tuples of defective items can be selected in $\binom{8}{3}$ ways.
$P(\text{all three defective}) = \frac{\binom{8}{3}}{\binom{40}{3}} = \frac{7}{1235}$
(b) $P(\text{exactly one defective}) = \frac{\binom{8}{1} \binom{32}{2}}{\binom{40}{3}} = \frac{496}{1235}$

Total Probability
If $A_1, A_2, \ldots, A_n$ form a partition of the sample space S, then for any event $B \subseteq S$ with $P(B) > 0$,
$P(B) = P(A_1 \cap B) + P(A_2 \cap B) + \cdots + P(A_n \cap B) = \sum_{i=1}^{n} P(A_i \cap B)$

Definition: If $A_1, A_2, \ldots, A_n$ form a partition of the sample space S and B is an event defined on the same sample space S such that $P(B) > 0$, then
$P(B) = \sum_{i=1}^{n} P(A_i) P(B \mid A_i) = P(A_1) P(B \mid A_1) + P(A_2) P(B \mid A_2) + \cdots + P(A_n) P(B \mid A_n)$

Bayes' Theorem
Suppose $A_1, A_2, \ldots, A_n$ form a partition of the sample space S. Suppose also that the probabilities $P(A_i) > 0$ $(i = 1, 2, \ldots, n)$ are known. Let B be any event in S such that $P(B) > 0$ and suppose $P(B \mid A_i)$ is also known. Then
$P(A_i \mid B) = \frac{P(A_i) P(B \mid A_i)}{\sum_{j=1}^{n} P(A_j) P(B \mid A_j)}$
or
$P(A_i \mid B) = \frac{P(A_i) P(B \mid A_i)}{P(A_1) P(B \mid A_1) + P(A_2) P(B \mid A_2) + \cdots + P(A_n) P(B \mid A_n)}$

Before we use this theorem or rule, the following conditions must be present:
1. We are dealing with an experiment which can result in one of n mutually exclusive events $A_1, A_2, \ldots, A_n$ such that the sample space is $S = A_1 \cup A_2 \cup \cdots \cup A_n$.
2. It is given that the event B has occurred, with $P(B) > 0$.
3. We want to find the probability that one of the events $A_1, A_2, \ldots, A_n$ occurred given that event B has occurred. That is, we want to find $P(A_i \mid B)$, $i = 1, 2, \ldots, n$.

Example 15
Suppose $P(A_1) = 0.20$, $P(A_2) = 0.40$, $P(A_3) = 0.40$, $P(B \mid A_1) = 0.25$, $P(B \mid A_2) = 0.05$ and $P(B \mid A_3) = 0.10$. Use Bayes' rule to find $P(A_1 \mid B)$.

Solution
$P(A_1 \mid B) = \frac{P(A_1) P(B \mid A_1)}{P(A_1) P(B \mid A_1) + P(A_2) P(B \mid A_2) + P(A_3) P(B \mid A_3)} = \frac{(0.20)(0.25)}{(0.20)(0.25) + (0.40)(0.05) + (0.40)(0.10)} = \frac{0.050}{0.050 + 0.020 + 0.040} = \frac{0.050}{0.11} = 0.455$

Example 16
It is given that only 60% of the students in Mr. Mensah's class passed the mathematics test at first sitting. Of those who passed, 80% prepared for the test and of those who failed, 20% prepared for the test. What is the probability that a person who prepared for the test passed it at first sitting?
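As a brief aside before the solution, the arithmetic of Bayes' rule used in Example 15 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `posterior` and its list-based arguments are assumptions made for this example and are not part of the notes.

```python
def posterior(priors, likelihoods):
    """Return P(A_i | B) for every i, given P(A_i) and P(B | A_i).

    `priors` and `likelihoods` are hypothetical argument names; the
    events A_1..A_n are assumed to form a partition of the sample space.
    """
    # Joint probabilities P(A_i) * P(B | A_i)
    joint = [p * l for p, l in zip(priors, likelihoods)]
    # Total probability: P(B) is the sum of the joint probabilities
    total = sum(joint)
    return [j / total for j in joint]


# Data of Example 15
post = posterior([0.20, 0.40, 0.40], [0.25, 0.05, 0.10])
print([round(p, 3) for p in post])  # first entry is P(A_1 | B), about 0.455
```

The denominator is computed once by the total probability rule and reused for every $A_i$, which is why the posterior probabilities always add up to 1.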
Solution
Let us define the events
$A_1$: passing the test at first sitting
$A_2$: failing the test at first sitting
Then the prior probabilities are $P(A_1) = 0.60$ and $P(A_2) = 0.40$.
If we let B denote the event that a person prepared for the test, then the conditional probabilities are $P(B \mid A_1) = 0.80$ and $P(B \mid A_2) = 0.20$.
We are required to find the posterior probability $P(A_1 \mid B)$ that a person who prepared for the test passed the test at first sitting. Using Bayes' rule for two mutually exclusive events, we have
$P(A_1 \mid B) = \frac{P(A_1) P(B \mid A_1)}{P(A_1) P(B \mid A_1) + P(A_2) P(B \mid A_2)} = \frac{(0.60)(0.80)}{(0.60)(0.80) + (0.40)(0.20)} = \frac{0.48}{0.48 + 0.08} = \frac{0.48}{0.56} = 0.86$

Example 17
A mechanical factory employs three machine operators to produce its brand of goods. Operator A works 50% of the time, Operator B works 30% of the time, and Operator C works 20% of the time. Each operator is prone to produce defective items. Operator A produces defective items 1% of the time, Operator B produces defective items 5% of the time, and Operator C produces defective items 7% of the time. If a defective item is produced, what is the probability that it was produced by
(a) Operator A
(b) Operator B
(c) Operator C

Solution
Let us define the events $A_1$: Operator A, $A_2$: Operator B and $A_3$: Operator C.
The prior probabilities are $P(A_1) = 0.50$, $P(A_2) = 0.30$, $P(A_3) = 0.20$.
Let B be the event that a defective item is produced. We also know that $P(B \mid A_1) = 0.01$, $P(B \mid A_2) = 0.05$ and $P(B \mid A_3) = 0.07$.

(a) $P(A_1 \mid B) = \frac{P(A_1) P(B \mid A_1)}{P(A_1) P(B \mid A_1) + P(A_2) P(B \mid A_2) + P(A_3) P(B \mid A_3)} = \frac{(0.50)(0.01)}{(0.50)(0.01) + (0.30)(0.05) + (0.20)(0.07)} = 0.147$

(b) $P(A_2 \mid B) = \frac{P(A_2) P(B \mid A_2)}{P(A_1) P(B \mid A_1) + P(A_2) P(B \mid A_2) + P(A_3) P(B \mid A_3)} = \frac{(0.30)(0.05)}{(0.50)(0.01) + (0.30)(0.05) + (0.20)(0.07)} = 0.441$

(c) $P(A_3 \mid B) = \frac{P(A_3) P(B \mid A_3)}{P(A_1) P(B \mid A_1) + P(A_2) P(B \mid A_2) + P(A_3) P(B \mid A_3)} = \frac{(0.20)(0.07)}{(0.50)(0.01) + (0.30)(0.05) + (0.20)(0.07)} = 0.412$

Question 1
Two events A and B are such that $P(A)$, $P(B)$ and $P(A \mid B)$ are given (the numerical values are lost in the source). Calculate the probability that
a. both events occur
b. only one of the two events occurs
c. neither event occurs

Question 2
Three balls are drawn at random from a bag containing 15 green and 12 yellow balls. What is the probability that
(a) all three are green?
(b) none is green?
(c) one is green and the other two are yellow?

Question 3
$P(A_1) = 0.20$, $P(A_2) = 0.35$, $P(A_3) = 0.45$, $P(B \mid A_1) = 0.01$, $P(B \mid A_2) = 0.05$ and $P(B \mid A_3) = 0.01$. Use Bayes' rule to find
(a) $P(A_1 \mid B)$
(b) $P(A_2 \mid B)$
(c) $P(A_3 \mid B)$

Question 4
A mechanical factory employs three machine operators to produce its brand of goods. Operator A works 50% of the time, Operator B works 40% of the time, and Operator C works 10% of the time. Each operator is prone to produce defective items. Operator A produces defective items 1% of the time, Operator B produces defective items 3% of the time, and Operator C produces defective items 5% of the time. If a defective item is produced, what is the probability that it was produced by
(a) Operator A
(b) Operator B
(c) Operator C

CHAPTER 3
ESTIMATES AND ESTIMATORS

Estimator: It is the rule or procedure (usually expressed as a formula) that is used to derive the estimate. For example, $\bar{x} = \frac{\sum x_i}{n}$ is the estimator of the population mean. An estimator is therefore the process by which the estimate is obtained.
Estimate: An estimate is a numerical result of the estimator. For example, if the value of the estimator $\bar{x}$ is, say, 15, then 15 is the estimate of the population mean.

Properties of estimators
The four properties of estimators are unbiasedness, efficiency, consistency and sufficiency.

Unbiased Estimator: An estimator is unbiased if the mean of its sampling distribution equals the corresponding parameter. For example, if theta ($\theta$) is a parameter we are trying to estimate, then "theta hat" ($\hat{\theta}$) is an unbiased estimator if its mean or expected value satisfies $E(\hat{\theta}) = \theta$. Furthermore, since $E(\bar{x}) = \mu_{\bar{x}} = \mu$, $\bar{x}$ is an unbiased estimator of $\mu$. The measure of bias is the difference between the mean of $\hat{\theta}$ and $\theta$. Hence, if $E(\hat{\theta}) - \theta \ne 0$, then $\hat{\theta}$ is a biased estimator of $\theta$.

Efficient Estimator: Among unbiased estimators, the most efficient estimator is the one with the smallest variance. For example, let $\hat{\theta}_1$ and $\hat{\theta}_2$ be two unbiased estimators of $\theta$. $\hat{\theta}_1$ is the more efficient estimator if, in repeated sampling with a given sample size, its variance is less than that of $\hat{\theta}_2$.

Consistent Estimator: An estimator is consistent when, as n increases, the value of the statistic approaches the parameter. It is enough for an estimator to be unbiased and for its variance to approach zero as n increases. Thus, since $\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$, as n gets larger $\sigma_{\bar{x}}^2$ approaches zero and therefore $\bar{x}$ is said to be a consistent estimator of $\mu$. This also suggests that if a statistic is not a consistent estimator, taking a larger sample to improve the estimate will be fruitless.

Sufficient Estimator: An estimator is said to be sufficient if no other estimator could provide more information about the parameter. Equivalently, an estimator is sufficient if it uses all the relevant information about the parameter contained in the sample. If an estimator is sufficient, nothing can be gained by using another estimator.
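The unbiasedness and consistency of the sample mean can be illustrated with a small simulation. The sketch below is only illustrative: the function name `sample_mean_spread`, the population values ($\mu = 150$, $\sigma = 10$), the number of repetitions and the random seed are all assumptions chosen for the demonstration, not values from the notes.

```python
import random
import statistics


def sample_mean_spread(n, mu=150.0, sigma=10.0, repeats=1000, seed=1):
    """Simulate `repeats` samples of size n from N(mu, sigma^2).

    Returns the average of the sample means and the variance of the
    sample means. All default values are hypothetical.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(repeats)
    ]
    return statistics.fmean(means), statistics.pvariance(means)


for n in (5, 50, 200):
    centre, spread = sample_mean_spread(n)
    # Theory: centre should be near mu (unbiasedness) and
    # spread should be near sigma**2 / n (shrinking with n).
    print(n, round(centre, 2), round(spread, 3), 10.0 ** 2 / n)
```

In a run of this kind the average of the sample means stays close to $\mu$ for every n, while the variance of the sample means falls roughly in line with $\sigma^2 / n$, which is the behaviour the definitions of unbiasedness and consistency describe.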
ESTIMATION OF PARAMETERS
For the purpose of this course, we shall look at point estimation of the population mean, proportion and standard deviation, and at interval estimation.

Point Estimation
If a single number is used to approximate the true value of a parameter, the number is called a point estimate of the parameter. This is usually done because there may be no prior knowledge of the parameter before the study of the population.

Point Estimate of the Population Mean
An engineering student was interested in the diameters of 2-inch nails. The mean of the population was $\mu$. Since it was not practical to work with all the 2-inch nails in the world, he took a sample with mean $\bar{x}$ to estimate the population parameter, relying on the property of unbiasedness.

For example, the manager of an automobile company wishes to determine the number of vehicles to order each week. If he orders too many, there will be a problem of space; if too few, the supply may run out. To get some idea of the number required, a sample of twelve randomly selected weeks is obtained, and the number of vehicles sold per week is recorded as follows: 10, 12, 7, 7, 9, 9, 11, 15, 13, 6, 7, 8. The true value of the population mean cannot be obtained because the population is infinite, so the sample mean is the best estimate of the population mean. Thus
$\bar{x} = \frac{\sum x_i}{n} = \frac{114}{12} = 9.5$ vehicles per week.

Point Estimation of the Population Proportion
The population parameter P is the ratio of the number X of members of the population with the attribute to the population size N. This is given by $P = \frac{X}{N}$. Since P is unknown, an estimate (the sample proportion) is used. This is given by $\hat{p} = \frac{x}{n}$, i.e. the ratio of the number x in the sample with the attribute to the sample size n. Note that the sample proportion is also an unbiased estimator of the population proportion. For example, suppose that in Ghana 1000 filling stations were selected and, after their fuels were put through testing, it was found that 500 of them sell impure fuel to drivers.
Then the sample proportion $\hat{p}$ is given by $\hat{p} = \frac{500}{1000} = 0.50$ or 50%.
Interpretation: It is estimated that 50% of the fuel dealers in Ghana sell impure fuel to their customers. If the population of filling stations in Ghana is 5000, we may conclude that about 2500 filling station operators in Ghana sell impure fuel to customers.

Point Estimation of the Population Standard Deviation
The sample estimate for the population variance is
$s^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$,
where $x_i$ is a sample observation and $i = 1, 2, 3, \ldots, n$. For the sample variance to be an unbiased estimator of the population variance, the denominator n must be replaced with $n - 1$. Hence the unbiased estimator of the population variance is given by
$s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2$.
Also
$s = \sqrt{\frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$.
Hence the positive square root of this variance is used as the estimator of the population standard deviation. Note that for a large sample size there is little or no difference between $\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$ and $\frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2$.

Interval estimation
Point estimates are the most commonly used. However, a point estimate is not expected to coincide exactly with the parameter it is intended to estimate, even when it is unbiased, and it cannot tell us the size of the error of estimation. Interval estimates are therefore more suitable. An interval estimate (confidence interval) is an interval of numbers used to estimate or approximate the true value of a parameter $\theta$. To estimate the true value of $\theta$, the end points $L_1$ and $L_2$ of the interval are determined with confidence coefficient $1 - \alpha$, such that the probability that the interval contains the true value of the population parameter is $P(L_1 \le \theta \le L_2) = 1 - \alpha$.
For example, a confidence coefficient of 0.95 means that if 100 different samples are drawn and, for each sample, an interval estimate for the unknown parameter $\theta$ is calculated, then about 95 of these confidence intervals would be expected to include $\theta$. Hence, we say that we are 95% sure that the parameter lies between $L_1$ and $L_2$.

Confidence Interval on the Mean (Variance Known)
Consider a random sample of size n taken from a normal population with mean $\mu$ and variance $\sigma^2$. Then a $100(1 - \alpha)\%$ confidence interval on $\mu$ is given by
$\bar{x} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
or simply $\bar{x} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$, where $\bar{x}$ is the sample mean and $\frac{\sigma}{\sqrt{n}}$ is the standard error of the mean.

Example 1
If n = 25, $\bar{x} = 9$ and $\sigma = 3$, find the 95% confidence interval on the mean $\mu$.

Solution
$100(1 - \alpha)\% = 95\% \implies \alpha = 0.05$ or 5%.
$Z_{\alpha/2} = Z_{0.05/2} = Z_{0.025} = 1.96$ from the table.
We know that $\bar{x} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$. Therefore
$9 - 1.96 \left(\frac{3}{\sqrt{25}}\right) \le \mu \le 9 + 1.96 \left(\frac{3}{\sqrt{25}}\right)$
$7.82 \le \mu \le 10.18$
Interpretation: We are 95% sure that the true mean lies between 7.82 and 10.18.

Confidence Interval on the Mean (Variance Unknown)
When the value of the population standard deviation is unknown, we replace it with the corresponding sample standard deviation s. The statistic
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
then follows the t-distribution with $n - 1$ degrees of freedom. The confidence interval on the mean therefore becomes
$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$.
To read the t value from the table we use $t_{\alpha/2}(n - 1)$.

Example 2
With reference to Example 1, suppose the value of $\sigma$ was unknown but its estimate was $s = 2.8$. Find the 95% confidence interval on the mean $\mu$.

Solution
Note that $\sigma$ is not known. n = 25, $\bar{x} = 9$, $s = 2.8$ and $t_{0.025}(24) = 2.064$.
$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$
$9 - 2.064 \left(\frac{2.8}{\sqrt{25}}\right) \le \mu \le 9 + 2.064 \left(\frac{2.8}{\sqrt{25}}\right)$
$7.84 \le \mu \le 10.16$
Therefore the confidence interval for the mean is $7.84 \le \mu \le 10.16$.
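The two interval formulas above can be checked numerically. The following Python sketch is illustrative only: the function names `z_interval` and `t_interval` are assumptions made for this example, and the t critical value is passed in by hand because it is read from a t-table rather than computed.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    # z_{alpha/2} from the standard normal distribution
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean when sigma is estimated by s.

    t_crit is t_{alpha/2}(n - 1), supplied by the user from a t-table.
    """
    half_width = t_crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width


# Example 1: n = 25, xbar = 9, sigma = 3
print(z_interval(9, 3, 25))

# Example 2: n = 25, xbar = 9, s = 2.8, t_{0.025}(24) = 2.064
print(t_interval(9, 2.8, 25, 2.064))
```

With the data of Example 1 the first call gives limits of about 7.82 and 10.18. With the data of Example 2 the half-width is $2.064 \times 2.8 / 5 \approx 1.156$, so the second call gives limits of about 7.84 and 10.16.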
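Confidence Interval on the Difference between Two Means
Given two independent random samples with means $\bar{x}_1$ and $\bar{x}_2$ and respective sizes $n_1$ and $n_2$, drawn from normal populations with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$, we can determine the confidence interval on $\mu_1 - \mu_2$ by considering three separate conditions (cases).

Case 1
Where $\sigma_1^2$ and $\sigma_2^2$ are known, the confidence interval on the difference between the two population means is given by
$(\bar{x}_1 - \bar{x}_2) - Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \le (\bar{x}_1 - \bar{x}_2) + Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$

Case 2
Where $\sigma_1^2$ and $\sigma_2^2$ are unknown but $n_1$ and $n_2$ are large, we use the sample variances in place of the population variances. The confidence interval on $\mu_1 - \mu_2$ is therefore given by
$(\bar{x}_1 - \bar{x}_2) - Z_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \le \mu_1 - \mu_2 \le (\bar{x}_1 - \bar{x}_2) + Z_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

Example 3
Two samples of sizes 100 and 64 observations were drawn from independent normal populations with variances 16 and 25 and sample means 10.8 and 9.6 respectively. Find a 95% confidence interval on the difference between the two population means ($\mu_1 - \mu_2$).

Solution
$n_1 = 100$, $n_2 = 64$, $\bar{x}_1 = 10.8$, $\sigma_1^2 = 16$, $\bar{x}_2 = 9.6$, $\sigma_2^2 = 25$
The required equation is
$(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = (10.8 - 9.6) \pm 1.96 \sqrt{\frac{16}{100} + \frac{25}{64}} = 1.2 \pm 1.45$
which gives (-0.25, 2.65). Hence the confidence interval is $-0.25 \le \mu_1 - \mu_2 \le 2.65$.

Case 3
Where $\sigma_1^2$ and $\sigma_2^2$ are unknown and $n_1$ and $n_2$ are small, the appropriate distribution to use is the t-distribution. Here, the sample variance common to both samples (the pooled variance) is preferred. If we assume that $\sigma_1^2 = \sigma_2^2$, the confidence interval on the difference between the population means is given by
$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2} S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le \mu_1 - \mu_2 \le (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2} S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$ ... (1)
where
$S_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$
and $t_{\alpha/2}$ is taken with $n_1 + n_2 - 2$ degrees of freedom.

Secondly, if no assumption of equality of the population variances is made, then the confidence interval on the difference is given by
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ ... (2)
Note that the statistic is then only approximately t-distributed, with the degrees of freedom given by a formula that is not legible in the source.

It has been proved that if the sample sizes are equal, then the difference between equations (1) and (2) is very small (negligible) and hence formula (1) is suitable.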
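The Case 1 interval and the pooled (equal-variance) interval of Case 3 can be sketched in Python as follows. The function names are hypothetical and chosen only for this illustration; the t critical value is again supplied from a table. The unequal-variance interval is left out because its degrees-of-freedom formula is not given above.

```python
from math import sqrt
from statistics import NormalDist


def diff_means_z_interval(x1, x2, var1, var2, n1, n2, confidence=0.95):
    """Case 1 (or Case 2 with sample variances): z-based interval."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(var1 / n1 + var2 / n2)  # standard error of x1 - x2
    d = x1 - x2
    return d - z * se, d + z * se


def pooled_sd(s1, s2, n1, n2):
    """Pooled standard deviation S_p used in equation (1)."""
    return sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))


def diff_means_pooled_interval(x1, x2, s1, s2, n1, n2, t_crit):
    """Case 3, equal variances: t_crit is t_{alpha/2}(n1 + n2 - 2)."""
    se = pooled_sd(s1, s2, n1, n2) * sqrt(1 / n1 + 1 / n2)
    d = x1 - x2
    return d - t_crit * se, d + t_crit * se


# Example 3: variances 16 and 25, n1 = 100, n2 = 64
print(diff_means_z_interval(10.8, 9.6, 16, 25, 100, 64))
```

For Example 3 this gives limits of about -0.25 and 2.65. The pooled function takes the two sample standard deviations and a table value such as $t_{0.05}(24) = 1.711$ for a 90% interval with $n_1 + n_2 - 2 = 24$ degrees of freedom.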
TRY
Two independent samples of sizes $n_1 = 16$ and $n_2 = 10$ from normal populations with unknown standard deviations have sample means $\bar{x}_1 = 23.4$ and $\bar{x}_2 = 18.2$ and corresponding sample standard deviations $S_1 = 3.5$ and $S_2 = 4.8$. Find a 90% confidence interval on ($\mu_1 - \mu_2$)
a. assuming that the population standard deviations are equal
b. assuming that the population standard deviations are not equal.
ANS: a) $2.41 \le \mu_1 - \mu_2 \le 7.99$ b) $1.67 \le \mu_1 - \mu_2 \le 8.73$

Choosing an Appropriate Sample Size
This is essential in any statistical study or analysis. It deals with the sample size to take so that the results will be close to the population parameter, making the sample representative of the population under study. This is important because a sample that is too large wastes money on data collection, while a sample that is too small leads to uncertain conclusions and larger standard errors.

The Correct Sample Size Depends on 3 Factors:
1. The level of confidence desired. Very often 95% and 99% confidence levels are selected, leading to Z-values of ±1.96 and ±2.58 respectively. The higher the level of confidence, the larger the size of the sample.
2. The margin of error the researcher will tolerate. The maximum allowable error E is the amount that is added to and subtracted from the sample mean to determine the end points or limits of the confidence interval. It is the amount of error the researcher is willing to tolerate. A small allowable error means one should take a large sample, and a large allowable error permits a smaller sample.
3. The variability in the population being studied. This deals with the population standard deviation. If the population is widely dispersed, a large sample is needed. If the population is concentrated or homogeneous (i.e. not widely dispersed), a smaller sample is required. However, the population standard deviation may itself have to be estimated, and the following are some ways of doing so.
a) Use the comparable study approach when an estimate of dispersion is available from another study. Rely on that estimate to get a rough idea of the sample size.
b) If no such study is available, a range-based approximation can be used. If the distribution is normal, almost all observations lie within ±3 standard deviations of the mean, so the difference between the largest and smallest values spans about $6\sigma$ in all, and $\sigma$ can be approximated by the range divided by 6.
c) A pilot survey can help estimate the standard deviation. This also helps to test the validity of the questionnaire. Here a very small sample is taken and the standard deviation computed.

From the 3 factors:
$E = Z \frac{s}{\sqrt{n}}$
$\therefore$ the sample size for estimating a mean is
$n = \left(\frac{Z s}{E}\right)^2$
where
n is the size of the sample,
Z is the standard normal value corresponding to the desired level of confidence,
s is an estimate of the population standard deviation,
E is the maximum allowable error.

To determine the sample size for a proportion, the following are required:
1. The desired level of confidence, usually 95% or 99%.
2. The margin of error in the population proportion.
3. An estimate of the population proportion.

Determination of Sample Size for Estimation
We have observed from the previous discussions on estimation that accuracy or precision in estimation depends to a large extent on the sample size. Hence, a large sample size ensures precise estimation of the confidence interval on the mean. To determine the least sample size for a $100(1 - \alpha)\%$ confidence interval, we use the formula
$n \ge \left(\frac{Z_{\alpha/2} \, \sigma}{d}\right)^2$
where $d = Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$ is the distance between the centre of the confidence interval and the upper confidence bound. Note also that n is always a whole number.

Example 3.4
To estimate the mean height of males in a certain community to within 2 cm with 99% confidence, when the standard deviation is 6 cm, the minimum sample size is found as follows.
$Z_{0.005} = 2.58$
$\implies n \ge \left(\frac{2.58 \times 6}{2}\right)^2$
$n \ge 59.91 \implies n = 60$
Thus, the sample size required to estimate the mean height of males to within 2 cm with 99% confidence is 60.

HYPOTHESIS TESTING
A hypothesis is a statement which is yet to be proved true or otherwise. In hypothesis testing, an idea concerning a parameter is available before the study, and the purpose of the study is to collect data to confirm or refute the stated idea. There are two types of hypothesis:
a. the hypothesis available before the research, and
b. its negation.
The hypothesis stated before the research is conducted is called the null hypothesis and is usually denoted by $H_0$, whereas its negation is called the alternative hypothesis and is denoted by $H_1$. In fact, the purpose of hypothesis testing is to try to reject or refute the null hypothesis $H_0$. Hence $H_0$ can either be rejected or we fail to reject $H_0$ (i.e. accept it). There are four possible decisions involved, which are summarised in the table:

                       H0 is true          H0 is false
Reject H0              Type I error        Correct decision
Fail to reject H0      Correct decision    Type II error

The following conclusions are arrived at:
- Reject $H_0$ when it is true (wrong decision): type I error
- Reject $H_0$ when it is false (correct decision)
- Fail to reject $H_0$ when it is true (correct decision)
- Fail to reject $H_0$ when it is false (wrong decision): type II error

Note that our decision to reject the null hypothesis or otherwise will be based on the test statistic, the value calculated from the sample data. There are two ways of choosing between $H_0$ and $H_1$. One way is to find the rejection region (critical region) of the test. Thus, if the calculated value of the test statistic falls beyond the "table value", i.e. in the rejection region, then $H_0$ is rejected; otherwise it is not. The second way is to calculate the P-value of the test.
The P-value of the test statistic at least as extreme as that observed under the null hypothesis. The H0 is rejected for “small” P-value. That is P<0.05. If V cannot be rejected, the conclusion is the there is no enough evidence to reject H0 or we conclude that we fail to reject H0 due to insufficient evidence. The use of a particular test of hypothesis depends to a target extend on the nature of data (either quantitative or qualitative). Test for equality in Means: 1. If we wish to test the H0 that µ = µ , three alternatives can be performed as follows: i) H0: µ = µ , ii) H0: µ = µ , iii)H0: µ = µ , H1: µ ≠ µ , H1: µ < µ , H1: µ > µ , Note that i. ii. Is called a two sided or two-tailed test (non-directional) and Are called one-sided or one tailed test (directional) Suppose that the population we are sampling from is normal and use is the normal deviate (Z). Thus Z = is known, the test statistic to ̅ √ Using α level, the critical regions for testing the hypothesis against the alternative hypothesis are summarized in the table below; H1 µ<µ µ>µ µ≠µ Reject H0 if <> <- or Z> To use the P-value instead of the α-value we find it using the formula below. 76 ≥ P=P i.e. the p-value is the probability of the z-value. We reject H0 if the P-value √ <0.05 vice versa. Suppose that the population we are sampling from is normally distributed and is unknown and the sample size is small, then for H0:µ=µ0 against the alternatives, the appropriate test statistic to use is the t-test, given by = √ , With (n-1) degree of freedom Example 5 Suppose we wish to test H0:µ=10 against one sided alternative (H1:µ>10) at=0.05 given that n = 64, = 10.3 and S = 4, then we proceed as follows Solution Sine n is large (i.e. n> 30) we use the Z Hence Z= . =0.60 √ Now Z0.05 = 1.65 Using method 1, we compare Z and Za values; since Z = 0.60 <Z0.05 = 1.65, we fail to reject the H0 and conclude that µ = 10. Using method 2, we can find the P-value as follows P=P ≥ =P ≥ √ . 
√ P-value = P(Z≥0.06) = 0.2743, Since p-value = 0.2743 > 0.05, we have no evidence to reject H0. Hence µ = 10. Comparing Two Independent Means (Independent Data): Given two independent random samples from two populations with ̅ and ̅ and sample sizes n1 and n2respectfully such that their respective population parameters are µ1 and µ2, and , we can test H0:µ1=µ2against H1:µ1≠µ2 or H1:µ1<µ2 or H1:µ1>µ2under some assumptions about the population. 77 Assumption 1 That and are known. ThenZ = ̅ ̅ Assumption 2 That and ̅ Z= are not known and sample sizes are large. Then, ̅ Assumption 3 That and lessons are not known but n1 and n2 are small will provide again two scenarios as seen in previous Scenario A When the population variance t= ̅ Where and are assumed equal, ̅ = ( ) ( ) with + − 2 degrees of freedom Scenario B That and are assumed not equal t = ̅ ̅ Whered.f = Example 3.6 78 Test H0: µ1 = µ2against H0: µ1≠ µ2at 5% significance level when n1 = 100, n2 = 64, ̅ = 10.8, ̅ = 9.6, = 16 and = 25 Solution: We know from our previous discussion thatZ = = ∝ . . = 1.62 = 1.96, . Hence we cannot reject the null hypothesis, since Z=1.62 does not fall in the rejection region s shown on the diagram below. Rejection Region Acceptance Critical/Rejection Region Region -1.96 0 1.96 Example 3.7 Test H0: µ1 = µ2 against H1: µ1>µ2 at 10% significance level when n1 = 16, n2 = 10, ̅ = 24.4, ̅ = 18.2, = 3.5 and = 4.8 Solution: If we assume that the population variances are equal, then, we find t and the pooled variance (Sp) as follows: ( = . t= . . = )( . ) . , ( )( . ) = 4.04 = 3.19 = 1.318 Since the calculated t-value is greater than the t-critical, we reject the null hypothesis and conclude that µ1> µ2 79 Comparing two Means (Paired Data) This concerns observations that occur in pairs. E.g. ( ( , ), , ), ( , ),………………………., Thus, observations “before” and “after” an experiment on n individuals. Here, the problem reduces to a single test. 
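The reduction of a paired comparison to a single one-sample t test on the differences can be sketched in Python as follows. This is only a minimal illustration using the standard library: the function name `paired_t_statistic` is hypothetical, and the data are the before/after weights used in Example 3.8. Because the sketch keeps the unrounded standard deviation of the differences (about 2.506) rather than a rounded value of 2.5, it gives t ≈ −1.893 rather than −1.897.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (mean difference, sd of differences, t statistic) for paired data.

    Differences are taken as after - before, so a negative mean difference
    indicates a reduction.
    """
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    d = [y - x for x, y in zip(before, after)]
    n = len(d)
    d_bar = statistics.mean(d)
    s_d = statistics.stdev(d)          # sample standard deviation (divisor n - 1)
    t = d_bar / (s_d / math.sqrt(n))   # one-sample t on the differences, n - 1 d.f.
    return d_bar, s_d, t


# Weights of ten students before and after the diet (Example 3.8)
before = [69, 50, 61, 72, 78, 66, 75, 89, 86, 54]
after = [66, 49, 63, 70, 71, 65, 75, 88, 87, 51]

d_bar, s_d, t = paired_t_statistic(before, after)
print(round(d_bar, 2), round(s_d, 3), round(t, 3))
```

The resulting t is compared with the critical value for n − 1 = 9 degrees of freedom taken from the t table, exactly as in a one-sample test.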
Example 3.8
The data below are the weights of ten students before and after they were fed a weight-reducing diet.

Student        1   2   3   4   5   6   7   8   9   10
Before (xi)   69  50  61  72  78  66  75  89  86  54
After (yi)    66  49  63  70  71  65  75  88  87  51

Solution:
d = yi − xi: −3, −1, 2, −2, −7, −1, 0, −1, 1, −3
$\bar{d} = -1.5$, $S_d = 2.5$
H0: µd = 0 against H1: µd < 0

$t = \frac{\bar{d}}{S_d/\sqrt{n}} = \frac{-1.5}{2.5/\sqrt{10}} = -1.897$, and $t_{\alpha} = t_{0.05,\,9} = 1.833$.

Since the calculated t-value (−1.897) is less than −1.833, it falls in the rejection region. We reject the null hypothesis and conclude that there is a significant reduction in the weight of the ten students.

Question 1
The mean length of a small counterbalance bar is 43 mm. There is concern that the adjustments of the machine producing the bars have changed. Twelve bars were selected at random and their lengths recorded. The lengths (in millimetres) are: 42, 39, 42, 45, 43, 40, 39, 41, 40, 42, 43 and 42. At the 0.02 level of significance,
(a) has there been a statistically significant change in the mean length of the bars?
(b) calculate the confidence interval for the mean.
(c) comment on your results in (a) and (b) above.

CHAPTER 4
PROCESS CONTROL

Process control is concerned with controlling the quality of goods while they are being manufactured in the production process. Its object is to determine whether the production process is running as desired, turning out product units of the requisite standard. This is achieved through the use of several control charts. A process is said to be in control when its future results can be expected to be the same as those seen in the past. Technically, the process is said to be under control if the means of sample lots, $\bar{X}$, are within the control limits around the grand mean $\bar{\bar{X}}$. The process is said to have gone out of control if there has been a change in the process mean from the population mean to some other value.
A sample average falling outside the control limits suggests strongly that the process is out of control, that is, that an assignable cause rather than a chance cause has created the difference.

Control Charts

Control charts serve the purpose of alerting those responsible to the possibility that a process is not working as expected. In doing so, control charts make use of measurements of a particular dimension, performance characteristic, or other continuously scaled variable, such as the diameter of a bolt. Control charts may also be used on attributes, i.e. the fraction defective within a sample or simply a count of the number of defectives in a sample.

All control charts are prepared, more or less, on the basis of the same statistical technique. They are graphic devices for detecting unnatural patterns of variation in data resulting from repetitive processes. Usually, standards are specified to which the quality of the product must conform. These standards also specify limits within which the quality of a product must lie. Thus there are two control limits, viz. the upper control limit (UCL) and the lower control limit (LCL).

At regular intervals, random samples are taken and the relevant data plotted on the graph. If the sample points are within the control limits (though they may not all be on the central or standard line), no corrective action is called for and the process is said to be under control. But if sample points fall outside the control limits, the process is said to have gone out of control, and in such a situation the officer concerned must inspect, examine and set right the process.

A control chart is a device for recording the said characteristic on a continuous chart in which the horizontal scale is either time, if samples are taken at regular intervals, or simply batch serial numbers.
It consists of a horizontal central line, which represents the mean value of the variable, together with a UCL and an LCL, as shown in the following typical control chart:

[Figure: typical control chart. Vertical axis: quality scale (variable or attribute); horizontal axis: sample number or batch serial number (1 to 13); horizontal lines mark the UCL, the central line (mean) and the LCL.]

Once the chart shown above is prepared, it is possible to plot on it the values of succeeding samples, whether these are sample means, sample ranges, fractions defective in the sample or numbers of defectives. If the points fall within the limits, then everything is in order so far as the given process is concerned, but if a point falls outside the limits, there is reason to think that something is wrong with the process under consideration.

Control charts are generally developed as stated above. A large number of samples are taken randomly and the mean value is worked out. The limits are then set in accordance with the formulae described below for the different control charts.

Control Charts for variables

There are two control charts often used for variables: the $\bar{X}$-chart and the R-chart. Control charts for variables make use of actual measurements of the items in samples of size n, treating a single variable.

1. $\bar{X}$-Chart (Control Chart for mean)
This is constructed as follows. For each sample, the mean ($\bar{X}$) is determined as

$\bar{X} = \frac{\sum X_i}{n}, \quad i = 1, 2, 3, \ldots, n$

and the range (R) is worked out as

$R = X_{\max} - X_{\min}$

To establish the chart, k samples of size n are first taken, and the mean of the sample means, which serves as the central line, is given by

$\bar{\bar{X}} = \frac{\sum \bar{X}_i}{k}, \quad i = 1, 2, 3, \ldots, k$

It is also necessary to compute the mean range $\bar{R}$ of the k samples:

$\bar{R} = \frac{\sum R_i}{k}, \quad i = 1, 2, 3, \ldots, k$

The control limits of the $\bar{X}$-chart then are

$UCL = \bar{\bar{X}} + A_2\bar{R}$ and $LCL = \bar{\bar{X}} - A_2\bar{R}$

where the values of $A_2$ are obtained from tables.

In case the population mean (µ) and standard deviation (σ) are given, the $\bar{X}$-chart is constructed as follows. The central line of the $\bar{X}$-chart lies at µ and the control limits are worked out as

$UCL = \mu + \frac{3\sigma}{\sqrt{n}}$ and $LCL = \mu - \frac{3\sigma}{\sqrt{n}}$

Alternatively, the limits can be stated as $\mu \pm A\sigma$, where $A = 3/\sqrt{n}$. The value of A, which depends upon the sample size n, can be read from the tables.

2. R-Chart (Range Chart)
This is designed to ensure that the variability within samples does not exceed specified limits. It is constructed as follows: central line = $\bar{R}$, and the control limits are

$UCL = D_4\bar{R}$ and $LCL = D_3\bar{R}$

where the coefficients $D_3$ and $D_4$ depend upon the sample size n. They are read from the table. If $D_3\bar{R}$ comes out as a negative value, it is taken as zero.

In case the standard deviation (σ) is specified, the R-chart is constructed as follows: central line = $d_2\sigma$, and the control limits are

$UCL = D_2\sigma$ and $LCL = D_1\sigma$

where σ is the value of the standard deviation, and $d_2$, $D_1$ and $D_2$ are values found from tables.

Control Charts for attributes

Control charts for attributes are prepared when the quality is expressed as either good or defective. In the case of attributes the following control charts are generally used:
- fraction defective chart (or the 'p'-chart)
- number defective chart (or the 'np'-chart)
- number of defects per unit chart (or the 'c'-chart)

3. p-chart or Fraction defective chart
This is based on the binomial distribution and is constructed as follows. Samples of size n are taken at intervals, and for each of them the fraction defective is computed as

$p = c/n$

where c is the number of defective items. The mean fraction defective ($\bar{p}$) is used in the p-chart, in which the central line = $\bar{p}$ and the control limits are

$UCL = \bar{p} + 3\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}$ and $LCL = \bar{p} - 3\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}$

where $\bar{p}$ is the mean value of p and must be obtained from a large number k of initial samples.
For each of the k samples the value $p_i$ is determined, which leads to

$\bar{p} = \frac{\sum p_i}{k}, \quad i = 1, 2, \ldots, k$

so that the three-sigma limits are $\bar{p} \pm 3\sqrt{\bar{p}(1 - \bar{p})/n}$.

Before using the p-chart, it is important to plot on it all the k points that have gone into its preparation and to eliminate all those points that fall outside the limits, finally recomputing $\bar{p}$ using only the samples whose value of p has fallen within the limits. It may also be noted that, in order to be able to use the statistical theory underlying such charts, it is essential that n > 60.

4. 'np'-chart or number defective chart
It is possible to use the number of defectives in control charts rather than the fraction defective, and accordingly one can prepare a number defective chart, also known as the 'np'-chart. Such a chart shows the actual number of defectives found in each sample. If the sample size is constant, then plotting the actual number of defectives may be more convenient than plotting the fraction defective. There is not much difference between the 'np'-chart and the 'p'-chart.

5. 'c'-chart or number of defects per unit chart
This chart is based on the Poisson distribution. The 'c'-chart is used for the control of the number of defects observed per unit and is useful in many situations in industry. If a constant number of units is inspected per sample and the number of defects found in each sample is ascertained, then the number of defects may be assumed to follow a Poisson distribution. When constructing the chart, k samples each of size n are taken, the number of defects c for each is determined, and the mean number of defects is obtained as follows:

$\bar{c} = \frac{\sum c_i}{\sum n_i}, \quad i = 1, 2, \ldots, k$

This mean value $\bar{c}$ is the central line of the 'c'-chart, and the control limits are worked out as below:

$UCL = \bar{c} + 3\sqrt{\bar{c}}$
$LCL = \bar{c} - 3\sqrt{\bar{c}}$

Example 1
A manufacturer of rope selects six samples of five ropes each and tests the breaking strength of each. Construct an $\bar{X}$-chart and an R-chart for the data shown in Table 2.2.

Sample    Breaking Strength (pounds)    $\bar{X}$    R
1         46  47  45  46  47            46.2         2
2         50  51  52  53  49            51.0         4
3         48  51  50  50  49            49.6         3
4         52  50  49  50  51            50.4         3
5         51  47  46  48  47            47.8         5
6         49  51  50  51  52            50.6         3
Total                                   295.6        20

Using

$\bar{\bar{X}} = \frac{\sum \bar{X}_i}{k} = \frac{46.2 + 51.0 + \cdots + 50.6}{6} = \frac{295.6}{6} = 49.3$

Also,

$\bar{R} = \frac{\sum R_i}{k} = \frac{2 + 4 + \cdots + 3}{6} = \frac{20}{6} = 3.33$

The mean and the range for each sample are shown in the last two columns of the table. The grand mean and the mean range are computed from them. With this information, the UCL and LCL for $\bar{X}$ can be determined. Since each sample size is n = 5, Table C in Appendix 1 gives $A_2$ = 0.577. Then:

$UCL = \bar{\bar{X}} + A_2\bar{R}$ = 49.3 + (0.577)(3.33) = 51.22
$LCL = \bar{\bar{X}} - A_2\bar{R}$ = 49.3 − (0.577)(3.33) = 47.38

[Figure 2: $\bar{X}$-Chart for Rope Manufacturer, with UCL = 51.22 and LCL = 47.38.]

Notice that the mean for subgroup 1 reveals that the process is out of control. That mean (46.2) lies below the LCL, indicating the presence of assignable cause variation.

Example 4.2
Now, consider the problem faced by Okraku, the director of quality control measures for Taroxy Industries. His plant produces frames for desktop computers which must meet certain size specifications. To ensure these standards are met, Okraku collects k = 24 samples (subgroups), each of size n = 6, and measures their width.
The results are reported in the table below:

Sample    Sample Measurements                     $\bar{X}$    R
1         15.2  14.5  15.4  16.5  15.9  16.2      15.6167      2
2         16.2  15.4  15.9  15.2  15.2  14.5      15.4000      1.7
3         15.6  16.5  15.9  16.2  15.9  16.2      16.0500      0.9
4         18.5  14.8  15.7  15.2  16.8  14.2      15.8667      4.3
5         17.5  15.7  14.5  14.2  14.5  15.2      15.2667      3.3
6         14.3  15.9  16.5  14.8  15.4  14.8      15.2833      2.2
7         15.4  15.2  15.4  15.8  14.2  15.7      15.2833      1.6
8         18.0  14.5  14.4  16.2  14.8  16.8      15.7833      3.6
9         14.2  15.6  14.5  16.1  15.7  15.9      15.3333      1.9
10        15.7  16.5  14.5  14.8  16.8  16.1      15.7333      2.3
11        14.8  14.5  16.5  14.9  15.8  16.3      15.4667      2
12        16.8  15.8  15.2  15.8  15.7  16.2      15.9167      1.6
13        15.2  15.9  14.5  15.1  15.9  14.7      15.2167      1.4
14        15.4  15.7  16.8  15.3  14.8  14.9      15.4833      2
15        18.4  15.7  15.9  14.8  15.5  14.8      15.8500      3.6
16        16.5  16.8  15    15.7  16.9  14.7      15.9333      2.2
17        15.2  16.9  16.8  17    17.1  15.4      16.4000      1.9
18        16.8  17.2  18.9  18.5  18.5  18.9      18.1333      2.1
19        13.5  17.6  18.7  21.1  17.2  16        17.3500      7.6
20        19.8  14.5  20.8  19.2  19.2  18.7      18.7000      6.3
21        18.7  17.9  18.7  20.8  18.4  17.5      18.6667      3.3
22        17.5  18    18.2  20.2  14.2  17.8      17.6500      6
23        14.9  18.9  20    16.8  16.2  18.5      17.5500      5.1
24        18.7  17.9  17.4  18.7  17.2  16.5      17.7333      2.2
Total                                             391.6667     71.1

Using

$\bar{\bar{X}} = \frac{\sum \bar{X}_i}{k} = \frac{15.6167 + 15.4000 + \cdots + 17.7333}{24} = \frac{391.6667}{24} = 16.3194$

Also,

$\bar{R} = \frac{\sum R_i}{k} = \frac{2 + 1.7 + \cdots + 2.2}{24} = \frac{71.1}{24} = 2.9625$

The mean and range for each sample are shown in the last two columns of the table. The grand mean and the mean range are computed from them. With this information, the UCL and LCL for $\bar{X}$ can be determined. Since each sample size is n = 6, Table C in Appendix 1 gives $A_2$ = 0.483. Then,

$UCL = \bar{\bar{X}} + A_2\bar{R}$ = 16.3194 + (0.483)(2.9625) = 17.75
$LCL = \bar{\bar{X}} - A_2\bar{R}$ = 16.3194 − (0.483)(2.9625) = 14.89

Figure 3, which was produced using SPSS, is the control chart for Okraku. Notice that the means of subgroups 18, 20 and 21 reveal that the process is out of control: the means have increased to levels exceeding the UCL, indicating the presence of assignable cause variation. Perhaps over time the machines producing the computer parts have suffered unusual wear, resulting in improper performance.
[Figure 3: Xbar Chart of Frame of Desktop Computers. Vertical axis: sample mean; horizontal axis: sample number (1 to 24); lines mark the UCL, the centre line and the LCL.]

Alternatively, the variation might have been caused by the introduction of inferior raw materials obtained from a new supplier around the time sample 18 was taken. In any event, Okraku must locate and correct the cause of the unacceptable variation.

Estimating Process Capability: The $\bar{X}$ and R charts provide information about the performance or capability of the process. From the $\bar{X}$ chart, we may estimate the mean size of frames for desktop computers as $\bar{\bar{X}} = 16.3194$. The process standard deviation may be estimated using equation (2.10); that is,

$\hat{\sigma} = \frac{\bar{R}}{d_2} = \frac{2.9625}{2.534} = 1.1691 \approx 1.17$

where the value of $d_2$ for samples of size six is found in Table C of Appendix 1. The specification limits on the size of frames for desktop computers are 16.32 ± 0.5. Assuming that the size of a frame is a normally distributed random variable with mean 16.3194 and standard deviation 1.1691, we may estimate the fraction of frames that conform to specification as

$P\{15.82 < X < 16.82\} = P\left(\frac{15.82 - 16.3194}{1.1691} < Z < \frac{16.82 - 16.3194}{1.1691}\right) = P(-0.43 < Z < 0.43) = 2\Phi(0.43) - 1 = 2(0.66640) - 1 = 0.3328$

so the estimated fraction of nonconforming frames is p = 1 − 0.3328 = 0.6672.

Another way to express process capability is in terms of the process capability ratio (PCR) $C_p$, which for a quality characteristic with both upper and lower specification limits (USL and LSL, respectively) is

$C_p = \frac{USL - LSL}{6\sigma}$

Note that the 6σ spread of the process is the basic definition of process capability. Since σ is usually unknown, we must replace it with an estimate. We frequently use $\hat{\sigma} = \bar{R}/d_2$ as an estimate of σ, resulting in an estimate $\hat{C}_p$ of $C_p$. For the size of frames for desktop computers, since $\hat{\sigma} = \bar{R}/d_2 = 1.1691$, we find that

$\hat{C}_p = \frac{16.82 - 15.82}{6(1.1691)} = 0.1426$

This implies that the "natural" tolerance limits of the process (three-sigma above and below the mean) lie well outside the lower and upper specification limits. Consequently, a large proportion of nonconforming frames of desktop computers will be produced.

The PCR $C_p$ may be interpreted another way. The quantity

$P = \left(\frac{1}{C_p}\right)100\%$

is simply the percentage of the specification band that the process uses up. For the frames of desktop computers an estimate of P is

$\hat{P} = \left(\frac{1}{\hat{C}_p}\right)100\% = \left(\frac{1}{0.1426}\right)100\% \approx 701\%$

That is, the process uses up about 701% of the specification band.

Interpretation of $\bar{X}$ and R Charts

Note that a control chart can indicate an out-of-control condition even though no single point plots outside the control limits, if the pattern of the plotted points exhibits nonrandom or systematic behaviour. In many cases, the pattern of the plotted points will provide useful diagnostic information on the process, and this information can be used to make process modifications that reduce variability (the goal of statistical process control). Furthermore, these patterns occur fairly often in phase I (retrospective study of past data), and their elimination is frequently crucial in bringing a process into control.

In this section, we briefly discuss the interpretation of control charts, focusing on some of the more common patterns that appear on $\bar{X}$ and R charts and some of the process characteristics that may produce the patterns. To interpret $\bar{X}$ and R charts effectively, the analyst must be familiar with both the statistical principles underlying the control chart and the process itself. Additional information on the interpretation of patterns on control charts is in the Western Electric Statistical Quality Control Handbook (1956, pp. 149–183).

In interpreting patterns on the $\bar{X}$ chart, we must first determine whether or not the R chart is in control. Some assignable causes show up on both the $\bar{X}$ and R charts. If both the $\bar{X}$ and R charts exhibit a nonrandom pattern, the best strategy is to eliminate the R chart assignable causes first. In many cases, this will automatically eliminate the nonrandom pattern on the $\bar{X}$ chart.
Never attempt to interpret the $\bar{X}$ chart when the R chart indicates an out-of-control condition.

Cyclic patterns occasionally appear on the control chart. Such a pattern on the $\bar{X}$ chart may result from systematic environmental changes such as temperature, operator fatigue, regular rotation of operators and/or machines, or fluctuation in voltage or pressure or some other variable in the production equipment. R charts will sometimes reveal cycles because of maintenance schedules, operator fatigue, or tool wear resulting in excessive variability.

A mixture is indicated when the plotted points tend to fall near or slightly outside the control limits, with relatively few points near the center line. A mixture pattern is generated by two (or more) overlapping distributions generating the process output. The severity of the mixture pattern depends on the extent to which the distributions overlap. Sometimes mixtures result from "over control," where the operators make process adjustments too often, responding to random variation in the output rather than systematic causes. A mixture pattern can also occur when output product from several sources (such as parallel machines) is fed into a common stream which is then sampled for process monitoring purposes.

A shift in process level. These shifts may result from the introduction of new workers; changes in methods, raw materials, or machines; a change in the inspection method or standards; or a change in either the skill, attentiveness, or motivation of the operators. Sometimes an improvement in process performance is noted following introduction of a control chart program, simply because of motivational factors influencing the workers.

A trend, or continuous movement in one direction. Trends are usually due to a gradual wearing out or deterioration of a tool or some other critical process component. In chemical processes they often occur because of settling or separation of the components of a mixture.
They can also result from human causes, such as operator fatigue or the presence of supervision. Finally, trends can result from seasonal influences, such as temperature. When trends are due to tool wear or other systematic causes of deterioration, this may be directly incorporated into the control chart model. A device useful for monitoring and analyzing processes with trends is the regression control chart; see Mandel (1969). The modified control chart, discussed in Chapter 9, is also used when the process exhibits tool wear.

In interpreting patterns on the $\bar{X}$ and R charts, one should consider the two charts jointly. If the underlying distribution is normal, then the random variables $\bar{X}$ and R computed from the same sample are statistically independent. Therefore, $\bar{X}$ and R should behave independently on the control chart. If there is correlation between the $\bar{X}$ and R values, that is, if the points on the two charts "follow" each other, then this indicates that the underlying distribution is skewed. If specifications have been determined assuming normality, then those analyses may be in error.

Example 3
With reference to Example 1, how is a range chart or R-chart constructed?

Solution
The first step is to find the mean range, $\bar{R}$. This is done by first finding the range R of each sample, as shown in the last column of the solution to Example 1. The mean range $\bar{R}$ is 3.33, found by (2 + 4 + 3 + 3 + 5 + 3)/6 = 20/6 = 3.33. Referring to the Appendix for $D_3$ and $D_4$ and a sample size of 5, $D_3$ = 0 and $D_4$ = 2.115. The lower and upper control limits for the range chart are then:

$LCL = D_3\bar{R}$ = (0)(3.33) = 0
$UCL = D_4\bar{R}$ = 2.115(3.33) = 7.043

[Figure 4: R chart with UCL = 7.043 and CL = 3.33.]

Example 4
A quality control inspector at the Cocoa Fizz soft drink company has taken twenty-five samples with four observations each of the volume of bottles filled. The data and the computed means are shown in the table.
If the standard deviation of the bottling operation is 0.14 ounces, use this information to develop control limits of three standard deviations for the bottling operation.

Sample    Observations (bottle volume in ounces)    Average    Range
Number    1       2       3       4                 $\bar{x}$  R
1         15.85   16.02   15.83   15.93             15.91      0.19
2         16.12   16      15.85   16.01             15.99      0.27
3         16      15.91   15.94   15.83             15.92      0.17
4         16.2    15.85   15.74   15.93             15.93      0.46
5         15.74   15.86   16.21   16.1              15.98      0.47
6         15.94   16.01   16.14   16.03             16.03      0.2
7         15.75   16.21   16.01   15.86             15.96      0.46
8         15.82   15.94   16.02   15.94             15.93      0.2
9         16.04   15.98   15.83   15.98             15.96      0.21
10        15.64   15.86   15.94   15.89             15.83      0.3
11        16.11   16      16.01   15.82             15.99      0.29
12        15.72   15.85   16.12   16.15             15.96      0.43
13        15.85   15.76   15.74   15.98             15.83      0.24
14        15.73   15.84   15.96   16.1              15.91      0.37
15        16.2    16.01   16.1    15.89             16.05      0.31
16        16.12   16.08   15.83   15.94             15.99      0.29
17        16.01   15.93   15.81   15.68             15.86      0.33
18        15.78   16.04   16.11   16.12             16.01      0.34
19        15.84   15.92   16.05   16.12             15.98      0.28
20        15.92   16.09   16.12   15.93             16.02      0.2
21        16.11   16.02   16      15.88             16         0.23
22        15.98   15.82   15.89   15.89             15.9       0.16
23        16.05   15.73   15.73   15.93             15.86      0.32
24        16.01   16.01   15.89   15.86             15.94      0.15
25        16.08   15.78   15.92   15.98             15.94      0.3
Total                                               398.75     7.17

Central line = $d_2\sigma$ = 2.059(0.14 ounces) = 0.2882 ounces

Alternatively, we can use the mean range, which gives a similar figure:

Central line = $\bar{R} = \frac{0.19 + 0.27 + \cdots + 0.30}{25} = \frac{7.17}{25} = 0.2868$

Upper Control Limit (UCL) = $D_4\bar{R}$ = 2.282(0.2882) = 0.6577
Lower Control Limit (LCL) = $D_3\bar{R}$ = 0(0.2882) = 0

[Figure 5: R chart with UCL = 0.6577 and CL = 0.2882.]

Example 4.5
A production manager at a tire manufacturing plant has inspected the number of defective tires in twenty random samples with twenty observations each. The number of defective tires found in each sample is given below. Construct a three-sigma control chart with this information.
Sample    Number of          Number of               Fraction
Number    Defective Tires    Observations Sampled    Defective
1         3                  20                      0.15
2         2                  20                      0.1
3         1                  20                      0.05
4         2                  20                      0.1
5         1                  20                      0.05
6         3                  20                      0.15
7         3                  20                      0.15
8         2                  20                      0.1
9         1                  20                      0.05
10        2                  20                      0.1
11        3                  20                      0.15
12        2                  20                      0.1
13        2                  20                      0.1
14        1                  20                      0.05
15        1                  20                      0.05
16        2                  20                      0.1
17        4                  20                      0.2
18        3                  20                      0.15
19        1                  20                      0.05
20        1                  20                      0.05
Total                                                2.00

$\bar{p} = \frac{0.15 + 0.10 + \cdots + 0.05}{20} = \frac{2.00}{20} = 0.10$

$UCL = \bar{p} + 3\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} = 0.10 + 3\sqrt{\frac{(0.10)(0.90)}{20}} = 0.10 + 3(0.067) = 0.301$

$LCL = \bar{p} - 3\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} = 0.10 - 3\sqrt{\frac{(0.10)(0.90)}{20}} = 0.10 - 0.201 = -0.101 \approx 0$

Since a fraction defective cannot be negative, the LCL is taken as 0.

[Figure 6: p chart with UCL = 0.30 and CL = 0.1.]

Example 4.6
Frozen orange juice concentrate is packed in 6-oz cardboard cans. These cans are formed on a machine by spinning them from cardboard stock and attaching a metal bottom panel. By inspection of a can, we may determine whether, when filled, it could possibly leak either on the side seam or around the bottom joint. Such a nonconforming can has an improper seal on either the side seam or the bottom panel. Set up a control chart to improve the fraction of nonconforming cans produced by this machine. Thirty samples of n = 50 cans each were selected at half-hour intervals over a three-shift period in which the machine was in continuous operation. The data are shown in Table 7.

Table 7: Data for Trial Control Limits, Sample Size n = 50

Sample    Number of Nonconforming    Number of               Sample Fraction
Number    Cans, Di                   Observations Sampled    Nonconforming, pi
1         12                         50                      0.24
2         15                         50                      0.3
3         8                          50                      0.16
4         10                         50                      0.2
5         4                          50                      0.08
6         7                          50                      0.14
7         16                         50                      0.32
8         9                          50                      0.18
9         14                         50                      0.28
10        10                         50                      0.2
11        5                          50                      0.1
12        6                          50                      0.12
13        17                         50                      0.34
14        12                         50                      0.24
15        22                         50                      0.44
16        8                          50                      0.16
17        10                         50                      0.2
18        5                          50                      0.1
19        13                         50                      0.26
20        11                         50                      0.22
21        20                         50                      0.4
22        18                         50                      0.36
23        24                         50                      0.48
24        15                         50                      0.3
25        9                          50                      0.18
26        12                         50                      0.24
27        7                          50                      0.14
28        13                         50                      0.26
29        9                          50                      0.18
30        6                          50                      0.12
Total     347                                                6.94

$\bar{p} = \frac{0.24 + 0.30 + \cdots + 0.12}{30} = \frac{6.94}{30} = 0.2313$

$UCL = \bar{p} + 3\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} = 0.2313 + 3\sqrt{\frac{(0.2313)(0.7687)}{50}} = 0.2313 + 3(0.0596) = 0.4102$

$LCL = \bar{p} - 3\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} = 0.2313 - 3\sqrt{\frac{(0.2313)(0.7687)}{50}} = 0.2313 - 3(0.0596) = 0.0524$

The control chart with center line at $\bar{p}$ = 0.2313 and the above upper and lower control limits is shown in Fig. 7. The sample fraction nonconforming from each preliminary sample is plotted on this chart. We note that two points, those from samples 15 and 23, plot above the upper control limit, so the process is not in control. These points must be investigated to see whether an assignable cause can be determined.

[Figure 7: p chart with UCL = 0.41 and CL = 0.23.]

TRY

Question 1
Ten samples, each of 50 items, were taken from a given production process and the number of defectives in each sample was recorded as follows: 1, 1, 1, 2, 0, 4, 2, 0, 2, 2. Draw the control chart for fraction defective and plot the points on it. Comment on the state of the process.

Question 2
A typist has been given a new electric typewriter in place of the old manual one. After a week, the typist, while reading proofs, finds that the number of errors on the last 10 consecutive pages has been 7, 9, 3, 5, 5, 6, 3, 1, 0, and 0. With joy the typist announces, "I have got my typing under control again. My object was to get my average errors per page down to zero and I have done it." Comment.

Question 3
Construct an $\bar{X}$-chart and an R-chart from the following information and state whether the process concerned is in control.

Sample       1   2   3   4   5   6   7   8   9   10
$\bar{X}$   20  34  45  39  26  29  12  34  37  23
R           23  39  14   5  20  17  21  11  40  10

Question 4
A machine is set to deliver packets of given weights. Ten samples of size 5 each were recorded. Below are the relevant data.

Sample              1   2   3   4   5   6   7   8   9   10
Mean ($\bar{X}$)   15  17  15  18  17  14  18  15  17  16
Range (R)           7   7   4   9   8   7  12   4  11   5

Calculate the values of the central line and the control limits for the mean chart and the range chart, and then comment on the state of control. Conversion factors for n = 5 are A2 = 0.58 and D4 = 2.115.

Question 5
Six samples of emergency room bills were selected. The data are shown here. Construct and analyze an $\bar{X}$ chart and an R chart for the data.
Sample    Emergency room bill ($)
1         82   95   86   97   93
2         84   90   99   110  116
3         53   62   43   55   58
4         97   89   90   100  102
5         88   84   87   87   82
6         91   93   95   99   86

Question 6
Five samples of shafts for miniature motors were selected and their diameters measured. The data (in inches) are shown here. Construct and analyze an $\bar{X}$ chart and an R chart for them.

Sample    Shaft diameters (inches)
1         1.56  1.54  1.55  1.59
2         1.52  1.55  1.50  1.56
3         1.49  1.48  1.51  1.50
4         1.43  1.50  1.56  1.51
5         1.51  1.56  1.49  1.52

Question 7
Six men's suits are checked for defects in the materials and seams. The number of defects for each is shown here. Construct and analyze a $\bar{c}$ chart for the data.

Suit              1  2  3  4  5  6
No. of Defects    3  6  9  5  7  8

Question 8
Eight samples of water pumps are selected and tested for leaks. Those that leaked are considered defective. Construct and analyze a $\bar{p}$ chart for the data shown here.

Sample    Size    Number defective
1         10      6
2         10      0
3         10      2
4         10      1
5         10      3
6         10      2
7         10      1
8         10      0

APPENDIX 1 (STATISTICAL TABLES)

A. STANDARD NORMAL DISTRIBUTION: Table Values Represent AREA to the LEFT of the Z score.
Z     .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
0.0   .50000 .50399 .50798 .51197 .51595 .51994 .52392 .52790 .53188 .53586
0.1   .53983 .54380 .54776 .55172 .55567 .55962 .56356 .56749 .57142 .57535
0.2   .57926 .58317 .58706 .59095 .59483 .59871 .60257 .60642 .61026 .61409
0.3   .61791 .62172 .62552 .62930 .63307 .63683 .64058 .64431 .64803 .65173
0.4   .65542 .65910 .66276 .66640 .67003 .67364 .67724 .68082 .68439 .68793
0.5   .69146 .69497 .69847 .70194 .70540 .70884 .71226 .71566 .71904 .72240
0.6   .72575 .72907 .73237 .73565 .73891 .74215 .74537 .74857 .75175 .75490
0.7   .75804 .76115 .76424 .76730 .77035 .77337 .77637 .77935 .78230 .78524
0.8   .78814 .79103 .79389 .79673 .79955 .80234 .80511 .80785 .81057 .81327
0.9   .81594 .81859 .82121 .82381 .82639 .82894 .83147 .83398 .83646 .83891
1.0   .84134 .84375 .84614 .84849 .85083 .85314 .85543 .85769 .85993 .86214
1.1   .86433 .86650 .86864 .87076 .87286 .87493 .87698 .87900 .88100 .88298
1.2   .88493 .88686 .88877 .89065 .89251 .89435 .89617 .89796 .89973 .90147
1.3   .90320 .90490 .90658 .90824 .90988 .91149 .91309 .91466 .91621 .91774
1.4   .91924 .92073 .92220 .92364 .92507 .92647 .92785 .92922 .93056 .93189
1.5   .93319 .93448 .93574 .93699 .93822 .93943 .94062 .94179 .94295 .94408
1.6   .94520 .94630 .94738 .94845 .94950 .95053 .95154 .95254 .95352 .95449
1.7   .95543 .95637 .95728 .95818 .95907 .95994 .96080 .96164 .96246 .96327
1.8   .96407 .96485 .96562 .96638 .96712 .96784 .96856 .96926 .96995 .97062
1.9   .97128 .97193 .97257 .97320 .97381 .97441 .97500 .97558 .97615 .97670
2.0   .97725 .97778 .97831 .97882 .97932 .97982 .98030 .98077 .98124 .98169
2.1   .98214 .98257 .98300 .98341 .98382 .98422 .98461 .98500 .98537 .98574
2.2   .98610 .98645 .98679 .98713 .98745 .98778 .98809 .98840 .98870 .98899
2.3   .98928 .98956 .98983 .99010 .99036 .99061 .99086 .99111 .99134 .99158
2.4   .99180 .99202 .99224 .99245 .99266 .99286 .99305 .99324 .99343 .99361
2.5   .99379 .99396 .99413 .99430 .99446 .99461 .99477 .99492 .99506 .99520
2.6   .99534 .99547 .99560 .99573 .99585 .99598 .99609 .99621 .99632 .99643
2.7   .99653 .99664 .99674 .99683 .99693 .99702 .99711 .99720 .99728 .99736
2.8   .99744 .99752 .99760 .99767 .99774 .99781 .99788 .99795 .99801 .99807
2.9   .99813 .99819 .99825 .99831 .99836 .99841 .99846 .99851 .99856 .99861
3.0   .99865 .99869 .99874 .99878 .99882 .99886 .99889 .99893 .99896 .99900
3.1   .99903 .99906 .99910 .99913 .99916 .99918 .99921 .99924 .99926 .99929
3.2   .99931 .99934 .99936 .99938 .99940 .99942 .99944 .99946 .99948 .99950
3.3   .99952 .99953 .99955 .99957 .99958 .99960 .99961 .99962 .99964 .99965
3.4   .99966 .99968 .99969 .99970 .99971 .99972 .99973 .99974 .99975 .99976
3.5   .99977 .99978 .99978 .99979 .99980 .99981 .99981 .99982 .99983 .99983
3.6   .99984 .99985 .99985 .99986 .99986 .99987 .99987 .99988 .99988 .99989
3.7   .99989 .99990 .99990 .99990 .99991 .99991 .99992 .99992 .99992 .99992
3.8   .99993 .99993 .99993 .99994 .99994 .99994 .99994 .99995 .99995 .99995
3.9   .99995 .99995 .99996 .99996 .99996 .99996 .99996 .99996 .99997 .99997

B. t-DISTRIBUTION TABLE

t distribution critical values. Column headings give the upper-tail probability p; the last row gives the corresponding confidence level C.

df     .25    .20    .15    .10    .05    .025   .02    .01    .005   .0025  .001   .0005
1      1.000  1.376  1.963  3.078  6.314  12.71  15.89  31.82  63.66  127.3  318.3  636.6
2      0.816  1.061  1.386  1.886  2.920  4.303  4.849  6.965  9.925  14.09  22.33  31.60
3      0.765  0.978  1.250  1.638  2.353  3.182  3.482  4.541  5.841  7.453  10.21  12.92
4      0.741  0.941  1.190  1.533  2.132  2.776  2.999  3.747  4.604  5.598  7.173  8.610
5      0.727  0.920  1.156  1.476  2.015  2.571  2.757  3.365  4.032  4.773  5.893  6.869
6      0.718  0.906  1.134  1.440  1.943  2.447  2.612  3.143  3.707  4.317  5.208  5.959
7      0.711  0.896  1.119  1.415  1.895  2.365  2.517  2.998  3.499  4.029  4.785  5.408
8      0.706  0.889  1.108  1.397  1.860  2.306  2.449  2.896  3.355  3.833  4.501  5.041
9      0.703  0.883  1.100  1.383  1.833  2.262  2.398  2.821  3.250  3.690  4.297  4.781
10     0.700  0.879  1.093  1.372  1.812  2.228  2.359  2.764  3.169  3.581  4.144  4.587
11     0.697  0.876  1.088  1.363  1.796  2.201  2.328  2.718  3.106  3.497  4.025  4.437
12     0.695  0.873  1.083  1.356  1.782  2.179  2.303  2.681  3.055  3.428  3.930  4.318
13     0.694  0.870  1.079  1.350  1.771  2.160  2.282  2.650  3.012  3.372  3.852  4.221
14     0.692  0.868  1.076  1.345  1.761  2.145  2.264  2.624  2.977  3.326  3.787  4.140
15     0.691  0.866  1.074  1.341  1.753  2.131  2.249  2.602  2.947  3.286  3.733  4.073
16     0.690  0.865  1.071  1.337  1.746  2.120  2.235  2.583  2.921  3.252  3.686  4.015
17     0.689  0.863  1.069  1.333  1.740  2.110  2.224  2.567  2.898  3.222  3.646  3.965
18     0.688  0.862  1.067  1.330  1.734  2.101  2.214  2.552  2.878  3.197  3.611  3.922
19     0.688  0.861  1.066  1.328  1.729  2.093  2.205  2.539  2.861  3.174  3.579  3.883
20     0.687  0.860  1.064  1.325  1.725  2.086  2.197  2.528  2.845  3.153  3.552  3.850
21     0.686  0.859  1.063  1.323  1.721  2.080  2.189  2.518  2.831  3.135  3.527  3.819
22     0.686  0.858  1.061  1.321  1.717  2.074  2.183  2.508  2.819  3.119  3.505  3.792
23     0.685  0.858  1.060  1.319  1.714  2.069  2.177  2.500  2.807  3.104  3.485  3.768
24     0.685  0.857  1.059  1.318  1.711  2.064  2.172  2.492  2.797  3.091  3.467  3.745
25     0.684  0.856  1.058  1.316  1.708  2.060  2.167  2.485  2.787  3.078  3.450  3.725
26     0.684  0.856  1.058  1.315  1.706  2.056  2.162  2.479  2.779  3.067  3.435  3.707
27     0.684  0.855  1.057  1.314  1.703  2.052  2.158  2.473  2.771  3.057  3.421  3.690
28     0.683  0.855  1.056  1.313  1.701  2.048  2.154  2.467  2.763  3.047  3.408  3.674
29     0.683  0.854  1.055  1.311  1.699  2.045  2.150  2.462  2.756  3.038  3.396  3.659
30     0.683  0.854  1.055  1.310  1.697  2.042  2.147  2.457  2.750  3.030  3.385  3.646
40     0.681  0.851  1.050  1.303  1.684  2.021  2.123  2.423  2.704  2.971  3.307  3.551
50     0.679  0.849  1.047  1.299  1.676  2.009  2.109  2.403  2.678  2.937  3.261  3.496
60     0.679  0.848  1.045  1.296  1.671  2.000  2.099  2.390  2.660  2.915  3.232  3.460
80     0.678  0.846  1.043  1.292  1.664  1.990  2.088  2.374  2.639  2.887  3.195  3.416
100    0.677  0.845  1.042  1.290  1.660  1.984  2.081  2.364  2.626  2.871  3.174  3.390
1000   0.675  0.842  1.037  1.282  1.646  1.962  2.056  2.330  2.581  2.813  3.098  3.300
z*     0.674  0.841  1.036  1.282  1.645  1.960  2.054  2.326  2.576  2.807  3.091  3.291
C      50%    60%    70%    80%    90%    95%    96%    98%    99%    99.5%  99.8%  99.9%

C.
Table of Control Chart Constants X- and R-Charts X- and S-Charts ____________________________ ____________________________ n d2 d3 C4 A2 D3 D4 A3 B3 B4 2 1.128 0.8525 0.7979 1.880 — 3.267 2.659 — 3.267 3 1.693 0.8884 0.8862 1.023 — 2.574 1.954 — 2.568 4 2.059 0.8798 0.9213 0.729 — 2.282 1.628 — 2.266 5 2.326 0.8798 0.9400 0.577 — 2.114 1.427 — 2.089 6 2.534 0.8480 0.9515 0.483 — 2.004 1.287 0.030 1.970 7 2.704 0.8332 0.9594 0.419 0.076 1.924 1.182 0.118 1.882 8 2.847 0.8198 0.9650 0.373 0.136 1.864 1.099 0.185 1.815 9 2.970 0.8078 0.9693 0.337 0.184 1.816 1.032 0.239 1.761 10 3.078 0.7971 0.9727 0.308 0.223 1.777 0.975 0.284 1.716 11 3.173 0.7873 0.9754 0.285 0.256 1.744 0.927 0.321 1.679 12 3.258 0.7785 0.9776 0.266 0.283 1.717 0.886 0.354 1.646 13 3.336 0.7704 0.9794 0.249 0.307 1.693 0.850 0.382 1.618 14 3.407 0.7630 0.9810 0.235 0.328 1.672 0.817 0.406 1.594 15 3.472 0.7562 0.9823 0.223 0.347 1.653 0.789 0.428 1.572 16 3.532 0.7499 0.9835 0.212 0.363 1.637 0.763 0.448 1.552 17 3.588 0.7441 0.9845 0.203 0.378 1.662 0.739 0.466 1.534 18 3.640 0.7386 0.9854 0.194 0.391 1.607 0.718 0.482 1.518 19 3.689 0.7335 0.9862 0.187 0.403 1.597 0.698 0.497 1.503 20 3.735 0.7287 0.9869 0.180 0.415 1.585 0.680 0.510 1.490 21 3.778 0.7272 0.9876 0.173 0.425 1.575 0.663 0.523 1.477 22 3.819 0.7199 0.9882 0.167 0.434 1.566 0.647 0.534 1.466 23 3.858 0.1759 0.9887 0.162 0.443 1.557 0.633 0.545 1.455 24 3.895 0.7121 0.9892 0.157 0.451 1.548 0.619 0.555 1.445 25 3.931 0.7084 0.9896 0.153 0.459 1.541 0.606 0.565 1.435 107
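
The S-chart columns of the control chart constants table can be spot-checked numerically. The sketch below is not part of the original notes. It assumes the usual textbook definitions for normally distributed data, c4 = sqrt(2/(n-1)) * Gamma(n/2) / Gamma((n-1)/2), A3 = 3/(c4*sqrt(n)) and B3, B4 = 1 -/+ 3*sqrt(1 - c4^2)/c4, none of which are stated in this appendix. The function names c4 and s_chart_constants are hypothetical. A negative B3 is clamped to zero, which corresponds to the rows where the table shows no lower limit.

```python
import math


def c4(n: int) -> float:
    """Bias-correction constant for the sample standard deviation of n normal observations."""
    if n < 2:
        raise ValueError("subgroup size n must be at least 2")
    # Ratio of gamma functions computed through lgamma for numerical stability.
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)


def s_chart_constants(n: int) -> dict:
    """Return C4, A3, B3 and B4 for subgroup size n (assumed 3-sigma limits)."""
    c = c4(n)
    spread = 3.0 * math.sqrt(1.0 - c * c) / c
    return {
        "C4": c,
        "A3": 3.0 / (c * math.sqrt(n)),
        "B3": max(0.0, 1.0 - spread),  # clamped: no lower limit for small n
        "B4": 1.0 + spread,
    }


if __name__ == "__main__":
    # Print a few rows for comparison with the tabulated values.
    for n in (2, 5, 10, 25):
        k = s_chart_constants(n)
        print(n, round(k["C4"], 4), round(k["A3"], 3),
              round(k["B3"], 3), round(k["B4"], 3))
```

Running the script prints the computed constants for n = 2, 5, 10 and 25, which can be compared row by row with the C4, A3, B3 and B4 columns.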
