Uploaded by Amponsah Joseph

BTEC MECH NOTES FOR PRINTING--

CHAPTER 1
RANDOM VARIABLE
INTRODUCTION
Engineering quantities whose variations contain elements of chance are called random
variables. Examples of such engineering quantities are weight, force, resistance and length.
SOME EXAMPLES OF ENGINEERING RANDOM VARIABLES
1. The diameter of a motor shaft with a nominal size 0.2m.
2. The weight of a steel box used to contain engineering tools.
3. The number of components passing a point on a factory’s production line in one minute.
4. The Length of time a machine works without failing.
5. The nominal resistance value of resistors.
6. The length of a bridge.
7. The force required to stretch a specific length of metal.
8. The number of bits of a computer.
CLASSIFICATION OF RANDOM VARIABLES
Random variables can be classified into two types: discrete and continuous random variables.
All the quantities listed above vary. In 1, 2 and 4, the values vary continuously; that is, they can assume any value within some range. For example, a motor shaft may have any diameter between 0.197 m and 0.203 m, and a steel box can have any weight between 0.345 kg and 0.352 kg. These are examples of continuous variables. The value can only be measured to a certain accuracy, which depends on the measuring device used; the shaft diameter may, for instance, be measured to the nearest tenth of a millimetre.
In 3 and 5, the variables can assume only a limited set of values. For example, the number of components passing a point will be a nonnegative integer 0, 1, 2, . . . The nominal resistance value of a resistor takes one of a limited number of values specified by the manufacturer in its catalogue. Variables such as these, which can assume only a limited set of values, are called discrete variables.
Definition 1 (Random Variables)
Let S be the sample space associated with some experiment. A random variable X is a function
that assigns a real number X(s) to each sample point s ∈ S.
Consider an experiment of tossing two fair coins simultaneously. Let the random variable X be the number of tails that show up. The value of X at each sample point is shown below:
Sample point    x
HH              0
HT              1
TH              1
TT              2
The random variable X which defines the number of tails for the experiment has the values 0, 1,
and 2.
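The idea that a random variable is simply a function on the sample space can be sketched in a few lines of Python. This is an illustration only; the names `sample_space` and `number_of_tails` are chosen here and do not come from the notes.

```python
from itertools import product

# Sample space for tossing two coins: every ordered pair of H/T.
sample_space = ["".join(outcome) for outcome in product("HT", repeat=2)]

def number_of_tails(sample_point):
    """The random variable X: maps a sample point to a real number."""
    return sample_point.count("T")

# Value of X at each sample point, and the set of values X can take.
x_values = {s: number_of_tails(s) for s in sample_space}
support = sorted(set(x_values.values()))
print(x_values)
print(support)
```

The dictionary reproduces the table above, and `support` lists the distinct values of X.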
PROBABILITY DISTRIBUTIONS
Discrete Random Variables
A probability distribution of a discrete random variable X is a sequence of the values $x_i$ (i = 1, 2, . . .) of X, together with a probability assigned to each point $x_i$.
The set of values $x_i$ may be either finite or countably infinite and may be listed in any order, though for convenience they are usually listed in increasing order of magnitude. The probability distribution of a discrete random variable is more often called a probability function (pf) or a probability mass function (pmf), and is denoted by $p(x_i)$ or $P(X = x_i)$. It is the probability that the random variable X assumes the value $x_i$.
Representation of the probability distribution of X:
x_i       x_1      x_2      x_3      ...    x_n
p(x_i)    p(x_1)   p(x_2)   p(x_3)   ...    p(x_n)
For a discrete random variable, the sum of the probabilities is 1,
i.e. $\sum_{i=1}^{n} P(X = x_i) = 1$.
Properties of Probability Mass Function
1. $p(x_i) \ge 0$ for all $x_i$.
2. $\sum_{i=1}^{n} p(x_i) = 1$, where the sum is over all $x_i$.
Example 1
A discrete function is given by
$p(x) = \frac{1}{21}(2x + 3)$ for $x = 1, 2, 3$, and $p(x) = 0$ otherwise.
Verify that it is a probability function of some variable.
Solution
Clearly $p(x_i) \ge 0$, and
$\frac{1}{21}\sum_{x=1}^{3}(2x + 3) = \frac{1}{21}[(2 + 3) + (4 + 3) + (6 + 3)] = \frac{1}{21}(5 + 7 + 9) = 1$.
Hence the function is a pmf.
Example 2
A discrete random variable has a pmf
$p(x) = k(x - 1)$ for $x = 3, 4, 5$, and $p(x) = 0$ otherwise.
Find the constant k for which p(x) is a probability function.
Solution
Since the function is a pmf,
$k\sum_{x=3}^{5}(x - 1) = k[(3 - 1) + (4 - 1) + (5 - 1)] = 9k = 1$,
from which $k = \frac{1}{9}$.
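The two checks used in Examples 1 and 2 (non-negative values, total probability 1) can be sketched with exact fractions. The helper names `is_pmf` and `normalising_constant` are made up for this illustration.

```python
from fractions import Fraction

def normalising_constant(weights):
    """Return k such that k * weight sums to 1 over the support."""
    total = sum(weights.values())
    return Fraction(1, total)

def is_pmf(pmf):
    """Check both pmf properties: non-negative values that sum to 1."""
    return all(p >= 0 for p in pmf.values()) and sum(pmf.values()) == 1

# Example 1: p(x) = (2x + 3)/21 on x = 1, 2, 3.
pmf_example_1 = {x: Fraction(2 * x + 3, 21) for x in (1, 2, 3)}

# Example 2: p(x) = k(x - 1) on x = 3, 4, 5; solve for k, then build the pmf.
weights = {x: x - 1 for x in (3, 4, 5)}
k = normalising_constant(weights)
pmf_example_2 = {x: k * w for x, w in weights.items()}
```

Using `Fraction` rather than floats keeps the sum exactly equal to 1, so the equality test in `is_pmf` is safe.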
Continuous Random Variables
The probability distribution of a continuous random variable is more often called a probability
density function (pdf), or simply a density function, and is denoted by f(x).
Properties of Probability Density Function
1. $f(x) \ge 0$ for all x.
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
Example 3
Let X be a continuous random variable such that
$f(x) = \frac{1}{8}x$ for $0 \le x \le 4$, and $f(x) = 0$ elsewhere.
(a) Show that f(x) is a pdf.
(b) Sketch the graph.
Solution
(a) It is clear that $f(x) \ge 0$, so property 1 is satisfied. For f(x) to be a pdf, it must also satisfy
property 2. That is,
$\int_0^4 \frac{1}{8}x\,dx = \left[\frac{1}{16}x^2\right]_0^4 = \frac{1}{16}(16) = 1$.
Hence f(x) is a pdf.
(b) The graph is the straight line from (0, 0) to (4, 1/2), with f(x) = 0 elsewhere.
Example 4
A random variable X has the pdf
$f(x) = kx$ for $0 \le x \le 4$, and $f(x) = 0$ elsewhere,
where k is a constant.
(a) Find the constant k.
(b) Compute P(2 < X < 3).
Solution
(a) Since the function is a pdf,
$\int_0^4 kx\,dx = \left[\frac{k}{2}x^2\right]_0^4 = \frac{k}{2}(16) = 1$, so $k = \frac{1}{8}$.
(b) $P(2 < X < 3) = \int_2^3 \frac{1}{8}x\,dx = \left[\frac{1}{16}x^2\right]_2^3 = \frac{1}{16}(9) - \frac{1}{16}(4) = \frac{5}{16}$.
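The integrals in Example 4 can be checked numerically. The sketch below uses a simple midpoint rule written from scratch; `integrate` and `pdf` are illustrative names, and the step count is an arbitrary choice.

```python
def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

def pdf(x):
    """Density from Example 4 with k = 1/8: f(x) = x/8 on [0, 4]."""
    return x / 8 if 0 <= x <= 4 else 0.0

total_area = integrate(pdf, 0, 4)      # should be close to 1
prob_2_to_3 = integrate(pdf, 2, 3)     # should be close to 5/16 = 0.3125
```

For a linear density the midpoint rule is exact up to rounding, so both values agree with the hand calculation.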
CUMULATIVE DISTRIBUTION FUNCTIONS
If X is a random variable and x is any real number, the cumulative distribution function (cdf) of X is the
function F defined as the probability that the random variable X takes a value less than or equal to x,
i.e. $F(x) = P(X \le x)$
or
$F(x) = P(-\infty < X \le x)$.
Discrete Random Variable
Let X be a discrete random variable with probability mass function p(x). Then the cumulative distribution function is
defined as
$F(x) = \sum_{x_i \le x} p(x_i)$,
i.e.
$F(x) = 0$ for $-\infty < x < x_1$,
$F(x) = p(x_1)$ for $x_1 \le x < x_2$,
$F(x) = p(x_1) + p(x_2)$ for $x_2 \le x < x_3$,
...
$F(x) = p(x_1) + p(x_2) + \cdots + p(x_n) = 1$ for $x_n \le x < \infty$.
Continuous Random Variable
Let X be a continuous random variable with probability density function f(x). Then the cumulative distribution function is
defined as
$F(x) = \int_{-\infty}^{x} f(t)\,dt$.
If f(x) is defined over $a \le x \le b$, then
$F(x) = 0$ for $x < a$,
$F(x) = \int_a^x f(t)\,dt$ for $a \le x \le b$,
$F(x) = 1$ for $x > b$.
EXPECTATION OF X
Discrete Random Variables
The expectation of X (expected value or mean), written E(X), is given by
$E(X) = \sum_{i=1}^{n} x_i P(X = x_i)$ or $E(X) = \sum_{i=1}^{n} x_i p(x_i)$.
This can simply be expressed in the form
$E(X) = \sum_i x_i p(x_i)$, $i = 1, 2, \ldots, n$, and we write $\mu = E(X)$.
Continuous Random Variables
The expectation of X (expected value or mean), written E(X), is given by
$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$.
VARIANCE OF X
For a discrete or continuous random variable X with $\mu = E(X)$, the variance, written Var(X), is defined as follows:
$Var(X) = E[(X - \mu)^2]$
$= E(X^2 - 2\mu X + \mu^2)$
$= E(X^2) - 2\mu E(X) + \mu^2$
$= E(X^2) - 2\mu^2 + \mu^2$
$= E(X^2) - \mu^2$
We may therefore write $Var(X) = E(X^2) - [E(X)]^2$.
The standard deviation is $\sigma = \sqrt{Var(X)}$.
Example 5
X is the random variable 'the number on a biased die', and the probability distribution of X is as shown.
x           1      2      3      4      5      6
P(X = x)    1/6    1/6    1/5    y      1/5    1/6
Find (a) the value of y, (b) E(X), (c) $E(X^2)$, (d) Var(X), (e) P(X = 1), (f) P(X > 2), (g) P(X ≥ 3), (h) P(1 < X < 5).
Solution
a. Since $\sum_{i=1}^{6} p(x_i) = 1$,
1/6 + 1/6 + 1/5 + y + 1/5 + 1/6 = 1, so y = 1/10.
The completed distribution is
x           1      2      3      4      5      6
P(X = x)    1/6    1/6    1/5    1/10   1/5    1/6
b. $E(X) = \sum x P(X = x)$ = 1(1/6) + 2(1/6) + 3(1/5) + 4(1/10) + 5(1/5) + 6(1/6) = 3.5
c. $E(X^2)$ = 1²(1/6) + 2²(1/6) + 3²(1/5) + 4²(1/10) + 5²(1/5) + 6²(1/6) = 15 7/30
d. $Var(X) = E(X^2) - [E(X)]^2$ = 15 7/30 − 3.5² ≈ 2.98
e. P(X = 1) = 1/6
f. P(X > 2) = P(X = 3) + P(X = 4) + P(X = 5) + P(X = 6)
= 1/5 + 1/10 + 1/5 + 1/6 = 2/3
g. P(X ≥ 3) = P(X = 3) + P(X = 4) + P(X = 5) + P(X = 6)
= 1/5 + 1/10 + 1/5 + 1/6 = 2/3
h. P(1 < X < 5) = P(X = 2) + P(X = 3) + P(X = 4)
= 1/6 + 1/5 + 1/10 = 7/15
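The arithmetic in Example 5 can be reproduced exactly with fractions. The following is a sketch under the example's own numbers; the helper `expectation` is a name invented for this illustration.

```python
from fractions import Fraction as F

# Biased die from Example 5; P(X = 4) = y is left unknown at first.
known = {1: F(1, 6), 2: F(1, 6), 3: F(1, 5), 5: F(1, 5), 6: F(1, 6)}
y = 1 - sum(known.values())          # probabilities must sum to 1
pmf = {**known, 4: y}

def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a discrete pmf given as {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expectation(pmf)
mean_square = expectation(pmf, lambda x: x * x)
variance = mean_square - mean ** 2
```

Passing a function `g` lets the same helper compute both E(X) and E(X²), which mirrors the formula Var(X) = E(X²) − [E(X)]².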
Example 6
A discrete engineering random variable has a pmf
$P(X = x) = \frac{1}{3}(x - 2)$ for $x = 2, 3, 4$, and $P(X = x) = 0$ otherwise.
Calculate the expectation and variance of X.
Solution
$E(X) = \sum_{x=2}^{4} x P(X = x) = \sum_{x=2}^{4} x \cdot \frac{1}{3}(x - 2) = \frac{1}{3}\sum_{x=2}^{4}(x^2 - 2x)$
$= \frac{1}{3}[(2^2 - 2(2)) + (3^2 - 2(3)) + (4^2 - 2(4))] = \frac{11}{3} = 3\frac{2}{3}$
Please try to find the variance of X.
Example 7
Refer to Example 4. Find E(X) and Var(X).
Solution
$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_0^4 x \cdot \frac{1}{8}x\,dx = \frac{1}{8}\int_0^4 x^2\,dx = \frac{1}{8}\left[\frac{x^3}{3}\right]_0^4 = \frac{1}{8}\left(\frac{64}{3} - 0\right) = \frac{8}{3}$
$Var(X) = E(X^2) - [E(X)]^2$, so we need to find $E(X^2)$.
$E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx = \int_0^4 x^2 \cdot \frac{1}{8}x\,dx = \frac{1}{8}\int_0^4 x^3\,dx = \frac{1}{8}\left[\frac{x^4}{4}\right]_0^4 = \frac{1}{8}\left(\frac{256}{4} - 0\right) = 8$
Therefore $Var(X) = E(X^2) - [E(X)]^2 = 8 - \left(\frac{8}{3}\right)^2 = \frac{8}{9}$.
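Properties of Expectation and Variance of Independent Random Variables
For independent random variables X and Y and constants a and b:
1. $E(X \pm Y) = E(X) \pm E(Y)$
2. $E(aX \pm bY) = aE(X) \pm bE(Y)$
3. $Var(aX \pm bY) = a^2 Var(X) + b^2 Var(Y)$
4. $Var(a) = 0$
5. $Var(aX) = a^2 Var(X)$
6. $Var(aX + b) = a^2 Var(X)$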
Example 8
Independent random variables X and Y are such that E(X) = 4, E(Y) = 5, Var(X) = 1, Var(Y) = 2.
Find
a. E(4X + 2Y)
b. Var(3X + 2Y)
Solution
a. E(4X + 2Y) = 4E(X) + 2E(Y) = 4(4) + 2(5) = 26
b. Var(3X + 2Y) = 3² Var(X) + 2² Var(Y) = 9(1) + 4(2) = 17
SOME DISCRETE PROBABILITY DISTRIBUTIONS
Bernoulli process
$P(X = x) = p^x (1 - p)^{1 - x}$, $x = 0, 1$; $0 < p < 1$; $q = 1 - p$
Mean: $\mu = E(X) = p$
Variance: $Var(X) = p(1 - p) = pq$
The random variable of this experiment is a binary variable which assumes the values 0 and 1.
If x = 0 ⟹ $P(X = 0) = p^0 (1 - p)^1 = q$
If x = 1 ⟹ $P(X = 1) = p^1 (1 - p)^0 = p$
Binomial distribution
The binomial experiment is the generalization of the Bernoulli trial to n trials.
Conditions for a binomial distribution
1. A finite number, n, of trials is carried out.
2. The trials are independent and identical. Independent events are events in which the occurrence of one does not affect the occurrence of the other, e.g. successive tosses of a coin.
3. The outcome of each trial is either a success or a failure.
4. The probability p of a successful outcome is the same for each trial.
If these conditions hold, the number of successes X is written $X \sim B(n, p)$ or $X \sim Bin(n, p)$.
Note:
The number of trials, n, and the probability of success, p, are both needed to describe the
distribution completely. They are known as the parameters of the binomial distribution.
If $X \sim B(n, p)$, the probability of obtaining r successes in n trials is P(X = r), where
$P(X = r) = \binom{n}{r} p^r q^{n-r}$, $r = 0, 1, 2, 3, \ldots, n$,
$\binom{n}{r} = \frac{n!}{r!(n - r)!}$ and $n! = n(n - 1)(n - 2)(n - 3) \cdots 3 \cdot 2 \cdot 1$.
n = number of trials
p = probability of success on a single trial
q = 1 − p = probability of failure on a single trial
r = number of successes in n trials
Mean: $\mu = np$; variance: $\sigma^2 = npq$; standard deviation: $\sigma = \sqrt{npq}$
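The binomial formula above translates almost directly into code. The sketch below uses Python's standard `math.comb`; the function names and the B(5, 0.5) example values are chosen here for illustration and are not from the notes.

```python
from math import comb, sqrt

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def binomial_summary(n, p):
    """Mean, variance and standard deviation of B(n, p)."""
    q = 1 - p
    return n * p, n * p * q, sqrt(n * p * q)

# Illustrative values (not from the text): X ~ B(5, 0.5).
p_two_successes = binomial_pmf(2, 5, 0.5)     # 10 * 0.5**5 = 0.3125
mean, variance, std_dev = binomial_summary(5, 0.5)
```

Summing `binomial_pmf(r, n, p)` over r = 0, ..., n gives 1, which is a quick sanity check on any hand calculation.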
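Example 9
The random variable X is distributed B(7, 0.2). Find, correct to 3 decimal places,
a. P(X = 3)
b. P(1 < X ≤ 4)
c. P(X > 1)
Solution
p = 0.2, q = 0.8, n = 7
a. $P(X = 3) = \binom{7}{3}(0.2)^3(0.8)^4 = 0.115$
b. P(1 < X ≤ 4) = P(X = 2) + P(X = 3) + P(X = 4)
$= \binom{7}{2}(0.2)^2(0.8)^5 + \binom{7}{3}(0.2)^3(0.8)^4 + \binom{7}{4}(0.2)^4(0.8)^3$
= 0.419
c. P(X > 1) = P(X = 2) + P(X = 3) + ⋯ + P(X = 7)
= 1 − P(X ≤ 1)
= 1 − [P(X = 0) + P(X = 1)]
$= 1 - \left[(0.8)^7 + \binom{7}{1}(0.2)(0.8)^6\right]$
= 0.423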
Example 10
A box contains a large number of spare parts. The probability that a spare part is faulty is 0.1. How
many of the parts would you need to select to be more than 95% certain of picking at least one
faulty one?
Solution
p = 0.1, q = 0.9, n = ?
P(X ≥ 1) > 0.95
P(X ≥ 1) = 1 − P(X < 1)
= 1 − P(X = 0)
$= 1 - q^n$
$= 1 - 0.9^n$
$1 - 0.9^n > 0.95$
$0.05 > 0.9^n$
$0.9^n < 0.05$
Take logs of both sides:
n log 0.9 < log 0.05
Since log 0.9 is negative, dividing by it reverses the inequality:
n > 28.4
The least integer value is n = 29.
You need to select at least 29 parts.
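The same answer can be found by direct search instead of logarithms. This is a minimal sketch; the function name `least_trials_for_at_least_one` is invented for the illustration.

```python
def least_trials_for_at_least_one(p, target):
    """Smallest n with P(at least one success in n trials) > target."""
    n = 1
    while 1 - (1 - p) ** n <= target:
        n += 1
    return n

n_parts = least_trials_for_at_least_one(p=0.1, target=0.95)
```

The loop stops at the first n for which 1 − 0.9^n exceeds 0.95, which matches the value obtained from the inequality n log 0.9 < log 0.05.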
Example 11
Suppose that a consignment of 300 electrical fuses contains 5% defectives. If a random sample of
ten fuses is selected and tested, find the probability of observing at least three defectives.
Solution
X ~ B(10, 0.05)
P(X ≥ 3) = 1 − P(X < 3)
= 1 − P(X ≤ 2)
= 1 − [P(X = 0) + P(X = 1) + P(X = 2)]
$= 1 - \left[(0.95)^{10} + \binom{10}{1}(0.95)^9(0.05) + \binom{10}{2}(0.95)^8(0.05)^2\right]$
= 0.0115
Example 12
30% of pupils in a school travel to school by bus. From a sample of 10 pupils chosen at random,
find the probability that
a. only three travel by bus.
b. fewer than half travel by bus.
Solution
p = 0.3, q = 0.7, n = 10, so X ~ B(10, 0.3)
a. $P(X = 3) = \binom{10}{3}(0.3)^3(0.7)^7 = 0.267$
b. P(X < 5) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4)
$= (0.7)^{10} + \binom{10}{1}(0.3)(0.7)^9 + \binom{10}{2}(0.3)^2(0.7)^8 + \binom{10}{3}(0.3)^3(0.7)^7 + \binom{10}{4}(0.3)^4(0.7)^6$
= 0.850
Example 13
In a survey on washing powder, it is found that the probability that a shopper chooses Soapysuds
is 0.25. Find the probability that in a random sample of 9 shoppers
a. exactly 3 choose Soapysuds.
b. more than 7 choose Soapysuds.
Solution
p = 0.25, q = 0.75, n = 9
a. $P(X = 3) = \binom{9}{3}(0.25)^3(0.75)^6 = 0.234$
b. P(X > 7) = P(X = 8) + P(X = 9)
$= \binom{9}{8}(0.25)^8(0.75) + (0.25)^9$
= 0.000107
Example 14
A bag contains screwdrivers of which 40% have red handles and the rest yellow. A screwdriver
is taken from the bag, its handle colour noted and then replaced. This is performed eight times in
all. Calculate the probability that
a. exactly three will be red.
b. at least one will be red.
c. more than four will be yellow.
Solution
Let X be the number of red handles: n = 8, p = 0.4, q = 0.6
a. $P(X = 3) = \binom{8}{3}(0.4)^3(0.6)^5 = 0.279$
b. P(X ≥ 1) = 1 − P(X < 1)
= 1 − P(X = 0)
$= 1 - (0.6)^8$
= 0.983
c. Let Y be the number of yellow handles: n = 8, p = 0.6, q = 0.4
P(Y > 4) = 1 − P(Y ≤ 4)
= 1 − P(Y < 5)
= P(Y = 5) + P(Y = 6) + P(Y = 7) + P(Y = 8)
$= \binom{8}{5}(0.6)^5(0.4)^3 + \binom{8}{6}(0.6)^6(0.4)^2 + \binom{8}{7}(0.6)^7(0.4) + (0.6)^8$
= 0.594
Example 15
X ~ B(4, p) and P(X = 4) = 0.0256.
Find P(X = 2).
Solution
$P(X = 4) = \binom{4}{4} p^4 q^0 = p^4$
$0.0256 = p^4$
p = 0.4, so q = 0.6
⇒ $P(X = 2) = \binom{4}{2}(0.4)^2(0.6)^2 = 0.3456$
Example 16
A multiple choice test contains 25 questions, each with four answers. Assume a student just
guesses on each question.
a. What is the probability that the student answers more than 20 questions correctly?
b. What is the probability that the student answers fewer than 5 questions correctly?
Solution
n = 25, p = 0.25, q = 0.75
a. P(X > 20) = P(X = 21) + P(X = 22) + P(X = 23) + P(X = 24) + P(X = 25)
$= \binom{25}{21}(0.25)^{21}(0.75)^4 + \binom{25}{22}(0.25)^{22}(0.75)^3 + \binom{25}{23}(0.25)^{23}(0.75)^2 + \binom{25}{24}(0.25)^{24}(0.75) + (0.25)^{25}$
≈ 0 (about $9.7 \times 10^{-10}$)
Example 17
Suppose an experiment has two outcomes. The probability of success is 1/3 and the probability
of failure is 2/3. What is the probability of obtaining exactly one success in two trials?
Solution
Outcome        SS     SF     FS     FF
Probability    1/9    2/9    2/9    4/9
P[X = 1] = P[SF] + P[FS] = 2/9 + 2/9 = 4/9
Using the binomial distribution instead, with n = 2, x = 1, p = P(success) = 1/3 and q = P(failure) = 2/3, where p + q = 1:
$P[X = 1] = \binom{2}{1} p q = 2 \times \frac{1}{3} \times \frac{2}{3} = \frac{4}{9}$
Example 18
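In a game of chance, you play by rolling a fair die four times and you count the number of results
which are 6s.
i. What is the probability that in one play of the game you obtain exactly three 6s?
ii. What is the probability that in one play of the game you obtain exactly two 5s?
Solution
i. $P[X = 3] = \binom{4}{3}\left(\frac{1}{6}\right)^3\left(\frac{5}{6}\right) = \frac{5}{324} \approx 0.0154$
ii. Counting 5s instead of 6s, the probability of success on each roll is still 1/6, so
$P[X = 2] = \binom{4}{2}\left(\frac{1}{6}\right)^2\left(\frac{5}{6}\right)^2 = \frac{25}{216} \approx 0.116$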
Example 19
Of the telephone calls received by an airline reservation agent, 60% are requests for information and
40% are to make reservations. Assume the calls can be viewed as Bernoulli trials, with success
defined to be a call for a reservation. Six calls are received.
1. What is the probability that exactly 2 calls are for reservations?
2. What is the probability that at least 4 are for information?
Solution
1. n = 6, p = 0.4, q = 0.6
$P(X = 2) = \binom{6}{2}(0.4)^2(0.6)^4 = 0.3110$
2. Let Y be the number of calls for information: n = 6, p = 0.6, q = 0.4
P(Y ≥ 4) = P(Y = 4) + P(Y = 5) + P(Y = 6)
$= \binom{6}{4}(0.6)^4(0.4)^2 + \binom{6}{5}(0.6)^5(0.4) + \binom{6}{6}(0.6)^6$
= 0.3110 + 0.1866 + 0.0467 = 0.5443
Example 20
The probability that it will be a fine day is 0.4. Find the expected number of fine days in a week
and also the standard deviation.
Solution
Let X be the number of fine days in a week, so X ~ B(7, 0.4).
The expected number of fine days = E(X) = np = 7 × 0.4 = 2.8
Standard deviation of X = $\sqrt{Var(X)} = \sqrt{npq} = \sqrt{7 \times 0.4 \times 0.6}$ = 1.3 days
Question 1
A used-car saleswoman estimates that each time she shows a customer a car, there is a
probability of 0.1 that the customer will buy the car. The saleswoman would like to sell at least one
car per week. If showing a car is a Bernoulli trial, how many cars should the saleswoman show
per week so that the probability of at least one sale is 0.95?
Ans: 29
Question 2
The random variable X is B(n, 0.3) and E(X) = 2.4. Find n and the standard deviation of X.
Ans: n = 8, s.d. = 1.3
Question 3
In a group of people the expected number who wear glasses is 2 and the variance is 1.6. Find the
probability that
a. a person chosen at random from the group wears glasses.
b. 6 people in the group wear glasses.
Poisson distribution
Conditions for a Poisson model
1. Events occur singly and at random in a given interval of time or space.
2. The parameter λ, λ > 0, is the mean number of occurrences in the given interval (i.e. the occurrence rate per unit interval); it is known and finite.
The variable X is the number of occurrences in the given interval. We write
$X \sim Po(\lambda)$
$P(X = x) = \frac{e^{-\lambda}\lambda^x}{x!}$, $x = 0, 1, 2, 3, \ldots$
$E(X) = Var(X) = \lambda$
Typical examples of random variables for which the Poisson probability distribution provides a
good model are:
1. The number of traffic accidents per month at a busy intersection.
2. The number of death claims received per day by an insurance company.
3. The number of unscheduled admissions per day to a hospital.
The Poisson distribution is used to model the number of occurrences of a random event in a fixed interval of time or space.
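A minimal sketch of the Poisson probability function, using only the standard library. The function names and the rate of 3 events per interval are assumptions made for this illustration.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return exp(-lam) * lam**x / factorial(x)

def poisson_cdf(x, lam):
    """P(X <= x), summing the pmf from 0 to x."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

# Illustrative rate (an assumption, not from the text): lam = 3 events per interval.
p_exactly_two = poisson_pmf(2, 3)        # e^-3 * 9 / 2
p_at_most_two = poisson_cdf(2, 3)        # e^-3 * (1 + 3 + 4.5)
```

Probabilities of the form P(X > x) then follow as `1 - poisson_cdf(x, lam)`.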
Example 21
A student finds that the average number of amoebas in 10 ml of pond water from a particular pond
is 4. Assuming that the number of amoebas follows a Poisson distribution, find the probability that
in a 10 ml sample
a. there are exactly 5 amoebas.
b. there are no amoebas.
c. there are fewer than three amoebas.
Solution
X is the number of amoebas in 10 ml of pond water, where X ~ Po(4), λ = 4.
a. $P(X = 5) = \frac{e^{-4}4^5}{5!} = 0.156$
b. $P(X = 0) = \frac{e^{-4}4^0}{0!} = e^{-4} = 0.0183$
c. P(X < 3) = 1 − P(X ≥ 3)
= P(X = 0) + P(X = 1) + P(X = 2)
$= \frac{e^{-4}4^0}{0!} + \frac{e^{-4}4^1}{1!} + \frac{e^{-4}4^2}{2!}$
= 0.238
Note
Unit interval
In this example, the mean number of amoebas in 10 ml of pond water from a particular pond is
four, so the number in 10 ml is distributed Po(4). Now suppose you want to find a probability
relating to the number of amoebas in 5 ml of water from the same pond. The mean number of
amoebas in 5 ml is two, so the number in 5 ml is distributed Po(2).
Example 22
On average the school photocopier breaks down eight times during the school week (Mon-Fri).
Assuming that the number of breakdowns can be modelled by a Poisson distribution, find the
probability that it breaks down
a. five times in a given week.
b. once on Monday.
c. eight times in a fortnight.
Solution
a. X is the number of breakdowns in a week, where X ~ Po(8).
$P(X = 5) = \frac{e^{-8}8^5}{5!} = 0.0916$
b. Let B be the number of breakdowns in a day. The mean number of breakdowns in a day is
8/5 = 1.6, so B ~ Po(1.6).
$P(B = 1) = \frac{e^{-1.6}1.6^1}{1!} = 1.6e^{-1.6} = 0.323$
c. Let Y be the number of breakdowns in a fortnight.
The mean number of breakdowns in a fortnight is 16, so Y ~ Po(16).
$P(Y = 8) = \frac{e^{-16}16^8}{8!} = 0.0120$
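Example 23
X follows a Poisson distribution with standard deviation 1.5. Find P(X ≥ 3).
Solution
If X ~ Po(λ), then E(X) = Var(X) = λ.
Var(X) = (1.5)² = 2.25, so λ = 2.25 and X ~ Po(2.25).
P(X ≥ 3) = 1 − P(X < 3)
= 1 − [P(X = 0) + P(X = 1) + P(X = 2)]
$= 1 - \left[\frac{e^{-2.25}2.25^0}{0!} + \frac{e^{-2.25}2.25^1}{1!} + \frac{e^{-2.25}2.25^2}{2!}\right]$
= 0.391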
Example 24
An insurance company receives on average two claims per week from a particular factory.
Assuming that the number of claims can be modelled by a Poisson distribution, find the probability
that it receives
a. three claims in a given week.
b. more than four claims in a given week.
c. four claims in a given fortnight.
d. no claims on a given day, assuming that the factory operates on a five-day week.
Solution
Let X be the number of claims per week, X ~ Po(2).
a. $P(X = 3) = \frac{e^{-2}2^3}{3!} = 0.180$
b. P(X > 4) = 1 − P(X ≤ 4)
= 1 − [P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4)]
$= 1 - e^{-2}\left[1 + 2 + \frac{2^2}{2!} + \frac{2^3}{3!} + \frac{2^4}{4!}\right]$
= 0.053
c. Let Y be the number of claims in a fortnight.
The mean number of claims in a fortnight is 4, so Y ~ Po(4).
$P(Y = 4) = \frac{e^{-4}4^4}{4!} = 0.195$
d. Let F be the number of claims in a given day.
The mean number of claims in a day is 2/5 = 0.4, so F ~ Po(0.4).
$P(F = 0) = \frac{e^{-0.4}0.4^0}{0!} = e^{-0.4} = 0.670$
Example 25
A sales manager receives six telephone calls on average between 9:30am and 10:30am on a
weekday. Find the probability that
a. she will receive two or more calls between 9:30am and 10:30am on Tuesday.
b. she will receive exactly two calls between 9:30am and 9:40am on Wednesday.
Solution
a. Let X be the number of telephone calls received by the manager between 9:30am and 10:30am, so X ~ Po(6).
P(X ≥ 2) = 1 − P(X < 2)
= 1 − [P(X = 0) + P(X = 1)]
$= 1 - \left[\frac{e^{-6}6^0}{0!} + \frac{e^{-6}6^1}{1!}\right]$
$= 1 - e^{-6}[1 + 6]$
= 0.983
b. Let Y be the number of telephone calls received by the manager between 9:30am and 9:40am
on Wednesday.
The mean number of calls received between 9:30am and 9:40am is 6/6 = 1, so Y ~ Po(1).
$P(Y = 2) = \frac{e^{-1}1^2}{2!} = 0.1839$
Example 26
The number of bacterial colonies on a petri dish can be modelled by a Poisson distribution with
average number 2.5 per cm². Find the probability that
a. in 1 cm² there are no bacterial colonies.
b. in 2 cm² there are more than four bacterial colonies.
c. in 4 cm² there are six bacterial colonies.
Solution
a. Let X be the number of bacterial colonies in 1 cm² of the petri dish, so X ~ Po(2.5).
$P(X = 0) = \frac{e^{-2.5}2.5^0}{0!} = e^{-2.5} = 0.082$
b. Let Y be the number of bacterial colonies in 2 cm² of the petri dish.
The mean number is 5, so Y ~ Po(5).
P(Y > 4) = 1 − P(Y ≤ 4)
= 1 − [P(Y = 0) + P(Y = 1) + P(Y = 2) + P(Y = 3) + P(Y = 4)]
$= 1 - e^{-5}\left[1 + 5 + \frac{5^2}{2!} + \frac{5^3}{3!} + \frac{5^4}{4!}\right]$
= 0.559
c. Let B be the number of bacterial colonies in 4 cm² of the petri dish, so B ~ Po(10).
$P(B = 6) = \frac{e^{-10}10^6}{6!} = 0.063$
Example 27
Customers walk into a store at an average rate of 20 per hour. Find the probability that
a. no customers have arrived at the store in 10 min.
b. no more than 4 customers have arrived at the store in 30 min.
Solution
a. Let X be the number of customers arriving at the store in 10 min, with mean 20/6 = 10/3, so X ~ Po(10/3).
$P(X = 0) = \frac{e^{-10/3}(10/3)^0}{0!} = e^{-10/3} = 0.0357$
b. Let Y be the number of customers arriving at the store in 30 min, with mean 10, so Y ~ Po(10).
P(Y ≤ 4) = P(Y = 0) + P(Y = 1) + P(Y = 2) + P(Y = 3) + P(Y = 4)
$= e^{-10}\left[1 + 10 + \frac{10^2}{2!} + \frac{10^3}{3!} + \frac{10^4}{4!}\right]$
= 0.029
Example 28
The average number of misprints on each page in the first draft of a novel is four. Find the
probability that on a randomly selected double page
a. there are three misprints on each page.
b. there are six misprints in total.
Solution
a. λ = 4 for each page, and the two pages are taken to be independent:
$[P(X = 3)]^2 = \left[\frac{e^{-4}4^3}{3!}\right]^2 = 0.0382$
b. There are six misprints in total. For the double page, λ = 4 × 2 = 8:
$P(Y = 6) = \frac{e^{-8}8^6}{6!} = 0.122$
Using the Poisson distribution as an approximation to the binomial distribution
When n is large (n > 50) and p is small (p < 0.1), the distribution $X \sim B(n, p)$ can be
approximated using a Poisson distribution with the same mean, i.e. $X \sim Po(np)$. The approximation
gets better as n gets larger and p gets smaller.
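How close the approximation is can be checked numerically. The sketch below compares the two probability functions for one hypothetical case, B(200, 0.02) against Po(4); these parameter values are chosen here only because they satisfy n > 50 and p < 0.1.

```python
from math import comb, exp, factorial

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return exp(-lam) * lam**x / factorial(x)

# Hypothetical case chosen to satisfy n > 50 and p < 0.1.
n, p = 200, 0.02
max_gap = max(abs(binomial_pmf(r, n, p) - poisson_pmf(r, n * p)) for r in range(11))
```

Here `max_gap` is the largest pointwise difference over r = 0, ..., 10; repeating the comparison with larger n and smaller p (keeping np fixed) should shrink it further.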
Example 29
Eggs are packed into boxes of 500. On average 0.7% of the eggs are found to be broken when the eggs
are unpacked. Find, correct to 2 significant figures, the probability that in a box of 500 eggs
a. exactly three are broken.
b. at least two are broken.
Solution
n = 500, p = 0.007
E(X) = np = 500 × 0.007 = 3.5
Since n > 50 and p < 0.1, use a Poisson approximation: X ~ Po(3.5).
a. $P(X = 3) = \frac{e^{-3.5}3.5^3}{3!} = 0.22$
b. P(X ≥ 2) = 1 − (P(X = 0) + P(X = 1))
$= 1 - (e^{-3.5} + 3.5e^{-3.5})$
= 0.86
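Example 30
A Christmas draw aims to sell 5000 tickets, 50 of which will win a prize.
a. Calculate P(X ≤ 3), where X is the number of winning tickets among 200 tickets bought.
b. Calculate how many tickets should be bought in order for there to be a 90% probability of
winning at least one prize.
Solution
a. P(a ticket wins a prize) = 50/5000 = 0.01
X ~ B(200, 0.01)
Since n > 50 and p < 0.1, X is approximately Po(2), so
$P(X \le 3) = e^{-2}\left[1 + 2 + \frac{2^2}{2!} + \frac{2^3}{3!}\right] = 0.86$
b. X ~ B(n, 0.01). Assuming n > 50, and since p < 0.1, use the approximation X ~ Po(0.01n).
P(X ≥ 1) = 0.9
P(X ≥ 1) = 1 − P(X = 0)
$0.9 = 1 - e^{-0.01n}$
$e^{-0.01n} = 0.1$
−0.01n = ln(0.1)
$n = \frac{\ln(0.1)}{-0.01} = 230.26$
So the least integer value of n must be 231. Check:
n = 230: np = 230 × 0.01 = 2.3, $1 - e^{-2.3} = 0.8997 < 0.9$
n = 231: np = 231 × 0.01 = 2.31, $1 - e^{-2.31} = 0.9007 > 0.9$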
Example 31
X is B(250, p). The value of p is such that it is valid to apply a Poisson approximation. When
this is done, it is found that P(X = 0) = 0.0235. Find the value of p.
Solution
E(X) = np = 250p, so X is approximately Po(250p).
$P(X = 0) = \frac{e^{-250p}(250p)^0}{0!} = 0.0235$
$e^{-250p} = 0.0235$
$p = \frac{\ln(0.0235)}{-250} = 0.0150$
The sum of independent Poisson variables
For independent variables X and Y, if $X \sim Po(\lambda_1)$ and $Y \sim Po(\lambda_2)$, then
$X + Y \sim Po(\lambda_1 + \lambda_2)$.
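This additive property can be checked numerically by convolving two Poisson probability functions. The sketch below is an illustration only; the rates 1.5 and 2.5 are hypothetical values, and `pmf_of_sum` is a helper name introduced here.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return exp(-lam) * lam**x / factorial(x)

def pmf_of_sum(t, lam_x, lam_y):
    """P(X + Y = t) for independent X ~ Po(lam_x), Y ~ Po(lam_y), by convolution."""
    return sum(poisson_pmf(k, lam_x) * poisson_pmf(t - k, lam_y) for k in range(t + 1))

# Hypothetical rates for illustration.
lam_x, lam_y = 1.5, 2.5
gaps = [abs(pmf_of_sum(t, lam_x, lam_y) - poisson_pmf(t, lam_x + lam_y)) for t in range(10)]
```

Every entry of `gaps` is zero up to floating-point rounding, which is what the result states: the convolution of the two distributions is again Poisson with the rates added.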
Example 32
Two identical racing cars are being tested on a circuit. For each, the number of mechanical
breakdowns can be modelled by a Poisson distribution with a mean of one breakdown in 100
laps. If a car breaks down it is attended to and continues on the circuit. The first car is tested for 20
laps and the second car for 40 laps. Find the probability that the service team is called out to attend to breakdowns
a. once
b. more than twice
Solution
Let X and Y be the numbers of breakdowns of the first and second cars, so X ~ Po(0.2) and Y ~ Po(0.4).
The total number of breakdowns is T = X + Y, so T ~ Po(0.6).
a. $P(T = 1) = 0.6e^{-0.6} = 0.329$
b. P(T > 2) = 1 − [P(T = 0) + P(T = 1) + P(T = 2)]
$= 1 - e^{-0.6}\left[1 + 0.6 + \frac{(0.6)^2}{2!}\right]$
= 0.023
SOME CONTINUOUS DISTRIBUTIONS
UNIFORM DISTRIBUTION
The continuous random variable X is said to have the uniform distribution over the interval [a, b]
if its probability density function is
$f(x) = \frac{1}{b - a}$, $a \le x \le b$ (and f(x) = 0 elsewhere).
Question 1
Suppose that buses arrive at a bus stop every 15 minutes and that the waiting time X for the bus to arrive
has a uniform probability distribution on the interval from 0 to 15 minutes.
a. What is the probability that X will exceed 10 minutes?
b. What is the probability that X will be at most 12 minutes?
Question 2
The time X required to complete an assembly operation is
uniformly distributed over 30 ≤ X ≤ 40 seconds. Determine the proportion of assemblies that require less
than 35 seconds to complete.
EXPONENTIAL DISTRIBUTION
The exponential distribution, or more precisely the negative exponential distribution, is used to
model the time required to observe the first occurrence of an event of a specified type when events
of this type are occurring randomly at a mean rate λ per unit time. Examples of such phenomena are the
time until a piece of equipment fails and the time it takes to complete a job.
The continuous random variable X is said to have the exponential distribution with positive
parameter λ if its probability density function is given by
$f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$, and $f(x) = 0$ elsewhere.
Example 33
Suppose that the time until first failure of an electric component is an exponential random variable
with a rate of λ = 0.0625 per month. Calculate the probability that the device lasts
(a) not more than 45 months
(b) longer than 40 months
Solution
Let X denote the random variable that measures the time until first failure, with λ = 0.0625.
(a) $P(X \le 45) = \int_0^{45} 0.0625e^{-0.0625x}\,dx = 1 - e^{-(0.0625)(45)} = 0.9399$
(b) $P(X > 40) = 1 - P(X \le 40) = 1 - \int_0^{40} 0.0625e^{-0.0625x}\,dx = 1 - (1 - e^{-(0.0625)(40)}) = 0.0821$
NOTE:
1. The cumulative distribution function of the exponential distribution is given by
$F(x) = P(X \le x) = 1 - e^{-\lambda x}$, $x \ge 0$.
2. If X has an exponential distribution with parameter λ, then
$E(X) = \frac{1}{\lambda}$ and $Var(X) = \frac{1}{\lambda^2}$.
3. If X has the exponential distribution, then for an interval [a, b] with 0 ≤ a ≤ b,
$P(a \le X \le b) = e^{-\lambda a} - e^{-\lambda b}$.
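The three facts in the note can be put into a short sketch. The function names are invented for this illustration; the rate 0.0625 per month is the one used in Example 33.

```python
from math import exp

def exponential_cdf(x, rate):
    """F(x) = P(X <= x) = 1 - e^(-rate * x) for x >= 0, and 0 for x < 0."""
    return 1 - exp(-rate * x) if x >= 0 else 0.0

def exponential_between(a, b, rate):
    """P(a <= X <= b) = e^(-rate * a) - e^(-rate * b) for 0 <= a <= b."""
    return exponential_cdf(b, rate) - exponential_cdf(a, rate)

rate = 0.0625                                   # per month, as in Example 33
p_at_most_45 = exponential_cdf(45, rate)        # about 0.9399
p_more_than_40 = 1 - exponential_cdf(40, rate)  # about 0.0821
mean, variance = 1 / rate, 1 / rate**2          # 16 months and 256 months^2
```

Computing interval probabilities as a difference of cdf values avoids integrating the density each time.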
Example 34
Suppose particles arrive independently at a counter at an average rate of three per second. What is
the probability that a particle will arrive
(a) within one second
(b) after two seconds
Solution
λ = 3
(a) $P(X \le 1) = F(1) = 1 - e^{-3(1)} = 0.9502$
(b) $P(X > 2) = 1 - F(2) = 1 - (1 - e^{-3(2)}) = 0.0025$
Question 1
Suppose the time in days between service calls on a photocopier follows an exponential
distribution with a mean rate of 0.02 calls per day.
a. What is the probability that the time until the machine next requires service exceeds
60 days? Ans: 0.3011942
b. What is the probability that the time until the machine next requires service is less than
20 days? Ans: 0.32968
Question 2
The lifetime of a mechanical assembly in a vibration test is exponentially distributed with a mean of
400 hrs. What is the probability that
a. an assembly on test fails in less than 100 hrs?
b. an assembly operates for more than 500 hrs before failure?
c. an assembly on test fails in at most 200 hrs?
NORMAL DISTRIBUTION
1. Standardize a normal variable and use standard normal tables.
2. Use the normal distribution as a model to solve problems.
3. Use the normal distribution as an approximation to the binomial distribution and to the
Poisson distribution.
The normal distribution is one of the most important distributions in statistics. Many measured
quantities in the natural sciences follow a normal distribution, and under certain circumstances
it is also a useful approximation to the binomial distribution and to the Poisson distribution.
The normal variable X is continuous. Its probability density function f(x) depends on its mean μ
and standard deviation σ, where
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}$, $-\infty < x < \infty$.
It can be described as $X \sim N(\mu, \sigma^2)$.
Finding probabilities
The probability that X lies between a and b is written P(a < X < b). To find this probability,
you need to find the area under the normal curve between a and b.
One way of finding areas is to integrate, but since the normal density function is complicated and very
difficult to integrate, tables are used instead.
The standard normal variable, Z
In order to use the same set of tables for all possible values of μ and σ, the variable X is
standardised so that the mean is 0 and the standard deviation is 1. Notice that since the variance
is the square of the standard deviation, the variance is also 1. This standardised normal variable
is called Z, and $Z \sim N(0, 1)$.
In general, to standardise X, where $X \sim N(\mu, \sigma^2)$:
1. Subtract the mean μ.
2. Then divide by the standard deviation σ.
Therefore $Z = \frac{X - \mu}{\sigma}$, where $Z \sim N(0, 1)$.
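Where tables are not to hand, the standard normal cumulative probability Φ(z) can be computed from the error function in Python's standard library. This is a sketch rather than the method of the notes; `phi` and `normal_cdf` are names chosen here, and N(50, 4²) is a hypothetical example.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cdf, P(Z <= z), written in terms of the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2): standardise, then use phi."""
    return phi((x - mu) / sigma)

# Hypothetical example: X ~ N(50, 4^2), probability that X is below 56.
p_below_56 = normal_cdf(56, mu=50, sigma=4)   # phi(1.5)
```

`normal_cdf` carries out exactly the two standardising steps listed above before looking up the probability.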
Finding probabilities for Z from the standard tables
Example 35
Find the following probabilities using the standard normal tables.
i) P(Z < 0.85) ii) P(Z > 0.85) iii) P(Z < -1.38) iv) P(Z > -1.38)
Solution
i) 0.8023
ii) 1 − 0.8023 = 0.1977
iii) 0.0838
iv) 1 − 0.0838 = 0.9162
Example 36
Find the following
a) P(0.35 < Z < 1.76)
b) P(-2.70 < Z < 1.87)
c) P(|Z| < 1.43)
d) P(|Z| > 1.433)
Solution
a) Φ(1.76) − Φ(0.35) = 0.9608 − 0.6368 = 0.3240
b) Φ(1.87) − Φ(−2.70) = 0.9693 − 0.0035 = 0.9658
c) P(|Z| < 1.43) = P(−1.43 < Z < 1.43)
= Φ(1.43) − Φ(−1.43) = 0.9236 − 0.0764 = 0.8472
OR
2Φ(1.43) − 1 = 2(0.9236) − 1 = 0.8472
d) P(|Z| > 1.433) = P(Z < −1.433) + P(Z > 1.433)
= 2[1 − Φ(1.433)]
= 2(1 − 0.9236) = 0.1528, taking Φ(1.433) ≈ Φ(1.43) = 0.9236 from the tables.
If Z ~ N(0, 1), then
P(−1.96 < Z < 1.96) = 0.95
P(−2.58 < Z < 2.58) = 0.99
Using standard normal tables for any normal variable X
A variable X given as $X \sim N(\mu, \sigma^2)$ can be standardised as
$Z = \frac{X - \mu}{\sigma}$, where $Z \sim N(0, 1)$.
Example 37
The lengths of metal strips produced by a machine are normally distributed with a mean length of
150 cm and a standard deviation of 10 cm. Find the probability that the length of a randomly
selected strip is
a) shorter than 165 cm
b) within 5 cm of the mean
Solution
a. X is the length in cm; μ = 150, σ = 10, so X ~ N(150, 10²). To find the probability that the
length is shorter than 165 cm, i.e. P(X < 165), standardise:
$Z = \frac{X - 150}{10} = \frac{165 - 150}{10} = 1.5$
So P(X < 165) = P(Z < 1.5) = Φ(1.5) = 0.9332
b. To find the probability that the length is within 5 cm of the mean, you need to find
P(|X − 150| < 5) = P(−5 < X − 150 < 5)
$= P\left(\frac{-5}{10} < \frac{X - 150}{10} < \frac{5}{10}\right)$
= P(−0.5 < Z < 0.5)
= P(|Z| < 0.5) = 2Φ(0.5) − 1 = 2 × 0.6915 − 1 = 0.383
The probability that the length is within 5 cm of the mean is 0.383.
Example 38
The time taken by a milkman to deliver to the high street is normally distributed with a mean of 12
minutes and a standard deviation of 2 minutes. He delivers milk every day. Estimate the number
of days during the year when he takes
a. longer than 17 minutes
b. less than ten minutes
c. between nine and 13 minutes.
Solution
X is the time, in minutes, taken to deliver milk to the high street, so X ~ N(12, 2²).
Standardise X using $Z = \frac{X - 12}{2}$.
a) $P(X > 17) = P\left(Z > \frac{17 - 12}{2}\right)$ = P(Z > 2.5)
= 1 − Φ(2.5) = 1 − 0.9938
= 0.0062
To find the number of days, multiply by 365:
365 × 0.0062 = 2.263 ≈ 2
On two days in the year he takes longer than 17 minutes.
b) $P(X < 10) = P\left(Z < \frac{10 - 12}{2}\right)$ = P(Z < −1) = 1 − 0.8413 = 0.1587
Now 365 × 0.1587 = 57.93 ≈ 58
On 58 days in the year he takes less than ten minutes.
c) $P(9 < X < 13) = P\left(\frac{9 - 12}{2} < Z < \frac{13 - 12}{2}\right)$ = P(−1.5 < Z < 0.5)
= Φ(0.5) − Φ(−1.5)
= 0.6915 − 0.0668
= 0.6247
Now 365 × 0.6247 = 228.02 ≈ 228
On 228 days in the year he takes between nine and 13 minutes.
NB
Since X is a continuous variable, the following are indistinguishable:
9 < X < 13
9 ≤ X < 13
9 ≤ X ≤ 13
9 < X ≤ 13
Question 7
The masses of packages from a particular machine are normally distributed with a mean of 200 g
and a standard deviation of 2 g. Find the probability that a randomly selected package from the
machine weighs
a) less than 197 g b) more than 200.5 g c) between 198.5 g and 199.5 g
Ans: a. 0.0668
b. 0.4012
c. 0.1746
Question 8
The heights of boys at a particular age follow a normal distribution with mean 150.3 cm and
variance 25 cm². Find the probability that a boy picked at random from this age group has height
a) less than 153 cm
b) more than 158 cm c) between 150 cm and 158 cm
d) more than 10 cm different from the mean height
Question 9
The masses of a certain type of cabbage are normally distributed with a mean of 1000 g and a
standard deviation of 0.15 kg. In a batch of 800 cabbages, estimate how many have a mass
between 750 g and 1290 g.
Example 39
Using the standard normal tables in reverse to find z when Φ(z) is known.
If Z ~ N(0, 1), find the value of a if
a) P(Z < a) = 0.9693 b) P(Z > a) = 0.3802 c) P(Z > a) = 0.7367 d) P(Z < a) = 0.0793 e) P(|Z| < a) = 0.9
Solution
a) P(Z < a) = 0.9693
Φ(a) = 0.9693
$a = \Phi^{-1}(0.9693) = 1.87$
b) P(Z > a) = 0.3802
1 − Φ(a) = 0.3802
Φ(a) = 1 − 0.3802 = 0.6198
$a = \Phi^{-1}(0.6198)$, which lies between 0.30 and 0.31, so a = 0.305
Solve c) and d) yourself. Ans: c) a = −0.633, d) a = −1.41
e) P(|Z| < a) = 0.9
P(−a < Z < a) = 0.9
2Φ(a) − 1 = 0.9
2Φ(a) = 1.9
Φ(a) = 0.95
$a = \Phi^{-1}(0.950)$, which lies between 1.64 and 1.65
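Reading the tables in reverse corresponds to evaluating the inverse of Φ. As a sketch, Python's standard library (version 3.8 or later) provides this through `statistics.NormalDist.inv_cdf`; the variable names below are chosen for this illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

a_part_a = standard_normal.inv_cdf(0.9693)         # P(Z < a) = 0.9693
a_part_b = standard_normal.inv_cdf(1 - 0.3802)     # P(Z > a) = 0.3802
a_part_e = standard_normal.inv_cdf((1 + 0.9) / 2)  # P(|Z| < a) = 0.9
```

In each case the problem is first rewritten as Φ(a) = some probability, exactly as in the worked solution, and only then inverted.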
Using the table in reverse for any normal variable X
Example 40
Bags of flour packed by a particular machine have masses which are normally distributed with
mean 500 g and standard deviation 20 g. 2% of the bags are rejected for being underweight and
1% of the bags are rejected for being overweight. Between what range of values should the
mass of a bag of flour lie if it is to be accepted?
Solution
Given μ = 500 and σ = 20. Let $x_1$ and $x_2$ be the lower and upper acceptance limits, so that $P(X < x_1) = 0.02$ and $P(X > x_2) = 0.01$.
Lower limit:
$P(Z < z_1) = 0.02$
$\Phi(z_1) = 0.02$
$z_1 = \Phi^{-1}(0.02) = -2.06$
$\frac{x_1 - 500}{20} = -2.06$
$x_1 - 500 = 20(-2.06) = -41.2$
$x_1 = 458.8$
Upper limit:
$P(Z > z_2) = 0.01$
$1 - \Phi(z_2) = 0.01$
$\Phi(z_2) = 1 - 0.01 = 0.99$
$z_2 = 2.32$
$x_2 - 500 = 2.32(20) = 46.4$
$x_2 = 546.4$
A bag is accepted if its mass lies between 458.8 g and 546.4 g.
a) Find the limits within which the central 95% of the distribution lies.
b) Find the inter quartile range of the distribution.
Solution
a) X ~ N(400, 8²)
P(|Z| < z) = 0.95
2Φ(z) − 1 = 0.95
2Φ(z) = 1 + 0.95
Φ(z) = 1.95/2
Φ(z) = 0.975
z = Φ⁻¹(0.975) = 1.96, so the limits correspond to z = ±1.96
(x − 400)/8 = ±1.96
x − 400 = (±1.96)(8)
The limits are (384.32, 415.68)
b) The inter quartile range encloses the central 50% of the distribution between the lower
quartile Q1 and upper quartile Q3.
Upper quartile:
Φ(z) = 0.75
z = 0.67
(Q3 − 400)/8 = 0.67
Q3 − 400 = (0.67)(8)
Q3 − 400 = 5.36
Q3 = 405.36
Lower quartile:
Φ(z) = 0.25
z = Φ⁻¹(0.25)
z = −0.67
(Q1 − 400)/8 = −0.67
Q1 − 400 = (−0.67)(8)
Q1 = 394.64
The quartiles are (394.64, 405.36), so the inter quartile range is 405.36 − 394.64 = 10.72.
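The same limits can be obtained without tables. This is a minimal sketch, assuming the distribution N(400, 8²) used in the solution; the variable names are illustrative. Because the code uses the unrounded quantile (about 0.6745 rather than the table value 0.67), the quartiles come out slightly wider than the hand-worked ones.

```python
from statistics import NormalDist

X = NormalDist(mu=400, sigma=8)  # parameters taken from the worked solution

# Central 95%: 2.5% in each tail.
low_95, high_95 = X.inv_cdf(0.025), X.inv_cdf(0.975)

# Quartiles: 25% below Q1, 75% below Q3.
q1, q3 = X.inv_cdf(0.25), X.inv_cdf(0.75)
iqr = q3 - q1

print(round(low_95, 2), round(high_95, 2), round(q1, 2), round(q3, 2), round(iqr, 2))
```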
Question 9
A sample of 100 apples is taken from a load. The apples have the following distribution of sizes:
Diameter to nearest cm:   6    7    8    9   10
Frequency:               11   21   38   17   13
Assuming that the distribution is approximately normal with the mean and standard deviation of
this sample, find the range of size of apples for packing, if 5% are to be rejected as too small
and 5% are to be rejected as too large.
(Ans: x̄ = 8, s = 1.16 or 1.158; range 6.10 to 9.90)
Question
The lengths of metal strips are normally distributed with a mean of 120cm and a standard
deviation of 10cm. Find the probability that a strip selected at random has length
a) greater than 105cm
b) within 5cm of the mean
Strips that are shorter than L cm are rejected. Estimate the value of L, correct to one decimal
place, if 9% of all strips are rejected. In a sample of 500 strips, estimate the number having a
length over 126cm. (Ans: b) 0.383; L = 106.6; 137 strips)
Question 10
Batteries for a transistor radio have a mean life under normal usage of 160 hours, with a standard
deviation of 30 hours. Assuming the battery life follows a normal distribution.
a) Calculate the percentage of batteries which have a life between 150 hours and 180 hours.
Ans: 37.8%
b) Calculate the range, symmetrical about the mean, within which 75% of the battery lives
lie.
Ans:125.5 , 194.5
If a radio takes four of these batteries and requires all of them to be working, calculate
c) The probability that the radio will run for at least 135 hours. (0.405)
Question 11
The numbers of shirts sold in a week by the world’s largest menswear store are normally
distributed with a mean of 2080 and a standard deviation of 50. Estimate
a) The probability that in a given week fewer than 2000 shirts are sold. (0.0548)
b) The number of weeks in a year that between 2060 and 2130 shirts are sold. (26)
c) The inter quartile range of the distribution. (67.4)
d) The least number of shirts n such that the probability that more than n are sold in a given
week is less than 0.02. (2183)
Finding the Values of μ or σ or Both
Example 41
The random variable X is distributed N(μ, σ²) with σ² = 25.
If P(X < 27.5) = 0.3085, find the value of μ.
Solution
P(X < 27.5) = P(Z < z), where z = (27.5 − μ)/5
P(Z < z) = 0.3085
Φ(z) = 0.3085
z = Φ⁻¹(0.3085)
z = −0.5
(27.5 − μ)/5 = −0.5
27.5 − μ = 5(−0.5)
27.5 − μ = −2.5
μ = 30
Example 42
The random variable X is normally distributed with a mean of 45. The probability that X is greater
than 51 is 0.288. Find the standard deviation of the distribution.
Solution
P(X > 51) = P(Z > z), where z = (51 − 45)/σ = 6/σ
P(Z > z) = 0.288
1 − Φ(z) = 0.288
1 − 0.288 = Φ(z)
Φ(z) = 0.712
z = 0.56
6/σ = 0.56
σ = 6/0.56 = 10.71
Example 43
The random variable X is distributed N(μ, σ²). P(X > 80) = 0.0113 and P(X < 30) = 0.0287.
Find μ and σ.
Solution
P(X > 80) = P(Z > z1), where z1 = (80 − μ)/σ
1 − Φ(z1) = 0.0113
0.9887 = Φ(z1)
z1 = Φ⁻¹(0.9887)
z1 = 2.28
(80 − μ)/σ = 2.28
μ + 2.28σ = 80 ………(1)
P(X < 30) = P(Z < z2), where z2 = (30 − μ)/σ
Φ(z2) = 0.0287
z2 = Φ⁻¹(0.0287)
z2 = −1.90
(30 − μ)/σ = −1.90
μ − 1.90σ = 30 ………(2)
Subtracting (2) from (1): 4.18σ = 50
σ = 11.96
μ = 52.73
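When both μ and σ are unknown, two tail probabilities give two linear equations in μ and σ. The sketch below shows one way this might be automated; `solve_mean_sd` is a hypothetical helper written for this illustration, not part of any library, and it uses `NormalDist.inv_cdf` rather than the printed table.

```python
from statistics import NormalDist

def solve_mean_sd(x_low, p_below, x_high, p_above):
    """Solve for (mu, sigma) given P(X < x_low) = p_below and P(X > x_high) = p_above."""
    z_low = NormalDist().inv_cdf(p_below)       # negative for a lower tail
    z_high = NormalDist().inv_cdf(1 - p_above)  # positive for an upper tail
    # mu + z_high * sigma = x_high  and  mu + z_low * sigma = x_low
    sigma = (x_high - x_low) / (z_high - z_low)
    mu = x_high - z_high * sigma
    return mu, sigma

mu, sigma = solve_mean_sd(30, 0.0287, 80, 0.0113)
print(round(mu, 2), round(sigma, 2))
```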
Question 12
The masses of boxes of apples are normally distributed such that 20% of them are greater than
5.08kg and 15% are greater than 5.62kg. Estimate the mean and standard deviation of the
masses. (2.74, 2.78)
Question 13
A farmer cuts hazel twigs to make into bean poles to sell at the market. He says that a stick is
240cm long. In fact the lengths of the sticks are normally distributed and 55% are over 240cm
long. 10% are over 250cm long. Find the following
a)
b) The probability that a randomly selected stick is shorter than 235cm. (0.203)
Question 14
Tea is sold in packages marked 750g. The masses of the packages are normally distributed with a
mean of 760g. It is known that less than 1% of the packages are underweight. What is the
maximum value of the standard deviation of the distribution? (4.299g)
Question 15
The random variable X is normally distributed. The probability that X is less than 53 is 0.04 and
the probability that X is less than 65 is 0.97. Find the inter quartile range of the distribution.
(4.46)
CHAPTER TWO
PROBABILITY CALCULUS
CONCEPT OF PROBABILITY
Probability is a measure of how likely it is that something will occur. In talking about probability we
need an experiment.
An experiment is any action whose outcomes are recorded as data. The sample space is the set of
all possible outcomes of an experiment. It is denoted by “S”. When a coin is tossed once, the
sample space is {H, T}; when it is tossed twice, the sample space is {HH, HT, TH, TT}. The sample
space therefore has four outcomes in the second case. We have:
Finite Sample Space: a sample space with a finite number of outcomes that can be counted, such
as the four outcomes above.
Infinite Sample Space: the ages of a class can range from 17 to 30, i.e. 17 ≤ x ≤ 30. One
individual can start from 17 and count 18, 19, …; another person can use 17.01,
17.02, 17.03, … etc. Since age can be measured ever more finely, the sample space is infinite.
Event
An event ‘A’ is an outcome or the set of outcomes that are of interest to the experimenter. The
probability that an event A would occur is written as P(A) and is read “probability of A”. The
probability of an event A, P(A), is a measure of the likelihood that the event A would occur. When
the outcomes in S are equally likely,
P(A) = n(A)/n(S) = (number of outcomes in A)/(number of outcomes in S)
Example 1
An ordinary die is thrown. Find the probability that the number obtained is
a. a multiple of 3
b. less than 7
c. a factor of six
Solution
Sample space when the die is thrown = {1, 2, 3, 4, 5, 6}
a. P(multiple of 3) = 2/6 = 1/3
b. P(less than 7) = 6/6 = 1
c. P(factor of six) = 4/6 = 2/3
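Counting favourable outcomes over the sample space is easy to mirror in code. The following is a small sketch for the die example, assuming equally likely outcomes; `probability` is a hypothetical helper, and `Fraction` is used so the answers stay exact.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # outcomes of one throw of an ordinary die

def probability(event):
    """P(event) = n(event) / n(S) for equally likely outcomes."""
    favourable = {s for s in sample_space if event(s)}
    return Fraction(len(favourable), len(sample_space))

p_multiple_of_3 = probability(lambda s: s % 3 == 0)
p_less_than_7 = probability(lambda s: s < 7)
p_factor_of_6 = probability(lambda s: 6 % s == 0)

print(p_multiple_of_3, p_less_than_7, p_factor_of_6)  # 1/3 1 2/3
```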
Complement of an Event
The complement of an event A is denoted by A′. It is the set of all outcomes in the sample space
“S” that do not correspond to the event “A”.
P(A) + P(A′) = 1
Probability Rule
P(A ∪ B) = P(A) + P(B) − P(A ∩ B) ………(1)
P(A) = P(A ∩ B) + P(A ∩ B′)
P(B) = P(A ∩ B) + P(A′ ∩ B)
Example 2
Given the values of P(A′), P(B) and P(A ∩ B), find P(A ∪ B).
Solution
Since P(A) = 1 − P(A′),
P(A ∪ B) = [1 − P(A′)] + P(B) − P(A ∩ B)
Exhaustive Events
If two events A and B are such that between them they make up the whole of the possibility space,
then A and B are said to be exhaustive events and P(A ∪ B) = 1
⟹ 1 = P(A) + P(B) − P(A ∩ B)
Example
S = {1, 2, …, 10}
A = {2, 4, 6, …, 10}
B = {1, 3, 5, …, 9}
A ∪ B = {1, 2, 3, 4, …, 10} = S
Exclusive or mutually exclusive events
A and B are said to be exclusive or mutually exclusive events if there is no intersection between
them, i.e. if they cannot occur at the same time. It is expressed mathematically as
P(A ∪ B) = P(A) + P(B), i.e. P(A ∩ B) = 0
Example 3
It is known that P(X) = 1/2 and P(Y) = 1/4. Given that X and Y are mutually exclusive, find
a. P(X ∪ Y)
b. P(X ∩ Y)
c. P(X ∩ Y′)
d. P(X′ ∩ Y′)
Solution
a. P(X ∪ Y) = P(X) + P(Y)
= 1/2 + 1/4
= (2 + 1)/4 = 3/4
b. P(X ∩ Y) = 0
c. P(X ∩ Y′) = P(X) − P(X ∩ Y) = 1/2 − 0 = 1/2
De Morgan’s Rule
P((A ∪ B)′) = 1 − P(A ∪ B) = P(A′ ∩ B′)
P((A ∩ B)′) = 1 − P(A ∩ B) = P(A′ ∪ B′)
Example 4
Given that P(A) = 1/3, P(B) = 1/2 and P(A ∩ B) = 1/12, find P(A′ ∩ B′).
Solution
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
= 1/3 + 1/2 − 1/12
= (4 + 6 − 1)/12
= 9/12 = 3/4
By De Morgan’s rule, P(A′ ∩ B′) = 1 − P(A ∪ B) = 1 − 3/4 = 1/4
Question
Events A and B are mutually exclusive and exhaustive events. P(A) = 0.4. Find
a. P(B)
b. P(A ∩ B)
Solution
a. P(A) + P(B) = 1
P(B) = 1 − 0.4 = 0.6
b. P(A ∩ B) = 0
Example 5
A and B are two events such that P(A) = 2/3, P(B) = 8/15 and P(A ∩ B) = 1/5. Are A and B
exhaustive events?
Solution
For exhaustive events
P(A ∪ B) = 1
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
= 2/3 + 8/15 − 1/5
= (10 + 8 − 3)/15
= 15/15 = 1
∴ A and B are exhaustive events.
Independent Events
If either of the events A and B can occur without being affected by the other, then the two (2)
events are independent.
For independent events;
P(A ∩ B) = P(A) ∙ P(B)
⟹ P(A ∪ B) = P(A) + P(B) − P(A) ∙ P(B)
Example 6
If events A and B are independent with P(A) = 0.3 and P(B) = 0.5, find
a. P(A ∩ B)
b. P(A ∪ B)
c. Are A and B mutually exclusive?
Solution
a. P(A ∩ B) = P(A) ∙ P(B)
= 0.3 × 0.5 = 0.15
b. P(A ∪ B) = P(A) + P(B) − P(A) ∙ P(B) = 0.3 + 0.5 − 0.15
= 0.8 − 0.15
= 0.65
c. No, because P(A ∩ B) = 0.15 ≠ 0.
Example 7
The probability that an event A occurs is P(A) = 0.4. B is an event independent of A and the
probability of the union of A and B is P(A ∪ B) = 0.7. Find P(B).
Solution
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
= P(A) + P(B) − [P(A) ∙ P(B)]
0.7 = 0.4 + P(B) − [0.4 × P(B)]
0.3 = P(B) − 0.4 P(B)
= P(B)[1 − 0.4]
0.3 = 0.6 × P(B)
P(B) = 0.3/0.6 = 3/6 = 1/2 = 0.5
CONDITIONAL PROBABILITY
If A and B are two events, not necessarily from the same experiment, then the conditional
probability that A occurs given that B has already occurred is written as
P(A, given B) = P(A | B) = P(A ∩ B)/P(B), provided P(B) > 0.
Similarly,
P(B | A) = P(A ∩ B)/P(A), provided P(A) > 0.
⟹ P(A | B) ∙ P(B) = P(A ∩ B)
P(B | A) ∙ P(A) = P(A ∩ B)
P(A | B) ∙ P(B) = P(B | A) ∙ P(A)
Example 8
A and B are two events such that P(B | A) = 0.4,
P(A) = 0.25 and P(B) = 0.2. Find
a. P(A | B)
b. P(A ∩ B)
c. P(A ∪ B)
Solution
a. P(B | A) ∙ P(A) = P(A | B) ∙ P(B)
0.4 × 0.25 = P(A | B) ∙ 0.2
0.1 = P(A | B) ∙ 0.2
P(A | B) = 0.1/0.2 = 0.5
b. P(A ∩ B) = P(A | B) ∙ P(B)
= 0.5 × 0.2
= 0.1
c. P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
= 0.25 + 0.2 − 0.1
= 0.35
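The relation between the two conditional probabilities can be illustrated in a few lines. This is a sketch only: `reverse_conditional` is a hypothetical helper, and it assumes the given values are P(B | A) = 0.4, P(A) = 0.25 and P(B) = 0.2; the arithmetic is the same if the roles of A and B are swapped.

```python
def reverse_conditional(p_b_given_a, p_a, p_b):
    """Return (P(A and B), P(A | B), P(A or B)) from P(B | A), P(A) and P(B)."""
    p_a_and_b = p_b_given_a * p_a     # multiplication rule
    p_a_given_b = p_a_and_b / p_b     # definition of conditional probability
    p_a_or_b = p_a + p_b - p_a_and_b  # addition rule
    return p_a_and_b, p_a_given_b, p_a_or_b

p_a_and_b, p_a_given_b, p_a_or_b = reverse_conditional(0.4, 0.25, 0.2)
print(round(p_a_and_b, 2), round(p_a_given_b, 2), round(p_a_or_b, 2))
```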
Example 9
If P(B | A) = 2/5, P(A) = 1/4 and P(B) = 1/3, find
a. P(A | B)
b. P(A ∩ B)
Solution
a. P(A | B) ∙ P(B) = P(B | A) ∙ P(A)
P(A | B) × 1/3 = 2/5 × 1/4
= 2/20 = 1/10
P(A | B) = 1/10 × 3/1 = 3/10
b. P(A ∩ B) = P(B | A) ∙ P(A)
= 2/5 × 1/4 = 2/20 = 1/10
Independent (Conditional Events)
If A and B are independent, then;
P(A | B) = P(A)
P(B | A) = P(B)
P(A | B′) = P(A)
P(B | A′) = P(B)
Example 10
A and B are two independent events such that P(A) = 0.2 and P(B) = 0.15. Evaluate the
following probabilities;
a. P(A | B)
b. P(A ∩ B)
c. P(A ∪ B)
Solution
For independent events
a. P(A | B) = P(A) = 0.2
b. P(A ∩ B) = 0.2 × 0.15 = 0.03
c. P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
= 0.2 + 0.15 − 0.03 = 0.32
Example 11
A and B are exhaustive events and it is known that P(A | B) = 1/4 and P(B) = 2/3. Find P(A).
Solution
P(A | B) ∙ P(B) = P(A ∩ B)
1/4 × 2/3 = 1/6
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
1 = P(A) + 2/3 − 1/6
P(A) = 1 + 1/6 − 2/3 = 1/2 = 0.5
Example 12
A box contains 10 balls, of which 6 are red and 4 are blue. If 2 balls are randomly selected from
the box without replacement, what is the probability that both are red?
Solution
Let A be the event that the first ball drawn is “red”, and
B, the event that the second is “red”.
We are required to calculate P(A ∩ B).
The probability that the first ball drawn is red is
P(A) = 6/10 = 3/5
Given that the first ball drawn is red, 5 of the remaining 9 balls are red, so
P(B | A) = 5/9
P(A ∩ B) = P(A) P(B | A) = 3/5 × 5/9 = 1/3
Example 13
In a consignment of 40 manufactured items, 8 are known to be defective. Suppose three items are
drawn at random without replacement. What is the probability that all three in the sample are
defective?
Solution
Letting A1, A2 and A3 be the events “getting a defective item on the 1st, 2nd and 3rd draw”
respectively, the desired probability becomes
P(A1 ∩ A2 ∩ A3) = P(A1) P(A2 | A1) P(A3 | A1 ∩ A2)
= 8/40 × 7/39 × 6/38 = 7/1235
Use of Combinatorial Analysis
The application of the multiplication rule in solving probability problems may sometimes be
tedious or confusing. An easier approach is the application of a method of first principles, the
combinatorial analysis.
Example 14
Refer to Example 13.
(a) Solve the question using the combinatorial method.
(b) Calculate the probability that the sample contains just one defective.
Solution
The sample space for this problem is the set of all possible 3-tuples of items that could be
selected from 40 items, so that the sample space consists of 40C3 equally likely simple events.
(a) There are 8 defective items and so 3-tuples of defective items can be selected in 8C3
ways.
P(all three defective) = 8C3 / 40C3 = 7/1235
(b) Exactly one defective means 1 of the 8 defective items and 2 of the 32 good items.
P(exactly one defective) = (8C1 × 32C2) / 40C3 = 496/1235
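The combinatorial counts can be confirmed with exact integer arithmetic. The sketch below uses `math.comb` and `fractions.Fraction` from the Python standard library; the variable names are chosen only for this illustration of the 40-item consignment with 8 defectives.

```python
from fractions import Fraction
from math import comb

total_items, defective, sample_size = 40, 8, 3
good = total_items - defective

all_samples = comb(total_items, sample_size)  # 40C3 equally likely samples

p_all_defective = Fraction(comb(defective, 3), all_samples)
p_one_defective = Fraction(comb(defective, 1) * comb(good, 2), all_samples)

print(p_all_defective, p_one_defective)  # 7/1235 496/1235
```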
Total Probability
If A1, A2, …, An form a partition of the sample space S, then for any event B ⊂ S with P(B) > 0,
P(B) = P(A1 ∩ B) + P(A2 ∩ B) + … + P(An ∩ B)
= Σ P(Ai ∩ B), summed over i = 1, 2, …, n
Definition:
If A1, A2, …, An form a partition of the sample space S and B is an event defined on the same sample
space S such that P(B) > 0, then
P(B) = Σ P(Ai) P(B | Ai), summed over i = 1, 2, …, n
= P(A1) P(B | A1) + P(A2) P(B | A2) + … + P(An) P(B | An)
Bayes’ Theorem
Suppose A1, A2, …, An form a partition of the sample space S. Suppose also that the probabilities
P(Ai) > 0 (i = 1, 2, …, n) are known. Let B be any event in S such that P(B) > 0 and suppose
P(B | Ai) is also known for each i. Then
P(Ai | B) = P(Ai) P(B | Ai) / Σ P(Aj) P(B | Aj), summed over j = 1, 2, …, n
Or
P(Ai | B) = P(Ai) P(B | Ai) / [P(A1) P(B | A1) + P(A2) P(B | A2) + … + P(An) P(B | An)]
Before we use this theorem or rule, the following conditions must be present;
1. We are dealing with an experiment which can result in one of n mutually exclusive events,
A1, A2, …, An, such that the sample space S is given by S = A1 ∪ A2 ∪ … ∪ An.
2. It is given that the event B has occurred, with P(B) > 0.
3. We want to find the probability that one of the events A1, A2, …, An occurred given that
event B has occurred. That is, we want to find P(Ai | B), i = 1, 2, …, n.
Example 15
Suppose P(A1) = 0.20, P(A2) = 0.40, P(A3) = 0.40, P(B | A1) = 0.25, P(B | A2) = 0.05 and
P(B | A3) = 0.10. Use Bayes’ rule to find P(A1 | B).
Solution
P(A1 | B) = P(A1) P(B | A1) / [P(A1) P(B | A1) + P(A2) P(B | A2) + P(A3) P(B | A3)]
= (0.20)(0.25) / [(0.20)(0.25) + (0.40)(0.05) + (0.40)(0.10)]
= 0.050 / (0.050 + 0.020 + 0.040)
= 0.050/0.11 = 0.455
Example 16
It is given that only 60% of the students in Mr. Mensah’s class passed the mathematics test at first
sitting. Of those who passed, 80% prepared for the test and of those who failed, 20% prepared for
the test. What is the probability that a person who prepared for the test passed at first sitting?
Solution
Let us define the events
A1: passing the test at first sitting
A2: failing the test at first sitting
Then the prior probabilities are
P(A1) = 0.60, P(A2) = 0.40.
If we let B denote the event that a person prepared for the test, then the conditional probabilities
are P(B | A1) = 0.80, P(B | A2) = 0.20.
We are required to find the posterior probability P(A1 | B) that a person who prepared for the test
passed the test at first sitting.
Using Bayes’ rule for two mutually exclusive events, we have
P(A1 | B) = P(A1) P(B | A1) / [P(A1) P(B | A1) + P(A2) P(B | A2)]
= (0.60)(0.80) / [(0.60)(0.80) + (0.40)(0.20)]
= 0.48 / (0.48 + 0.08)
= 0.48/0.56 = 0.86
Example 17
A mechanical factory employs three machine operators to produce its brand of goods. Operator
A works 50% of the time, Operator B works 30% of the time, and Operator C, 20% of the time. Each
operator is prone to produce defective items. Operator A produces defective items 1% of the
time, Operator B produces defective items 5% of the time, and Operator C produces defective
items 7% of the time. If a defective item is produced, what is the probability that it was produced
by
(a) Operator A
(b) Operator B
(c) Operator C
Solution
Let us define the events
A1: the item is produced by Operator A,
A2: the item is produced by Operator B,
A3: the item is produced by Operator C, and
B: the item is defective.
The prior probabilities are P(A1) = 0.50, P(A2) = 0.30, P(A3) = 0.20.
We also know that P(B | A1) = 0.01, P(B | A2) = 0.05, P(B | A3) = 0.07.
The denominator in each case is
P(B) = P(A1) P(B | A1) + P(A2) P(B | A2) + P(A3) P(B | A3)
= (0.50)(0.01) + (0.30)(0.05) + (0.20)(0.07) = 0.034
(a) P(A1 | B) = P(A1) P(B | A1) / P(B) = (0.50)(0.01)/0.034 = 0.147
(b) P(A2 | B) = P(A2) P(B | A2) / P(B) = (0.30)(0.05)/0.034 = 0.441
(c) P(A3 | B) = P(A3) P(B | A3) / P(B) = (0.20)(0.07)/0.034 = 0.412
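Since every posterior shares the same denominator, Bayes' rule lends itself to a short routine. The following is a minimal sketch using the operator figures from Example 17; `bayes_posteriors` is a hypothetical helper name, and it assumes the priors form a partition (they sum to 1).

```python
def bayes_posteriors(priors, likelihoods):
    """Return P(Ai | B) for each i, given P(Ai) and P(B | Ai) over a partition."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(Ai) P(B | Ai)
    p_b = sum(joint)                                      # total probability of B
    return [j / p_b for j in joint]

# Operators A, B, C: share of working time and defect rates from Example 17.
posteriors = bayes_posteriors([0.50, 0.30, 0.20], [0.01, 0.05, 0.07])
print([round(p, 3) for p in posteriors])
```

The three posteriors add up to 1, which is a quick sanity check on any hand calculation of this kind.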
Question 1
Two events A and B are such that ( ) =
, ( )=
( ⁄ ) = . Calculate the
probability that
a. Both events occur
b. Only one of the two events occurs
c. Neither event occurs
Question 2
Three balls are drawn at random from a bag containing 15 green and 12 yellow balls. What is the
probability that
(a) all three are green?
(b) none is green?
(c) one is green and the other two are yellow?
Question 3
P(A1) = 0.20, P(A2) = 0.35, P(A3) = 0.45, P(B | A1) = 0.01, P(B | A2) = 0.05 and P(B | A3) = 0.01.
Use Bayes’ rule to find
(a) P(A1 | B)
(b) P(A2 | B)
(c) P(A3 | B)
Question 4
A mechanical factory employs three machine operators to produce its brand of goods. Operator
A works 50% of the time, Operator B works 40% of the time, and Operator C, 10% of the time. Each
operator is prone to produce defective items. Operator A produces defective items 1% of the
time, Operator B produces defective items 3% of the time, and Operator C produces defective
items 5% of the time. If a defective item is produced, what is the probability that it was produced
by
(a) Operator A
(b) Operator B
(c) Operator C
CHAPTER 3
ESTIMATES AND ESTIMATORS
Estimator: It is the rule or procedure (usually expressed as a formula) that is used to derive the
estimate. For example, x̄ = Σxi / n is the estimator of the population mean. An estimator is therefore
the process by which the estimate is obtained.
Estimate: An estimate is a numerical result of the estimator. For example, if the value of the
estimator x̄ is say 15, then 15 is the estimate of the population mean.
Properties of estimators
The four properties of good estimators are unbiasedness, efficiency, consistency and sufficiency.
Unbiased Estimator: An estimator is unbiased if the mean of its sampling distribution equals the
corresponding parameter. For example, if theta (θ) is a parameter we are trying to estimate, then
“theta hat” (θ̂) is an unbiased estimator if its mean or expected value E(θ̂) = θ. Furthermore, since
E(x̄) = μ_x̄ = μ, x̄ is an unbiased estimator of μ. The measure of bias is the difference
between the mean of θ̂ and θ. Hence, if E(θ̂) − θ ≠ 0, then θ̂ is a biased estimator of θ.
Efficient Estimator: Given any unbiased estimators, the most efficient estimator is the one with
the smallest variance. For example, let θ̂1 and θ̂2 be two unbiased estimators of θ. θ̂1 is the more
efficient estimator if, in repeated sampling with a given size, its variance is less than that of θ̂2.
Consistent Estimator: An estimator is consistent when, as n increases, the value of the statistic
approaches the parameter. An unbiased estimator is consistent if its variance approaches zero as
n increases. Thus, since σ²_x̄ = σ²/n, as n gets larger σ²_x̄ will approach zero
and therefore x̄ is said to be a consistent estimator of μ. This also suggests that if a statistic is not
a consistent estimator, taking a larger sample to improve the estimate will be fruitless.
Sufficient Estimator: An estimator is said to be sufficient if no other estimator could provide
more information about the parameter. Furthermore, an estimator is sufficient if it uses all relevant
information about the parameter contained in the sample. If an estimator is sufficient, nothing can
be gained by using another estimator.
ESTIMATION OF PARAMETERS
For the purpose of this course, we shall look at point estimation of the population mean, proportion
and standard deviation, and at interval estimation.
Point Estimation
If a single number is used to approximate the true value of a parameter, the number is called a
point estimate of the parameter. This is usually done since there may be no prior knowledge of the
parameter before the study of the population.
Point Estimate of the Population Mean
An engineering student was interested in the diameters of 2-inch nails. The mean of the population
was μ. Since it was not easy to work with all 2-inch nails in the world, he took a sample with
mean x̄ to estimate the population parameter, relying on the unbiasedness of x̄.
Example: the manager of an automobile company wishes to determine the number of vehicles to
order each week. If he orders too many, there will be a problem of space; if too few, the supply
may run out. To get some idea of the number required, a sample of twelve randomly selected weeks
is obtained, and the number of vehicles sold per week recorded as shown: 10, 12, 7, 7, 9, 9, 11, 15,
13, 6, 7, 8. It is not possible to obtain the true value of the population mean because the population
is infinite. Here, the sample mean will be the best estimate for the population mean. Thus
x̄ = Σxi / n = 114/12 = 9.5
Point Estimation of the population proportion
The population parameter P is the ratio of the number X in the population with the attribute to the
population size N. This is given by P = X/N. Since P is unknown, an estimate (the sample proportion) is
used. This is given by p̂ = x/n, i.e. the ratio of the number x in the sample with the attribute to the
sample size n. Note that the sample proportion is also an unbiased estimator of the population
proportion.
For example, suppose in Ghana, 1000 Filling Stations were selected and, after taking their fuels
through testing, it was found that 500 of them sell impure fuels to drivers. Then the sample
proportion (p̂) is given by p̂ = 500/1000 = 0.50 or 50%
Interpretation: It is estimated that 50% of the fuel dealers in Ghana sell impure fuels to their
customers. If the population of Filling Stations in Ghana is 5000, we may estimate that 2500
Filling Station operators in Ghana sell impure fuel to customers.
Point Estimation of the Population Standard Deviation
The sample estimate for the population variance is s² = (1/n) Σ (xi − x̄)², where xi is the sample
observation, i = 1, 2, 3, …, n, and the sum runs from i = 1 to n. For the sample variance to be an
unbiased estimator of the population variance, we correct the denominator by replacing n with n − 1.
Hence the unbiased estimator of the population variance is given by
s² = (1/(n − 1)) Σ (xi − x̄)².
Also
s = √[(1/(n − 1)) Σ (xi − x̄)²].
Hence the positive square root of this variance is used as the estimator of the population
standard deviation. Note that for large sample sizes, there is little or
no difference between s² = (1/n) Σ (xi − x̄)² and s² = (1/(n − 1)) Σ (xi − x̄)².
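To make the two denominators concrete, the sketch below computes the point estimates for the twelve weekly vehicle sales quoted in the point estimation example. It relies on Python's standard `statistics` module, where `pvariance` divides by n and `variance` divides by n − 1; the variable names are illustrative only.

```python
from statistics import mean, pvariance, variance, stdev

# Weekly vehicle sales from the point-estimation example (12 weeks).
sales = [10, 12, 7, 7, 9, 9, 11, 15, 13, 6, 7, 8]

x_bar = mean(sales)              # point estimate of the population mean
var_n = pvariance(sales)         # divides by n
var_n_minus_1 = variance(sales)  # divides by n - 1 (unbiased for the variance)
s = stdev(sales)                 # square root of the n - 1 version

print(x_bar, round(var_n, 3), round(var_n_minus_1, 3), round(s, 3))
```

With only 12 observations the two variance estimates differ noticeably (85/12 against 85/11); the gap shrinks as n grows.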
Interval estimation
Point estimates are the most commonly used. However, a point estimate is not expected to coincide
exactly with the parameter it is intended to estimate, even if it is unbiased, and it is unable to tell
us the size of the error of the estimation. Interval estimates are therefore more suitable.
An interval estimate (confidence interval) is an interval of numbers used to estimate or
approximate the true value of a parameter θ. To estimate the true value of θ, an interval with
end points L1 and L2 is determined with confidence coefficient 1 − α, such that
P(L1 ≤ θ ≤ L2) = 1 − α. For example, a confidence coefficient of 0.95 means that if 100 different
samples are drawn and for each sample an interval estimate for the unknown parameter θ is
calculated, then about 95 of these confidence intervals would be expected to include θ. Hence, we are 95%
sure that the parameter will be between L1 and L2.
Confidence Interval on the Mean (Variance Known)
Consider a random sample of size n taken from a normal population with mean μ and variance
σ². Then a 100(1 − α)% confidence interval on μ is given by
x̄ − Z_{α/2} · σ/√n ≤ μ ≤ x̄ + Z_{α/2} · σ/√n
Or simply x̄ ± Z_{α/2} · σ/√n, where x̄ is the sample mean and σ/√n is the standard error of the mean.
Example 1
If n = 25, x̄ = 9 and σ = 3, find the 95% confidence interval on the mean μ.
Solution
100(1 − α)% = 95% ⟹ α = 0.05 or 5%.
Z_{α/2} = Z_{0.05/2} = Z_{0.025} = 1.96 from the table.
But we know that x̄ − Z_{α/2} · σ/√n ≤ μ ≤ x̄ + Z_{α/2} · σ/√n
Therefore 9 − 1.96(3/√25) ≤ μ ≤ 9 + 1.96(3/√25)
⟹ 7.82 ≤ μ ≤ 10.18
Interpretation: We are 95% sure that the true mean will lie between 7.82 and 10.18.
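The known-variance interval can be written as a small function. This is a sketch under the assumption that σ is known and the population is normal; `z_confidence_interval` is a hypothetical name, and the critical value comes from `NormalDist.inv_cdf` instead of the table.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval on the mean when the population sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}
    margin = z * sigma / sqrt(n)             # Z_{alpha/2} * standard error
    return x_bar - margin, x_bar + margin

lower, upper = z_confidence_interval(x_bar=9, sigma=3, n=25)
print(round(lower, 2), round(upper, 2))  # 7.82 10.18
```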
Confidence Interval on the Mean (Variance Unknown)
When the value of the population standard deviation is unknown, we replace it with the
corresponding sample standard deviation s. The statistic t = (x̄ − μ0)/(s/√n) then follows the
t-distribution with n − 1 degrees of freedom. The confidence interval on the mean therefore becomes
x̄ ± t_{α/2} · s/√n. To read the t value from the table we use t_{α/2, (n − 1)}.
Example 2
With reference to Example 1, suppose the value of σ was unknown but its estimate was s = 2.8.
Find the 95% confidence interval on the mean μ.
Solution
Note that σ is not known.
n = 25, x̄ = 9, s = 2.8 and t_{0.025, (24)} = 2.064
x̄ − t_{α/2} · s/√n ≤ μ ≤ x̄ + t_{α/2} · s/√n
9 − 2.064(2.8/√25) ≤ μ ≤ 9 + 2.064(2.8/√25)
⟹ 7.84 ≤ μ ≤ 10.16
Therefore the confidence interval for the mean is 7.84 ≤ μ ≤ 10.16.
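The arithmetic for the t-based interval can be checked directly. In this sketch the critical value is passed in from the t-table, because the Python standard library has no t-distribution; `t_confidence_interval` is a hypothetical helper. With n = 25, x̄ = 9, s = 2.8 and t = 2.064 the margin is 2.064 × 0.56 ≈ 1.156, giving limits of roughly 7.84 and 10.16.

```python
from math import sqrt

def t_confidence_interval(x_bar, s, n, t_value):
    """Interval x_bar ± t * s / sqrt(n); t_value is read from a t-table."""
    margin = t_value * s / sqrt(n)
    return x_bar - margin, x_bar + margin

# t_{0.025, 24} = 2.064 is the table value quoted in Example 2.
lower, upper = t_confidence_interval(x_bar=9, s=2.8, n=25, t_value=2.064)
print(round(lower, 2), round(upper, 2))
```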
Confidence Interval on the Difference between Two Means
Given two independent random samples with means x̄1 and x̄2 and respective sizes n1 and n2,
from normal populations with means µ1 and µ2 and variances σ1² and σ2², we can determine the
confidence interval on µ1 − µ2 considering three separate conditions (cases).
Case 1
Where σ1² and σ2² are known, the confidence interval on the difference between the two population
means is given by:
(x̄1 − x̄2) − Z_{α/2} √(σ1²/n1 + σ2²/n2) ≤ µ1 − µ2 ≤ (x̄1 − x̄2) + Z_{α/2} √(σ1²/n1 + σ2²/n2)
Case 2
Where σ1² and σ2² are unknown but n1 and n2 are large, we use the sample variances S1² and S2² in
place of the population variances. The confidence interval on µ1 − µ2 is therefore given by
(x̄1 − x̄2) − Z_{α/2} √(S1²/n1 + S2²/n2) ≤ µ1 − µ2 ≤ (x̄1 − x̄2) + Z_{α/2} √(S1²/n1 + S2²/n2)
Example 3
Two samples of sizes 100 and 64 observations were drawn from independent normal populations
with variances 16 and 25 and sample means 10.8 and 9.6 respectively. Find a 95% confidence
interval on the difference between the two population means (µ1 − µ2).
Solution
n1 = 100, n2 = 64, x̄1 = 10.8, σ1² = 16, x̄2 = 9.6, σ2² = 25
The required equation is
(x̄1 − x̄2) ± Z_{α/2} √(σ1²/n1 + σ2²/n2)
= (10.8 − 9.6) ± 1.96 √(16/100 + 25/64)
= 1.2 ± 1.45
= (−0.25, 2.65)
Hence the confidence interval is
−0.25 ≤ µ1 − µ2 ≤ 2.65
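The interval for the difference of two means with known variances follows the same pattern as the one-sample case. A minimal sketch, with a hypothetical helper name and the figures of Example 3, might look like this:

```python
from math import sqrt
from statistics import NormalDist

def two_mean_z_interval(x1, x2, var1, var2, n1, n2, confidence=0.95):
    """Confidence interval on mu1 - mu2 when both population variances are known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z_{alpha/2}
    margin = z * sqrt(var1 / n1 + var2 / n2)
    diff = x1 - x2
    return diff - margin, diff + margin

lower, upper = two_mean_z_interval(10.8, 9.6, 16, 25, 100, 64)
print(round(lower, 2), round(upper, 2))
```

Because the interval contains zero, these data alone do not rule out equal population means at the 95% level.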
Case 3
Where σ1² and σ2² are unknown and n1 and n2 are small, the appropriate distribution to use is the
t-distribution. Here, the sample variance common to both samples (the pooled variance) is preferred.
If we assume that σ1² = σ2², the confidence interval on the difference between the population means
is given by
(x̄1 − x̄2) − t_{α/2} Sp √(1/n1 + 1/n2) ≤ µ1 − µ2 ≤ (x̄1 − x̄2) + t_{α/2} Sp √(1/n1 + 1/n2) ……… (1)
Where
Sp² = [(n1 − 1)S1² + (n2 − 1)S2²] / (n1 + n2 − 2)
and t_{α/2} is with n1 + n2 − 2 degrees of freedom.
Secondly, if no assumption of equality of σ1² and σ2² is made, then the confidence interval on the
difference between the population means is given by
(x̄1 − x̄2) ± t_{α/2} √(S1²/n1 + S2²/n2) ………………….. (2)
Note that the statistic used in (2) is approximately t-distributed with the degrees of freedom given by;
d.f =
It has been shown that if the sample sizes are equal, then the difference between equations 1 and 2 is
very small (negligible) and hence formula 1 is suitable.
TRY
Two independent samples of sizes n1 = 16 and n2 = 10 from normal populations with unknown
standard deviations have sample means x̄1 = 23.4 and x̄2 = 18.2 and corresponding S1 = 3.5 and S2 =
4.8. Find a 90% confidence interval on (µ1 − µ2)
a. Assuming that the population standard deviations are equal
b. Assuming that the population standard deviations are not equal.
ANS: a) 2.41 ≤ µ1 − µ2 ≤ 7.99 b) 1.67 ≤ µ1 − µ2 ≤ 8.73
Choosing an Appropriate Sample Size
This is essential in any statistical study or analysis. It deals with the sample size to take so that
your results will be close to the population parameter, the sample thereby being representative of
the population of study. This is important because a sample that is too large wastes money in
terms of data collection, and a sample that is too small results in conclusions which are uncertain,
that is, in bigger standard errors.
The Correct Sample Size Depends On 3 Factors:
1. The level of confidence desired. Very often 95% and 99% confidence levels are selected,
leading to Z-values of ±1.96 and ±2.58 respectively. The higher the level of confidence,
the larger the size of the sample.
2. The margin of error the researcher will tolerate. The maximum allowable error E is the
amount that is added to and subtracted from the sample mean to determine the end points or
limits of the confidence interval. It is the amount of error the researcher is willing to
tolerate. A small allowable error means one should take a large sample, and a large allowable
error permits a smaller sample.
3. The variability in the population being studied. This deals with the population standard
deviation. If the population is widely dispersed, a large sample is needed. If the
population is concentrated or homogeneous (i.e. not widely dispersed), a smaller sample is
required. However, an estimate of the population standard deviation may be necessary, and
these are the options to consider:
a) Use the comparable study approach when there is an estimate of dispersion
available from another study. Fall on that to get an idea of the rough sample size
and estimate from that.
b) If that is not available, a range-based approximation can be used. That is, nearly all
observations lie within ±3 standard deviations of the mean if the distribution is
normal, so the distance between the largest and the smallest values is about 6σ and σ
is roughly the range divided by 6.
c) A pilot survey can help estimate the standard deviation. This helps to
test the validity of our questionnaire too. Here a very small sample is taken and the
standard deviation computed.
From the 3 factors:
E = Z · s/√n
∴ Sample size for estimating a mean is
n = (Z · s/E)²
where
n is the size of the sample
Z is the standard normal value corresponding to the desired level of confidence
s is an estimate of the population standard deviation
E is the maximum allowable error
To determine the sample size for a proportion, three items are needed:
1. The desired level of confidence, usually 95% or 99%
2. The margin of error allowed in estimating the population proportion.
3. An estimate of the population proportion.
Determination of Sample Size for Estimation:
We have observed from previous discussions on estimation that the accuracy or
precision of an estimate depends to a large extent on the sample size. Hence, a large sample
size ensures precise estimation of the confidence interval on the mean. To determine the least
sample size with a 100(1 − α)% confidence interval, we use the formula
n ≥ (Z_{α/2} · σ/d)²
where d = Z_{α/2} · σ/√n is the distance between the center of the confidence interval and the upper
confidence bound. Note also that n is always a whole number, so the result is rounded up.
Example 3.4
To estimate the mean height of males in a certain community to within 2cm with 99% confidence,
where the standard deviation is 6cm, the minimum sample size is given by
Z_{0.005} = 2.58
⟹ n ≥ (Z_{α/2} · σ/d)²
n ≥ (2.58 × 6/2)²
n ≥ 59.91 ⟹ n = 60
Thus, the sample size required to estimate the mean height of males to within 2cm with 99%
confidence is 60.
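The rounding-up step is easy to overlook, so here is a short sketch of the sample size formula. `min_sample_size` is a hypothetical helper; it takes the table value of Z as an input and uses `math.ceil` to move to the next whole number.

```python
from math import ceil

def min_sample_size(z, sigma, d):
    """Smallest whole n with n >= (z * sigma / d) ** 2."""
    return ceil((z * sigma / d) ** 2)

# Table value Z_{0.005} = 2.58, sigma = 6 cm, d = 2 cm, as in Example 3.4.
n = min_sample_size(z=2.58, sigma=6, d=2)
print(n)  # 60
```

Halving the allowable error d to 1cm would roughly quadruple the requirement, since n grows with 1/d².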
HYPOTHESIS TESTING
A hypothesis is a statement which is yet to be proved true or otherwise. In hypothesis testing, an
idea concerning a parameter is available before the study and the purpose the study and the purpose
of the study is to collect data to confirm or otherwise the stated idea. There are two types of
hypothesis:
a. The hypothesis available before the research and
b. Its negative
The hypothesis before the research is conducted is called the null hypothesis and it is usually
denoted by H0. Whereas is negative is called the alternative hypothesis and it is denoted by H1.
In fact, the purpose of hypothesis testing is to reject or refute the null hypothesis (H0). Hence H0
can either be rejected or we fail to reject H (i.e. accept).
There are four possible decisions involved and we therefore summaries then on the table:
H0 is true
Reject H0
Type I error
Fail to reject H0
Correct decision
The following conclusions are arrived at




H0 is false
Correct decision
Type II error
Reject H0 when it is true (wrong decision) – type I error
Reject H0 when it is false (correct decision)
Fail to reject H0 when it is true (correct decision)
Fail to reject H0 when it is false (wrong decision)- type II error.
Note that our decision to reject the null hypothesis or otherwise will be based on the test statistic,
the value calculated from the sample data. There are two ways of choosing between H0 and H1.
One way is to find the rejection region (critical region) of the test. Thus, if the calculated value of
the test statistic falls in the rejection region (for an upper-tailed test, if it is greater than the
“table value”), then H0 is rejected; otherwise it is not. The second way is to calculate the P-value
of the test. The P-value is the probability, computed under the null hypothesis, of obtaining a test
statistic at least as extreme as that observed. H0 is rejected for a “small” P-value, that is, P < 0.05.
If H0 cannot be rejected, we conclude that there is not enough evidence to reject H0.
The use of a particular test of hypothesis depends to a large extent on the nature of the data (either
quantitative or qualitative).
Test for equality in Means:
1. If we wish to test the H0 that µ = µ0, three alternatives can be considered as follows:
i) H0: µ = µ0, H1: µ ≠ µ0
ii) H0: µ = µ0, H1: µ < µ0
iii) H0: µ = µ0, H1: µ > µ0
Note that
i. is called a two-sided or two-tailed test (non-directional), and
ii. and iii. are called one-sided or one-tailed tests (directional).
Suppose that the population we are sampling from is normal and σ is known. The test statistic to
use is the normal deviate (Z). Thus
\[ Z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \]
Using an α level of significance, the critical regions for testing the null hypothesis against each
alternative hypothesis are summarized in the table below:

H1          Reject H0 if
µ < µ0      Z < −Zα
µ > µ0      Z > Zα
µ ≠ µ0      Z < −Zα/2 or Z > Zα/2
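To use the P-value instead of the critical value, we find it (for the alternative µ > µ0) using the
formula below.
\[ P = P\left(Z \ge \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}\right) \]
i.e. the P-value is the probability of obtaining a Z-value at least as large as the one calculated.
We reject H0 if the P-value < 0.05; otherwise we fail to reject H0.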
Suppose that the population we are sampling from is normally distributed, σ is unknown and
the sample size is small. Then for H0: µ = µ0 against the alternatives, the appropriate test statistic
to use is the t-statistic, given by
\[ t = \frac{\bar{x} - \mu_0}{S/\sqrt{n}} \]
with (n − 1) degrees of freedom.
Example 5
Suppose we wish to test H0: µ = 10 against the one-sided alternative H1: µ > 10 at α = 0.05, given
that n = 64, $\bar{x}$ = 10.3 and S = 4. We proceed as follows.
Solution
Since n is large (i.e. n > 30), we use the Z statistic. Hence
\[ Z = \frac{10.3 - 10}{4/\sqrt{64}} = \frac{0.3}{0.5} = 0.60 \]
Now Z0.05 = 1.65.
Using method 1, we compare the Z and Zα values; since Z = 0.60 < Z0.05 = 1.65, we fail to reject
H0: there is not enough evidence that µ exceeds 10.
Using method 2, we can find the P-value as follows:
\[ P = P\left(Z \ge \frac{\bar{x} - \mu_0}{S/\sqrt{n}}\right) = P\left(Z \ge \frac{10.3 - 10}{4/\sqrt{64}}\right) \]
P-value = P(Z ≥ 0.60) = 1 − 0.7257 = 0.2743.
Since P-value = 0.2743 > 0.05, we have no evidence to reject H0.
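Both methods of Example 5 can be reproduced with the Python standard library. The sketch below is illustrative: the helper names `z_statistic` and `upper_tail_p` are assumptions, and the upper-tail probability is obtained from the complementary error function rather than from the printed table.

```python
import math

def z_statistic(sample_mean: float, mu0: float, sd: float, n: int) -> float:
    """Z = (x_bar - mu0) / (sd / sqrt(n))."""
    return (sample_mean - mu0) / (sd / math.sqrt(n))

def upper_tail_p(z: float) -> float:
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

z = z_statistic(sample_mean=10.3, mu0=10.0, sd=4.0, n=64)
p_value = upper_tail_p(z)

print(round(z, 2), round(p_value, 4))
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```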
Comparing Two Independent Means (Independent Data):
Given two independent random samples from two populations with sample means $\bar{x}_1$ and $\bar{x}_2$
and sample sizes n1 and n2 respectively, such that their respective population parameters are µ1 and µ2,
$\sigma_1^2$ and $\sigma_2^2$, we can test H0: µ1 = µ2 against H1: µ1 ≠ µ2 or H1: µ1 < µ2 or H1: µ1 > µ2
under some assumptions about the populations.
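Assumption 1
That $\sigma_1^2$ and $\sigma_2^2$ are known. Then
\[ Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \]
Assumption 2
That $\sigma_1^2$ and $\sigma_2^2$ are not known and the sample sizes are large. Then,
\[ Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} \]
Assumption 3
That $\sigma_1^2$ and $\sigma_2^2$ are not known but n1 and n2 are small. This again provides two
scenarios, as seen in previous lessons.
Scenario A
When the population variances $\sigma_1^2$ and $\sigma_2^2$ are assumed equal,
\[ t = \frac{\bar{x}_1 - \bar{x}_2}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \]
where
\[ S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2} \]
with n1 + n2 − 2 degrees of freedom.
Scenario B
When $\sigma_1^2$ and $\sigma_2^2$ are assumed not equal,
\[ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} \]
where the degrees of freedom are usually approximated by
\[ d.f. = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{(S_1^2/n_1)^2}{n_1 - 1} + \dfrac{(S_2^2/n_2)^2}{n_2 - 1}} \]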
Example 3.6
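Test H0: µ1 = µ2 against H1: µ1 ≠ µ2 at 5% significance level when n1 = 100, n2 = 64, $\bar{x}_1$ = 10.8,
$\bar{x}_2$ = 9.6, $\sigma_1^2$ = 16 and $\sigma_2^2$ = 25
Solution:
We know from our previous discussion that
\[ Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{10.8 - 9.6}{\sqrt{\dfrac{16}{100} + \dfrac{25}{64}}} = \frac{1.2}{0.742} = 1.62 \]
\[ Z_{\alpha/2} = Z_{0.025} = 1.96 \]
Hence we cannot reject the null hypothesis, since Z = 1.62 does not fall in the rejection region, as shown
in the diagram below.
[Diagram: standard normal curve with the acceptance region between −1.96 and 1.96 and the critical/rejection regions in both tails]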
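The Z statistic of Example 3.6 can be verified with the sketch below. The function names are illustrative, and the two-sided P-value is an addition that the worked solution itself does not compute.

```python
import math

def two_sample_z(mean1, mean2, var1, var2, n1, n2):
    """Z statistic for H0: mu1 = mu2 using the given variances."""
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    return (mean1 - mean2) / standard_error

def two_sided_p(z):
    """P(|Z| >= |z|) for a standard normal variable."""
    return math.erfc(abs(z) / math.sqrt(2))

z = two_sample_z(10.8, 9.6, 16, 25, 100, 64)
p_value = two_sided_p(z)

print(round(z, 2))        # compared with the critical value 1.96
print(abs(z) > 1.96)      # True would mean "reject H0"
```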
Example 3.7
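Test H0: µ1 = µ2 against H1: µ1 > µ2 at 10% significance level when n1 = 16, n2 = 10, $\bar{x}_1$ = 24.4,
$\bar{x}_2$ = 18.2, S1 = 3.5 and S2 = 4.8
Solution:
If we assume that the population variances are equal, then we find the pooled standard deviation (Sp)
and t as follows:
\[ S_p = \sqrt{\frac{(16 - 1)(3.5)^2 + (10 - 1)(4.8)^2}{16 + 10 - 2}} = \sqrt{\frac{391.11}{24}} = 4.04 \]
\[ t = \frac{24.4 - 18.2}{4.04\sqrt{\dfrac{1}{16} + \dfrac{1}{10}}} = \frac{6.2}{1.629} = 3.81 \]
\[ t_{0.10,\,24} = 1.318 \]
Since the calculated t-value is greater than the critical t-value, we reject the null hypothesis and
conclude that µ1 > µ2.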
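The pooled-variance calculation of Example 3.7 is easy to get wrong by hand, so a short check is useful. This is a sketch with an illustrative function name (`pooled_t`); the critical value 1.318 is taken from the example, not computed.

```python
import math

def pooled_t(mean1, mean2, s1, s2, n1, n2):
    """Pooled standard deviation, t statistic and degrees of freedom."""
    df = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df)
    t = (mean1 - mean2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return sp, t, df

sp, t, df = pooled_t(24.4, 18.2, 3.5, 4.8, 16, 10)
T_CRITICAL = 1.318  # t(0.10, 24), as quoted in the example

print(round(sp, 2), round(t, 2), df)
print("reject H0" if t > T_CRITICAL else "fail to reject H0")
```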
Comparing two Means (Paired Data)
This concerns observations that occur in pairs, e.g. (x1, y1), (x2, y2), …, (xn, yn), such as
observations “before” and “after” an experiment on n individuals. Here, the problem
reduces to a single-sample test on the differences di = yi − xi.
Example 3.8
The data below are the weights before and after ten students were fed with a weight reducing diet.
Student       1   2   3   4   5   6   7   8   9   10
Before (xi)   69  50  61  72  78  66  75  89  86  54
After (yi)    66  49  63  70  71  65  75  88  87  51
Solution:
d = yi − xi: −3, −1, 2, −2, −7, −1, 0, −1, 1, −3
$\bar{d}$ = −1.5, Sd = 2.5
H0: µd = 0 against H1: µd < 0
\[ t = \frac{\bar{d}}{S_d/\sqrt{n}} = \frac{-1.5}{2.5/\sqrt{10}} = -1.897 \]
tα = t0.05,9 = 1.833
Since the calculated t-value (−1.897) is less than −1.833, it falls in the rejection region. We reject
the null hypothesis and conclude that there is a significant reduction in the weight of the ten students.
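The paired test of Example 3.8 can be reproduced from the raw data. The sketch below uses only the standard library; variable names are illustrative. Because it keeps the unrounded standard deviation (about 2.51 rather than 2.5), the t-value comes out as roughly −1.89 instead of −1.897, and the conclusion is the same.

```python
import math
from statistics import mean, stdev

before = [69, 50, 61, 72, 78, 66, 75, 89, 86, 54]
after = [66, 49, 63, 70, 71, 65, 75, 88, 87, 51]

# Differences d_i = y_i - x_i (after minus before)
d = [y - x for x, y in zip(before, after)]
d_bar = mean(d)
s_d = stdev(d)                      # sample standard deviation (n - 1 divisor)
t = d_bar / (s_d / math.sqrt(len(d)))

T_CRITICAL = 1.833                  # t(0.05, 9), as quoted in the example
print(d)
print(round(d_bar, 2), round(s_d, 2), round(t, 3))
print("reject H0" if t < -T_CRITICAL else "fail to reject H0")
```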
Question 1
The mean length of a small counterbalance is 43mm. There is concern that the adjustments of
the machine producing the bars have changed. Twelve bars were selected at random and their
lengths recorded. The lengths (in millimetres) are:
42, 39, 42, 45, 43, 40, 39, 41, 40, 42, 43 and 42. At the 0.02 level of
significance,
(a) has there been a statistically significant change in the mean length of the bars?
(b) calculate the confidence interval for the mean.
(c) comment on your results in (a) and (b) above
CHAPTER 4
PROCESS CONTROL
Process control is concerned with controlling the quality of goods being manufactured in the
production process. Its object is to determine whether the production process is going on as
desired, turning out product units of the requisite standard. This is achieved through the use of
several control charts.
A process is said to be in control when its results in future are expected to be the same as they
have been in the past. Technically, the process is said to be under control if the means of sample
lots, $\bar{X}$, are within the control limits around the grand mean $\bar{\bar{X}}$.
The process is said to have gone out of control if there has been a change in the process mean from
the population mean to some other value. A sample average falling outside the control limits
suggests strongly that the process is out of control, that is, an assignable cause rather than a chance
cause has created the difference.
Control Charts
Control Charts serve the purpose of alerting those responsible to the possibility that a process is
not working as expected. In doing so, the control charts make use of measurements of a particular
dimension, performance characteristic, or other continuously scaled variable such as the diameter
of a bolt.
Control Charts may also be used on attributes, i.e. the fraction defective within a sample or simply a
count of the number of defectives in a sample. All control charts are prepared, more or less, on the
basis of the same statistical technique.
They are graphic devices for detecting unnatural patterns of variation in data resulting from
repetitive processes. Usually the standards to which the quality of products must conform are
specified. These standards also specify limits within which the quality of a product must lie. Thus
there are two control limits, viz. the upper control limit (UCL) and the lower control limit (LCL).
At regular intervals, random samples are taken and the relevant data plotted on the
graph. If the sample points are within the control limits (though they may not all be on the central
or standard line), then no corrective action is called for and the process is said to be under
control.
But if the sample points fall outside the control limits, then the process is said to
have gone out of control and in such a situation, the officer concerned must inspect, examine and
set right the process.
A control chart is a device for recording the said characteristics on a continuous chart in which the
horizontal scale is either time, if samples are taken at regular intervals, or simply batch serial
numbers. It consists of a horizontal central line which represents a mean value of the variable, a
UCL and an LCL, as shown in the following typical control chart:
[Figure: typical control chart. Vertical axis: quality scale (variable or attribute); horizontal axis: sample number or batch serial number (1 to 13); horizontal lines for the UCL, the central line (mean) and the LCL]
Once the chart as shown above is prepared, it is possible to plot on it the values of
succeeding samples, whether these are sample means, sample ranges, fraction defective in the
sample or number of defectives. If the points fall within the limits, then everything is in order so
far as the given process is concerned, but if a point falls outside the limits then there is reason to
think that there is something wrong with the process under consideration.
Control charts are generally developed as stated above. A large number of samples are taken
randomly and the mean value is worked out. The limits are then set in accordance with the
formulae described below for different control charts.
Control Charts for variables
There are two control charts often used for variables: the $\bar{X}$-chart and the R-chart. Control
charts for variables make use of actual measurements of items in samples of size n, treating a single
variable.
1. $\bar{X}$-Chart (Control Chart for mean)
This is constructed as follows.
For each sample, the mean ($\bar{X}$) is determined as
\[ \bar{X} = \frac{\sum X_i}{n}, \quad i = 1, 2, 3, \ldots, n \]
and the range (R) is worked out as
\[ R = X_{\max} - X_{\min} \]
To establish the chart, ‘k’ samples of size ‘n’ are first taken, and the mean of the sample means,
which serves as the central line, is given by
\[ \bar{\bar{X}} = \frac{\sum \bar{X}_i}{k}, \quad i = 1, 2, 3, \ldots, k \]
It is also necessary to compute the mean range, $\bar{R}$, of the ‘k’ samples as below.
\[ \bar{R} = \frac{\sum R_i}{k}, \quad i = 1, 2, 3, \ldots, k \]
The control limits of the $\bar{X}$-chart then are
\[ UCL = \bar{\bar{X}} + A_2\bar{R} \quad \text{and} \quad LCL = \bar{\bar{X}} - A_2\bar{R} \]
where A2 is obtained from tables.
In case the population mean (µ) and standard deviation (σ) are given, then the $\bar{X}$-chart is
constructed as under:
The central line of the $\bar{X}$-chart lies at µ and the control limits are worked out as follows:
\[ UCL = \mu + \frac{3\sigma}{\sqrt{n}} \quad \text{and} \quad LCL = \mu - \frac{3\sigma}{\sqrt{n}} \]
Alternatively, the limits can as well be stated as µ ± Aσ. In each case $A = 3/\sqrt{n}$, and the value
of A, which depends upon the sample size ‘n’, can be read from the tables.
2. R-Chart (Range Chart)
This is designed to ensure that the variability within samples does not exceed specified limits. It
is constructed as follows:
Central line = $\bar{R}$ and control limits are:
\[ UCL = D_4\bar{R} \quad \text{and} \quad LCL = D_3\bar{R} \]
where the coefficients D3 and D4 depend upon the sample size ‘n’. They are read from tables. If
$D_3\bar{R}$ comes out to be a negative value, then it is taken as zero. In case the standard deviation
(σ) is specified, then the R-chart is constructed as under:
Central line = $d_2\sigma$
and the control limits are:
\[ UCL = D_2\sigma \quad \text{and} \quad LCL = D_1\sigma \]
where σ is the value of the standard deviation, and d2, D1 and D2 are values found from tables.
Control Charts for attributes
Control charts for attributes are prepared when the quality is expressed as either good or
defective. In case of attributes the following control charts are generally used:
1. fraction defective chart (or the ‘p’ chart)
2. number defective chart (or the ‘np’ chart)
3. number of defects per unit chart (or the ‘c’-chart)
3. p-chart or Fraction defective chart
This is based on the binomial distribution and it is constructed as follows.
Samples of size n are taken at intervals and in each of them the fraction defective p = c/n is
computed, where c is the number of defective items. The mean fraction defective ($\bar{p}$) is used
in the p-chart, in which central line = $\bar{p}$ and the control limits are:
\[ UCL = \bar{p} + 3\sigma_p \quad \text{and} \quad LCL = \bar{p} - 3\sigma_p \]
where $\bar{p}$ is the mean value of p and must be obtained from a large number k of initial samples.
For each of these, pi is determined, which leads to
\[ \bar{p} = \frac{\sum p_i}{k}, \quad i = 1, 2, \ldots, k, \qquad \sigma_p = \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} \]
Before using the p-chart, it is important to plot on it all the k points that have gone into its
preparation and to eliminate all those points that fall outside the limits, finally recomputing
$\bar{p}$ using only samples whose value of p has fallen within the limits. It may as well be noted
that in order to be able to use the statistical theory underlying such charts, it is essential that n > 60.
4. ‘np’ –chart or number defective chart
It is possible to use the number of defectives in control charts rather than the fraction of defectives
and accordingly one can prepare number defective chart or what is known as the ‘np’-chart. Such
a chart shows the actual number of defectives found in each sample. If the sample size is constant,
then the plotting of the actual number of defectives may be more convenient than the fraction
defectives. There is not much difference between the ‘np’-chart and the ‘p’-chart.
5. ‘c’-chart or number of defects per unit chart
This chart is based on the Poisson distribution. The ‘c’-chart is used for the control of the number
of defects observed per unit and is useful in many situations in industry. If a constant number of
units is inspected per sample and the number of defects found in each sample is ascertained, then
the number of defects may be assumed to follow the Poisson distribution. When constructing the
chart, k samples each of size n are taken, the number of defects, c, for each item is determined,
and then the mean number of defects is obtained as follows.
\[ \bar{c} = \frac{\sum c_i}{\sum n_i}, \quad i = 1, 2, \ldots, k \]
This mean value $\bar{c}$ is the central line of the ‘c’ chart and the control limits are worked out as below:
\[ UCL = \bar{c} + 3\sqrt{\bar{c}} \]
\[ LCL = \bar{c} - 3\sqrt{\bar{c}} \]
Example 1
A manufacturer of rope selects six samples of five ropes each and tests the breaking strength of
each. Construct an $\bar{X}$-chart and an R-chart for the data shown in Table 2.2.

Sample   Breaking strength (pounds)   Mean    R
1        46   47   45   46   47       46.2    2
2        50   51   52   53   49       51.0    4
3        48   51   50   50   49       49.6    3
4        52   50   49   50   51       50.4    3
5        51   47   46   48   47       47.8    5
6        49   51   50   51   52       50.6    3
Total                                 295.6   20

Using
\[ \bar{\bar{X}} = \frac{\sum \bar{X}_i}{k} = \frac{46.2 + 51.0 + \cdots + 50.6}{6} = \frac{295.6}{6} = 49.3 \]
Also,
\[ \bar{R} = \frac{\sum R_i}{k} = \frac{2 + 4 + \cdots + 3}{6} = \frac{20}{6} = 3.33 \]
The mean and the range for each sample are shown in the last two columns of the table. The grand
mean and the mean range are computed. With this information, the UCL and LCL for $\bar{X}$ can be
determined. Since each sample size is n = 5, Table 3 in Appendix I reveals A2 to be 0.577. Then:
\[ UCL = \bar{\bar{X}} + A_2\bar{R} = 49.3 + (0.577)(3.33) = 51.22 \]
\[ LCL = \bar{\bar{X}} - A_2\bar{R} = 49.3 - (0.577)(3.33) = 47.38 \]
Figure 2: $\bar{X}$-Chart for Rope Manufacturer (UCL = 51.22, LCL = 47.38)
Notice that the mean for subgroup 1 (46.2) reveals that the process is out of control. The mean has
fallen below the LCL, indicating the presence of assignable cause variation.
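The rope example can be reproduced directly from the raw measurements. The following is a minimal sketch, not a full charting tool: the function name `xbar_chart` is illustrative, and the constant A2 = 0.577 for n = 5 is the tabulated value quoted in the example. Because the code keeps unrounded intermediate values (grand mean 49.27, mean range 3.33), its limits come out as about 51.19 and 47.34 rather than 51.22 and 47.38; the out-of-control subgroup is the same.

```python
from statistics import mean

A2_N5 = 0.577  # tabulated constant for samples of size 5

samples = [
    [46, 47, 45, 46, 47],
    [50, 51, 52, 53, 49],
    [48, 51, 50, 50, 49],
    [52, 50, 49, 50, 51],
    [51, 47, 46, 48, 47],
    [49, 51, 50, 51, 52],
]

def xbar_chart(samples, a2):
    """Return grand mean, mean range, LCL, UCL and out-of-control sample numbers."""
    means = [mean(s) for s in samples]
    ranges = [max(s) - min(s) for s in samples]
    grand_mean = mean(means)
    r_bar = mean(ranges)
    ucl = grand_mean + a2 * r_bar
    lcl = grand_mean - a2 * r_bar
    # Sample numbers are 1-based, as in the table
    out = [i + 1 for i, m in enumerate(means) if m > ucl or m < lcl]
    return grand_mean, r_bar, lcl, ucl, out

grand_mean, r_bar, lcl, ucl, out = xbar_chart(samples, A2_N5)
print(round(grand_mean, 2), round(r_bar, 2), round(lcl, 2), round(ucl, 2), out)
```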
Example 4.2
Now, consider the problem faced by Okraku, the director of quality control measures for Taroxy
Industries. His plant produces frames for desktop computers which must meet certain size
specifications. To ensure these standards are met, Okraku collects k = 24 samples (subgroups),
each of size n = 6, and measures their width. The results are reported in the table below:
Sample
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
15.2
16.2
15.6
18.5
17.5
14.3
15.4
18.0
14.2
15.7
14.8
16.8
15.2
15.4
18.4
16.5
15.2
16.8
13.5
19.8
18.7
17.5
14.9
18.7
Sample Measurements
14.5
15.4
16.5
15.9
15.4
15.9
15.2
15.2
16.5
15.9
16.2
15.9
14.8
15.7
15.2
16.8
15.7
14.5
14.2
14.5
15.9
16.5
14.8
15.4
15.2
15.4
15.8
14.2
14.5
14.4
16.2
14.8
15.6
14.5
16.1
15.7
16.5
14.5
14.8
16.8
14.5
16.5
14.9
15.8
15.8
15.2
15.8
15.7
15.9
14.5
15.1
15.9
15.7
16.8
15.3
14.8
15.7
15.9
14.8
15.5
16.8
15
15.7
16.9
16.9
16.8
17
17.1
17.2
18.9
18.5
18.5
17.6
18.7
21.1
17.2
14.5
20.8
19.2
19.2
17.9
18.7
20.8
18.4
18
18.2
2.2
14.2
18.9
20
16.8
16.2
17.9
17.4
18.7
17.2
16.2
14.5
16.2
14.2
15.2
14.8
15.7
16.8
15.9
16.1
16.3
16.2
14.7
14.9
14.8
14.7
15.4
18.9
16
18.7
17.5
17.8
18.5
16.5
15.6167
15.4000
16.0500
15.8667
15.2667
15.2833
15.2833
15.7833
15.3333
15.7333
15.4667
15.9167
15.2167
15.4833
15.8500
15.9333
16.4000
18.1333
17.3500
18.7000
18.6667
14.6500
17.5500
17.7333
388.6667
R
2
1.7
0.9
4.3
3.3
2.2
1.6
3.6
1.9
2.3
2
1.6
1.4
2
3.6
2.2
1.9
2.1
7.6
6.3
3.3
6
5.1
2.2
71.1
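Using
\[ \bar{\bar{X}} = \frac{\sum \bar{X}_i}{k} = \frac{15.6167 + 15.4000 + \cdots + 17.7333}{24} = \frac{388.6667}{24} = 16.1944 \]
Also,
\[ \bar{R} = \frac{\sum R_i}{k} = \frac{2 + 1.7 + \cdots + 2.2}{24} = \frac{71.1}{24} = 2.9625 \]
The mean and range for each sample are shown in the last two columns of the table. The grand
mean and the mean range are computed. With this information, UCL and LCL for $\bar{X}$ can be determined.
Since each sample size n = 6, Table C in Appendix I reveals A2 to be 0.483.
Then,
\[ UCL = \bar{\bar{X}} + A_2\bar{R} = 16.1944 + (0.483)(2.9625) = 17.63 \]
\[ LCL = \bar{\bar{X}} - A_2\bar{R} = 16.1944 - (0.483)(2.9625) = 14.76 \]
Figure 3, which was produced using SPSS, is the control chart for Okraku. Notice that the means of
subgroups 18, 20 and 21 reveal that the process is out of control: the means have increased to
levels exceeding the UCL, indicating the presence of assignable cause variation. Against the
hand-computed limits above, the mean of subgroup 24 (17.73) also exceeds the UCL and that of
subgroup 22 (14.65) falls below the LCL.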
Perhaps over time the machines producing the computer parts have suffered unusual wear,
resulting in improper performance.
Figure 3: Xbar Chart of Frame of Desktop Computers (SPSS output: UCL = 17.991, centre line = 16.194, LCL = 14.398; vertical axis: sample mean; horizontal axis: sample number 1 to 24)
Or the variation might have been caused by the introduction of inferior raw materials obtained from
a new supplier around the time sample 18 was taken. In any event, Okraku must locate and correct
the cause of the unacceptable variation.
Estimating Process Capability: The $\bar{X}$ and R charts provide information about the performance or
process capability of the process. From the $\bar{X}$ chart, we may estimate the mean size of frame for
desktop computers as $\bar{\bar{X}}$ = 388.6667/24 = 16.1944. The process standard deviation may be
estimated using equation (2.10); that is,
\[ \hat{\sigma} = \frac{\bar{R}}{d_2} = \frac{2.9625}{2.534} = 1.1691 \approx 1.17 \]
where the value of d2 for samples of size six is found in Appendix Table VI. The specification
limits on size of frame for desktop computers are 16.32 ± 0.5. Assuming that size of frame for
desktop computers is a normally distributed random variable, with mean 16.1944 and standard
deviation 1.1691, we may estimate the fraction of frames that conform to the specification as
\[ P\{15.82 < X < 16.82\} = \Phi\left(\frac{16.82 - 16.1944}{1.1691}\right) - \Phi\left(\frac{15.82 - 16.1944}{1.1691}\right) \]
\[ = \Phi(0.54) - \Phi(-0.32) = 0.7054 - 0.3745 = 0.3309 \]
so that the estimated fraction of nonconforming frames is p = 1 − 0.3309 = 0.6691.
Another way to express process capability is in terms of the process capability ratio (PCR) Cp,
which for a quality characteristic with both upper and lower specification limits (USL and LSL,
respectively) is
\[ C_p = \frac{USL - LSL}{6\sigma} \]
Note that the 6σ spread of the process is the basic definition of process capability. Since σ is
usually unknown, we must replace it with an estimate. We frequently use $\hat{\sigma} = \bar{R}/d_2$ as
an estimate of σ, resulting in an estimate $\hat{C}_p$ of Cp. For size of frame for desktop computers,
since $\hat{\sigma} = \bar{R}/d_2$ = 1.1691, we find that
\[ \hat{C}_p = \frac{16.82 - 15.82}{6(1.1691)} = 0.1426 \]
This implies that the “natural” tolerance limits in the process (three-sigma above and below the
mean) are outside the lower and upper specification limits. Consequently, a large number of
nonconforming frames of desktop computers will be produced. The PCR Cp may be
interpreted another way. The quantity
\[ P = \left(\frac{1}{C_p}\right)100\% \]
is simply the percentage of the specification band that the process uses up. For the frame of
desktop computers an estimate of P is
\[ \hat{P} = \left(\frac{1}{\hat{C}_p}\right)100\% = \left(\frac{1}{0.1426}\right)100\% \approx 701\% \]
That is, the process uses up about 701% of the specification band.
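The capability figures can be recomputed from the chart totals with the sketch below. It rests on stated assumptions: the grand mean is taken as 388.6667/24 and the mean range as 71.1/24 from the table totals, d2 = 2.534 is the standard tabulated constant for samples of size six, and the specification limits are 16.32 ± 0.5 as given in the text. The helper `phi` (the standard normal distribution function) is an illustrative name.

```python
import math

def phi(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

grand_mean = 388.6667 / 24   # sum of the 24 sample means
r_bar = 71.1 / 24            # sum of the 24 sample ranges
D2_N6 = 2.534                # tabulated d2 for n = 6 (assumed standard value)
lsl, usl = 15.82, 16.82      # specification limits 16.32 +/- 0.5

sigma_hat = r_bar / D2_N6
cp = (usl - lsl) / (6 * sigma_hat)
in_spec = phi((usl - grand_mean) / sigma_hat) - phi((lsl - grand_mean) / sigma_hat)
nonconforming = 1 - in_spec
band_used_percent = 100 / cp

print(round(sigma_hat, 4), round(cp, 4))
print(round(nonconforming, 3), round(band_used_percent, 1))
```

Because `phi` is evaluated at unrounded z-values, the fractions differ slightly in the third decimal place from values read off a two-decimal normal table.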
Interpretation of $\bar{X}$ and R Charts
Note that, a control chart can indicate an out-of-control condition even though no single point plots
outside the control limits, if the pattern of the plotted points exhibits nonrandom or systematic
behaviour. In many cases, the pattern of the plotted points will provide useful diagnostic
information on the process, and this information can be used to make process modifications that
reduce variability (the goal of statistical process control). Furthermore, these patterns occur fairly
often in phase I (retrospective study of past data), and their elimination is frequently crucial in
bringing a process into control. In this section, we briefly discuss interpretation of control charts,
focusing on some of the more common patterns that appear on $\bar{X}$ and R charts and some of the
process characteristics that may produce the patterns. To effectively interpret $\bar{X}$ and R charts, the
analyst must be familiar with both the statistical principles underlying the control chart and the
process itself. Additional information on the interpretation of patterns on control charts is in the
Western Electric Statistical Quality Control Handbook (1956, pp. 149–183).
In interpreting patterns on the $\bar{X}$ chart, we must first determine whether or not the R chart is in
control. Some assignable causes show up on both the $\bar{X}$ and R charts. If both the $\bar{X}$ and R charts
exhibit a nonrandom pattern, the best strategy is to eliminate the R chart assignable causes first. In
many cases, this will automatically eliminate the nonrandom pattern on the $\bar{X}$ chart. Never attempt
to interpret the $\bar{X}$ chart when the R chart indicates an out-of-control condition.
Cyclic patterns occasionally appear on the control chart. Such a pattern on the $\bar{X}$ chart may result
from systematic environmental changes such as temperature, operator fatigue, regular rotation of
operators and/or machines, or fluctuation in voltage or pressure or some other variable in the
production equipment. R charts will sometimes reveal cycles because of maintenance schedules,
operator fatigue, or tool wear resulting in excessive variability.
A Mixture is indicated when the plotted points tend to fall near or slightly outside the control
limits, with relatively few points near the center line. A mixture pattern is generated by two (or
more) overlapping distributions generating the process output. The severity of the mixture pattern
depends on the extent to which the distributions overlap. Sometimes mixtures result from “over
control,” where the operators make process adjustments too often, responding to random variation
in the output rather than systematic causes. A mixture pattern can also occur when output product
from several sources (such as parallel machines) is fed into a common stream which is then
sampled for process monitoring purposes.
A shift in process level. These shifts may result from the introduction of new workers; changes in
methods, raw materials, or machines; a change in the inspection method or standards; or a change
in either the skill, attentiveness, or motivation of the operators. Sometimes an improvement in
process performance is noted following introduction of a control chart program, simply because
of motivational factors influencing the workers.
A trend, or continuous movement in one direction. Trends are usually due to a gradual wearing
out or deterioration of a tool or some other critical process component. In chemical processes they
often occur because of settling or separation of the components of a mixture. They can also result
from human causes, such as operator fatigue or the presence of supervision. Finally, trends can
result from seasonal influences, such as temperature. When trends are due to tool wear or other
systematic causes of deterioration, this may be directly incorporated into the control chart model.
A device useful for monitoring and analyzing processes with trends is the regression control chart;
see Mandel (1969). The modified control chart, discussed in Chapter 9, is also used when the process
exhibits tool wear.
In interpreting patterns on the $\bar{X}$ and R charts, one should consider the two charts jointly. If the
underlying distribution is normal, then the random variables $\bar{X}$ and R computed from the same
sample are statistically independent. Therefore, $\bar{X}$ and R should behave independently on the
control chart. If there is correlation between the $\bar{X}$ and R values, that is, if the points on the two
charts “follow” each other, then this indicates that the underlying distribution is skewed. If
specifications have been determined assuming normality, then those analyses may be in error.
Example 3
With reference to Example 1, how is a range chart or R-Chart constructed?
Solution
The first step is to find the mean range, $\bar{R}$. This is done by first finding the range, R, of each
sample, as shown in the last column of the solution to Example 1. The mean range, $\bar{R}$, is 3.33,
found by (2 + 4 + 3 + 3 + 5 + 3)/6 = 20/6 = 3.33. Referring to the Appendix for D3 and D4 and a
sample size of 5, D3 = 0 and D4 = 2.115. Determining the lower and upper control limits for the
range chart:
\[ LCL = D_3\bar{R} = (0)(3.33) = 0 \]
\[ UCL = D_4\bar{R} = 2.115(3.33) = 7.043 \]
Figure 4: R-chart for the rope data (UCL = 7.043, CL = 3.33)
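The range chart limits above can be checked with a short sketch. The function name `r_chart` is illustrative; D3 = 0 and D4 = 2.115 are the tabulated constants quoted in the solution. With the unrounded mean range (20/6) the UCL evaluates to about 7.05 rather than 7.043.

```python
from statistics import mean

D3_N5, D4_N5 = 0.0, 2.115  # tabulated constants for samples of size 5

ranges = [2, 4, 3, 3, 5, 3]  # sample ranges from Example 1

def r_chart(ranges, d3, d4):
    """Return centre line, LCL, UCL and out-of-control sample numbers."""
    r_bar = mean(ranges)
    lcl = max(0.0, d3 * r_bar)   # a negative lower limit is taken as zero
    ucl = d4 * r_bar
    out = [i + 1 for i, r in enumerate(ranges) if r > ucl or r < lcl]
    return r_bar, lcl, ucl, out

r_bar, lcl, ucl, out = r_chart(ranges, D3_N5, D4_N5)
print(round(r_bar, 2), lcl, round(ucl, 2), out)
```

All six ranges lie inside the limits, so the variability within samples shows no out-of-control signal even though the mean chart does.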
Example 4
A quality control inspector at the Cocoa Fizz soft drink company has taken twenty-five samples
with four observations each of the volume of bottles filled. The data and the computed means are
shown in the table. If the standard deviation of the bottling operation is 0.14 ounces, use this
information to develop control limits of three standard deviations for the bottling operation.

Sample   Observations (bottle volume in ounces)   Average   Range
number   1        2        3        4             x         R
1        15.85    16.02    15.83    15.93         15.91     0.19
2        16.12    16.00    15.85    16.01         15.99     0.27
3        16.00    15.91    15.94    15.83         15.92     0.17
4        16.20    15.85    15.74    15.93         15.93     0.46
5        15.74    15.86    16.21    16.10         15.98     0.47
6        15.94    16.01    16.14    16.03         16.03     0.20
7        15.75    16.21    16.01    15.86         15.96     0.46
8        15.82    15.94    16.02    15.94         15.93     0.20
9        16.04    15.98    15.83    15.98         15.96     0.21
10       15.64    15.86    15.94    15.89         15.83     0.30
11       16.11    16.00    16.01    15.82         15.99     0.29
12       15.72    15.85    16.12    16.15         15.96     0.43
13       15.85    15.76    15.74    15.98         15.83     0.24
14       15.73    15.84    15.96    16.10         15.91     0.37
15       16.20    16.01    16.10    15.89         16.05     0.31
16       16.12    16.08    15.83    15.94         15.99     0.29
17       16.01    15.93    15.81    15.68         15.86     0.33
18       15.78    16.04    16.11    16.12         16.01     0.34
19       15.84    15.92    16.05    16.12         15.98     0.28
20       15.92    16.09    16.12    15.93         16.02     0.20
21       16.11    16.02    16.00    15.88         16.00     0.23
22       15.98    15.82    15.89    15.89         15.90     0.16
23       16.05    15.73    15.73    15.93         15.86     0.32
24       16.01    16.01    15.89    15.86         15.94     0.15
25       16.08    15.78    15.92    15.98         15.94     0.30
Total                                             398.75    7.17
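Central line = $d_2\sigma$
= 2.059(0.14 ounces)
= 0.2882 ounces
Alternatively, we can use the mean range of the samples, which gives nearly the same figure:
\[ \bar{R} = \frac{\sum R_i}{k} = \frac{0.19 + 0.27 + \cdots + 0.30}{25} = \frac{7.17}{25} = 0.2868 \]
Taking the central line 0.2882 as $\bar{R}$:
Upper Control Limit (UCL) = $D_4\bar{R}$
= 2.282(0.2882)
= 0.6577
Lower Control Limit (LCL) = $D_3\bar{R}$
= 0(0.2882)
= 0
Figure 5: R-chart for the bottling operation (UCL = 0.6577, CL = 0.2882)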
Example 4.5
A production manager at a tire manufacturing plant has inspected the number of defective tires in
twenty random samples with twenty observations each. Following are the number of defective
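tires found in each sample. Construct a three-sigma control chart with this information.

Sample   Number of         Number of observations   Fraction
number   defective tires   sampled                  defective
1        3                 20                       0.15
2        2                 20                       0.10
3        1                 20                       0.05
4        2                 20                       0.10
5        1                 20                       0.05
6        3                 20                       0.15
7        3                 20                       0.15
8        2                 20                       0.10
9        1                 20                       0.05
10       2                 20                       0.10
11       3                 20                       0.15
12       2                 20                       0.10
13       2                 20                       0.10
14       1                 20                       0.05
15       1                 20                       0.05
16       2                 20                       0.10
17       4                 20                       0.20
18       3                 20                       0.15
19       1                 20                       0.05
20       1                 20                       0.05
Total                                               2.00

\[ \bar{p} = \frac{\sum p_i}{k} = \frac{0.15 + 0.10 + \cdots + 0.05}{20} = \frac{2.00}{20} = 0.10 \]
\[ UCL = \bar{p} + 3\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} = 0.10 + 3\sqrt{\frac{(0.10)(0.90)}{20}} = 0.10 + 3(0.067) = 0.301 \]
\[ LCL = \bar{p} - 3\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} = 0.10 - 3\sqrt{\frac{(0.10)(0.90)}{20}} = 0.10 - 3(0.067) = -0.101 \approx 0 \]
Since a fraction defective cannot be negative, the LCL is taken as zero.
Figure 6: p-chart for the tire data (UCL = 0.30, CL = 0.1)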
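The three-sigma limits for the tire data can be verified as follows. This is a sketch with an illustrative function name (`p_chart_limits`); it assumes a constant sample size, as in the example, and sets a negative lower limit to zero.

```python
import math

def p_chart_limits(defectives, n):
    """Centre line and three-sigma limits for a p-chart with constant sample size n."""
    p_bar = sum(defectives) / (len(defectives) * n)
    sigma_p = math.sqrt(p_bar * (1 - p_bar) / n)
    ucl = p_bar + 3 * sigma_p
    lcl = max(0.0, p_bar - 3 * sigma_p)  # a fraction cannot be negative
    return p_bar, lcl, ucl

tires = [3, 2, 1, 2, 1, 3, 3, 2, 1, 2, 3, 2, 2, 1, 1, 2, 4, 3, 1, 1]
p_bar, lcl, ucl = p_chart_limits(tires, n=20)

fractions = [d / 20 for d in tires]
out = [i + 1 for i, p in enumerate(fractions) if p > ucl or p < lcl]
print(round(p_bar, 2), lcl, round(ucl, 3), out)
```

No sample fraction exceeds the upper limit, so this sketch reports no out-of-control samples for the tire data.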
Example 4.6
Frozen orange juice concentrate is packed in 6-oz cardboard cans. These cans are formed on a
machine by spinning them from cardboard stock and attaching a metal bottom panel. By inspection
of a can, we may determine whether, when filled, it could possibly leak either on the side seam or
around the bottom joint. Such a nonconforming can has an improper seal on either the side seam
or the bottom panel. Set up a control chart to improve the fraction of nonconforming cans
produced by this machine. Thirty samples of n = 50 cans each were selected at half-hour intervals
over a three-shift period in which the machine was in continuous operation. The data are shown in
Table 7.
Table 7
Data for Trial Control Limits, Sample Size n = 50

Sample   Number of nonconforming   Number of observations   Sample fraction
number   cans, Di                  sampled                  nonconforming, pi
1        12                        50                       0.24
2        15                        50                       0.30
3        8                         50                       0.16
4        10                        50                       0.20
5        4                         50                       0.08
6        7                         50                       0.14
7        16                        50                       0.32
8        9                         50                       0.18
9        14                        50                       0.28
10       10                        50                       0.20
11       5                         50                       0.10
12       6                         50                       0.12
13       17                        50                       0.34
14       12                        50                       0.24
15       22                        50                       0.44
16       8                         50                       0.16
17       10                        50                       0.20
18       5                         50                       0.10
19       13                        50                       0.26
20       11                        50                       0.22
21       20                        50                       0.40
22       18                        50                       0.36
23       24                        50                       0.48
24       15                        50                       0.30
25       9                         50                       0.18
26       12                        50                       0.24
27       7                         50                       0.14
28       13                        50                       0.26
29       9                         50                       0.18
30       6                         50                       0.12
Total    347                                                6.94
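\[ \bar{p} = \frac{\sum p_i}{k} = \frac{0.24 + 0.30 + \cdots + 0.12}{30} = \frac{6.94}{30} = 0.2313 \]
\[ UCL = \bar{p} + 3\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} = 0.2313 + 3\sqrt{\frac{(0.2313)(0.7687)}{50}} = 0.2313 + 3(0.0596) = 0.4102 \]
\[ LCL = \bar{p} - 3\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} = 0.2313 - 3\sqrt{\frac{(0.2313)(0.7687)}{50}} = 0.2313 - 3(0.0596) = 0.0524 \]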
The control chart with center line at $\bar{p}$ = 0.2313 and the above upper and lower control limits is
shown in Fig. 7. The sample fraction nonconforming from each preliminary sample is plotted on this
chart. We note that two points, those from samples 15 and 23, plot above the upper control limit, so
the process is not in control. These points must be investigated to see whether an assignable cause
can be determined.
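The trial limits, and the revision step described earlier for p-charts (dropping points outside the limits and recomputing the mean fraction), can be sketched as follows. The function name `limits` is illustrative, and the sketch performs a single revision pass only; whether a point should really be dropped depends on finding an assignable cause, which code cannot decide.

```python
import math

N = 50  # cans per sample
counts = [12, 15, 8, 10, 4, 7, 16, 9, 14, 10,
          5, 6, 17, 12, 22, 8, 10, 5, 13, 11,
          20, 18, 24, 15, 9, 12, 7, 13, 9, 6]

def limits(counts, n):
    """Centre line and three-sigma p-chart limits for constant sample size n."""
    p_bar = sum(counts) / (len(counts) * n)
    half_width = 3 * math.sqrt(p_bar * (1 - p_bar) / n)
    return p_bar, max(0.0, p_bar - half_width), p_bar + half_width

# Trial limits from all 30 samples
p_bar, lcl, ucl = limits(counts, N)
flagged = [i + 1 for i, d in enumerate(counts) if not lcl <= d / N <= ucl]

# One revision pass: recompute using only the samples inside the trial limits
kept = [d for i, d in enumerate(counts) if i + 1 not in flagged]
p_rev, lcl_rev, ucl_rev = limits(kept, N)

print(round(p_bar, 4), round(lcl, 4), round(ucl, 4), flagged)
print(round(p_rev, 4), round(lcl_rev, 4), round(ucl_rev, 4))
```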
Figure 7: p-chart for the trial control limits (UCL = 0.41, CL = 0.23)
TRY
Question 1
Ten samples, each of 50 items, were taken from a given production process and the number of
defectives in each Sample was recorded as follows:
1, 1, 1, 2, 0, 4, 2, 0, 2, 2,
Draw the control chart for fraction defective and plot the points on it. Comment on the state
of the process.
Question 2
A typist has been given a new electric typewriter in place of the old manual one. After a week’s
time the typist, while reading proofs, finds that the number of errors on the last 10 consecutive pages
has been 7, 9, 3, 5, 5, 6, 3, 1, 0, and 0. With joy the typist announces, “I have got my typing under
control again. My object was to get my average errors per page down to zero and I have done it.”
Comment.
Question 3
Construct an $\bar{X}$-chart and R-chart from the following information and state whether the
process concerned is in control.

Sample           1    2    3    4    5    6    7    8    9    10
Mean ($\bar{X}$)   20   34   45   39   26   29   12   34   37   23
Range (R)        23   39   14   5    20   17   21   11   40   10
Question 4
A machine is set to deliver packets of given weights. Ten samples of size 5 each were recorded.
Below are the given relevant data.
Sample           1    2    3    4    5    6    7    8    9    10
Mean ($\bar{X}$)   15   17   15   18   17   14   18   15   17   16
Range (R)        7    7    4    9    8    7    12   4    11   5
Calculate the values of the central line and the control limits for mean chart and the range chart
and then comment on the state of control. Conversion factors for n = 5 are A2=0.58 and D4=2.115.
Question 5
Six samples of emergency room bills were selected. The data are shown here. Construct and
analyze an $\bar{X}$ chart and an R chart for the data.

Sample   Emergency room bill ($)
1        82    95    86    97    93
2        84    90    99    110   116
3        53    62    43    55    58
4        97    89    90    100   102
5        88    84    87    87    82
6        91    93    95    99    86
Question 6
Five samples of shafts for miniature motors were selected and their diameters measured. The data
(in inches) are shown here. Construct and analyze an $\bar{X}$ chart and an R chart for them.

Sample   Shaft diameters (inches)
1        1.56   1.54   1.55   1.59
2        1.52   1.55   1.50   1.56
3        1.49   1.48   1.51   1.50
4        1.43   1.50   1.56   1.51
5        1.51   1.56   1.49   1.52
Question 7
Six men's suits are checked for defects in the materials and seams. The number of defects for each
is shown here. Construct and analyze a c chart for the data.

Suit             1   2   3   4   5   6
No. of defects   3   6   9   5   7   8
Question 8
Eight samples of water pumps are selected and tested for leaks. Those that leaked are considered
defective. Construct and analyze a p chart for the data shown here.

Sample   Size   Number of defectives
1        10     6
2        10     0
3        10     2
4        10     1
5        10     3
6        10     2
7        10     1
8        10     0
APPENDIX 1 (STATISTICAL TABLES)
A. STANDARD NORMAL DISTRIBUTION: Table Values Represent AREA to the LEFT of
the Z score.
Z   .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
0.0 .50000 .50399 .50798 .51197 .51595 .51994 .52392 .52790 .53188 .53586
0.1 .53983 .54380 .54776 .55172 .55567 .55962 .56356 .56749 .57142 .57535
0.2 .57926 .58317 .58706 .59095 .59483 .59871 .60257 .60642 .61026 .61409
0.3 .61791 .62172 .62552 .62930 .63307 .63683 .64058 .64431 .64803 .65173
0.4 .65542 .65910 .66276 .66640 .67003 .67364 .67724 .68082 .68439 .68793
0.5 .69146 .69497 .69847 .70194 .70540 .70884 .71226 .71566 .71904 .72240
0.6 .72575 .72907 .73237 .73565 .73891 .74215 .74537 .74857 .75175 .75490
0.7 .75804 .76115 .76424 .76730 .77035 .77337 .77637 .77935 .78230 .78524
0.8 .78814 .79103 .79389 .79673 .79955 .80234 .80511 .80785 .81057 .81327
0.9 .81594 .81859 .82121 .82381 .82639 .82894 .83147 .83398 .83646 .83891
1.0 .84134 .84375 .84614 .84849 .85083 .85314 .85543 .85769 .85993 .86214
1.1 .86433 .86650 .86864 .87076 .87286 .87493 .87698 .87900 .88100 .88298
1.2 .88493 .88686 .88877 .89065 .89251 .89435 .89617 .89796 .89973 .90147
1.3 .90320 .90490 .90658 .90824 .90988 .91149 .91309 .91466 .91621 .91774
1.4 .91924 .92073 .92220 .92364 .92507 .92647 .92785 .92922 .93056 .93189
1.5 .93319 .93448 .93574 .93699 .93822 .93943 .94062 .94179 .94295 .94408
1.6 .94520 .94630 .94738 .94845 .94950 .95053 .95154 .95254 .95352 .95449
1.7 .95543 .95637 .95728 .95818 .95907 .95994 .96080 .96164 .96246 .96327
1.8 .96407 .96485 .96562 .96638 .96712 .96784 .96856 .96926 .96995 .97062
1.9 .97128 .97193 .97257 .97320 .97381 .97441 .97500 .97558 .97615 .97670
2.0 .97725 .97778 .97831 .97882 .97932 .97982 .98030 .98077 .98124 .98169
2.1 .98214 .98257 .98300 .98341 .98382 .98422 .98461 .98500 .98537 .98574
2.2 .98610 .98645 .98679 .98713 .98745 .98778 .98809 .98840 .98870 .98899
2.3 .98928 .98956 .98983 .99010 .99036 .99061 .99086 .99111 .99134 .99158
2.4 .99180 .99202 .99224 .99245 .99266 .99286 .99305 .99324 .99343 .99361
2.5 .99379 .99396 .99413 .99430 .99446 .99461 .99477 .99492 .99506 .99520
2.6 .99534 .99547 .99560 .99573 .99585 .99598 .99609 .99621 .99632 .99643
2.7 .99653 .99664 .99674 .99683 .99693 .99702 .99711 .99720 .99728 .99736
2.8 .99744 .99752 .99760 .99767 .99774 .99781 .99788 .99795 .99801 .99807
2.9 .99813 .99819 .99825 .99831 .99836 .99841 .99846 .99851 .99856 .99861
3.0 .99865 .99869 .99874 .99878 .99882 .99886 .99889 .99893 .99896 .99900
3.1 .99903 .99906 .99910 .99913 .99916 .99918 .99921 .99924 .99926 .99929
3.2 .99931 .99934 .99936 .99938 .99940 .99942 .99944 .99946 .99948 .99950
3.3 .99952 .99953 .99955 .99957 .99958 .99960 .99961 .99962 .99964 .99965
3.4 .99966 .99968 .99969 .99970 .99971 .99972 .99973 .99974 .99975 .99976
3.5 .99977 .99978 .99978 .99979 .99980 .99981 .99981 .99982 .99983 .99983
3.6 .99984 .99985 .99985 .99986 .99986 .99987 .99987 .99988 .99988 .99989
3.7 .99989 .99990 .99990 .99990 .99991 .99991 .99992 .99992 .99992 .99992
3.8 .99993 .99993 .99993 .99994 .99994 .99994 .99994 .99995 .99995 .99995
3.9 .99995 .99995 .99996 .99996 .99996 .99996 .99996 .99996 .99997 .99997
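Entries of Table A can be checked numerically. The sketch below assumes the standard identity Φ(z) = ½[1 + erf(z/√2)] for the area to the left of z and uses only Python's `math` module. The function name `phi` is illustrative.

```python
import math


def phi(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Compare a few computed values with the table (row 1.0, column .00, etc.).
for z in (0.0, 1.0, 1.96, 2.5):
    print(z, round(phi(z), 5))
```

Areas to the right of z, or between two z scores, follow by subtraction: 1 − Φ(z) and Φ(b) − Φ(a).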
B. t-DISTRIBUTION TABLE
t distribution critical values
Upper-tail probability p
df     .25    .20    .15    .10    .05    .025   .02    .01    .005   .0025  .001   .0005
1      1.000  1.376  1.963  3.078  6.314  12.71  15.89  31.82  63.66  127.3  318.3  636.6
2      0.816  1.061  1.386  1.886  2.920  4.303  4.849  6.965  9.925  14.09  22.33  31.60
3      0.765  0.978  1.250  1.638  2.353  3.182  3.482  4.541  5.841  7.453  10.21  12.92
4      0.741  0.941  1.190  1.533  2.132  2.776  2.999  3.747  4.604  5.598  7.173  8.610
5      0.727  0.920  1.156  1.476  2.015  2.571  2.757  3.365  4.032  4.773  5.893  6.869
6      0.718  0.906  1.134  1.440  1.943  2.447  2.612  3.143  3.707  4.317  5.208  5.959
7      0.711  0.896  1.119  1.415  1.895  2.365  2.517  2.998  3.499  4.029  4.785  5.408
8      0.706  0.889  1.108  1.397  1.860  2.306  2.449  2.896  3.355  3.833  4.501  5.041
9      0.703  0.883  1.100  1.383  1.833  2.262  2.398  2.821  3.250  3.690  4.297  4.781
10     0.700  0.879  1.093  1.372  1.812  2.228  2.359  2.764  3.169  3.581  4.144  4.587
11     0.697  0.876  1.088  1.363  1.796  2.201  2.328  2.718  3.106  3.497  4.025  4.437
12     0.695  0.873  1.083  1.356  1.782  2.179  2.303  2.681  3.055  3.428  3.930  4.318
13     0.694  0.870  1.079  1.350  1.771  2.160  2.282  2.650  3.012  3.372  3.852  4.221
14     0.692  0.868  1.076  1.345  1.761  2.145  2.264  2.624  2.977  3.326  3.787  4.140
15     0.691  0.866  1.074  1.341  1.753  2.131  2.249  2.602  2.947  3.286  3.733  4.073
16     0.690  0.865  1.071  1.337  1.746  2.120  2.235  2.583  2.921  3.252  3.686  4.015
17     0.689  0.863  1.069  1.333  1.740  2.110  2.224  2.567  2.898  3.222  3.646  3.965
18     0.688  0.862  1.067  1.330  1.734  2.101  2.214  2.552  2.878  3.197  3.611  3.922
19     0.688  0.861  1.066  1.328  1.729  2.093  2.205  2.539  2.861  3.174  3.579  3.883
20     0.687  0.860  1.064  1.325  1.725  2.086  2.197  2.528  2.845  3.153  3.552  3.850
21     0.686  0.859  1.063  1.323  1.721  2.080  2.189  2.518  2.831  3.135  3.527  3.819
22     0.686  0.858  1.061  1.321  1.717  2.074  2.183  2.508  2.819  3.119  3.505  3.792
23     0.685  0.858  1.060  1.319  1.714  2.069  2.177  2.500  2.807  3.104  3.485  3.768
24     0.685  0.857  1.059  1.318  1.711  2.064  2.172  2.492  2.797  3.091  3.467  3.745
25     0.684  0.856  1.058  1.316  1.708  2.060  2.167  2.485  2.787  3.078  3.450  3.725
26     0.684  0.856  1.058  1.315  1.706  2.056  2.162  2.479  2.779  3.067  3.435  3.707
27     0.684  0.855  1.057  1.314  1.703  2.052  2.158  2.473  2.771  3.057  3.421  3.690
28     0.683  0.855  1.056  1.313  1.701  2.048  2.154  2.467  2.763  3.047  3.408  3.674
29     0.683  0.854  1.055  1.311  1.699  2.045  2.150  2.462  2.756  3.038  3.396  3.659
30     0.683  0.854  1.055  1.310  1.697  2.042  2.147  2.457  2.750  3.030  3.385  3.646
40     0.681  0.851  1.050  1.303  1.684  2.021  2.123  2.423  2.704  2.971  3.307  3.551
50     0.679  0.849  1.047  1.299  1.676  2.009  2.109  2.403  2.678  2.937  3.261  3.496
60     0.679  0.848  1.045  1.296  1.671  2.000  2.099  2.390  2.660  2.915  3.232  3.460
80     0.678  0.846  1.043  1.292  1.664  1.990  2.088  2.374  2.639  2.887  3.195  3.416
100    0.677  0.845  1.042  1.290  1.660  1.984  2.081  2.364  2.626  2.871  3.174  3.390
1000   0.675  0.842  1.037  1.282  1.646  1.962  2.056  2.330  2.581  2.813  3.098  3.300
z*     0.674  0.841  1.036  1.282  1.645  1.960  2.054  2.326  2.576  2.807  3.091  3.291
       50%    60%    70%    80%    90%    95%    96%    98%    99%    99.5%  99.8%  99.9%
Confidence level C
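Critical values for degrees of freedom or tail probabilities not listed in Table B can be approximated numerically. The following is a rough sketch only, not a library-grade routine: it integrates the t density with Simpson's rule and locates the critical value by bisection. The function names, the step count and the search bracket [0, 50] are all illustrative assumptions, and the bracket is too narrow for the most extreme df = 1 entries.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))


def upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def t_critical(p, df, lo=0.0, hi=50.0):
    """Value t* with upper-tail probability p, found by bisection on [lo, hi]."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if upper_tail(mid, df) > p:
            lo = mid  # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2


print(round(t_critical(0.025, 10), 3))
print(round(t_critical(0.05, 20), 3))
```

The two printed values can be compared with the table entries for df = 10, p = .025 and df = 20, p = .05.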
C. Table of Control Chart Constants
                             X̄- and R-Charts         X̄- and S-Charts
                             _____________________   _____________________
n    d2     d3      c4       A2     D3     D4        A3     B3     B4
2    1.128  0.8525  0.7979   1.880  —      3.267     2.659  —      3.267
3    1.693  0.8884  0.8862   1.023  —      2.574     1.954  —      2.568
4    2.059  0.8798  0.9213   0.729  —      2.282     1.628  —      2.266
5    2.326  0.8641  0.9400   0.577  —      2.114     1.427  —      2.089
6    2.534  0.8480  0.9515   0.483  —      2.004     1.287  0.030  1.970
7    2.704  0.8332  0.9594   0.419  0.076  1.924     1.182  0.118  1.882
8    2.847  0.8198  0.9650   0.373  0.136  1.864     1.099  0.185  1.815
9    2.970  0.8078  0.9693   0.337  0.184  1.816     1.032  0.239  1.761
10   3.078  0.7971  0.9727   0.308  0.223  1.777     0.975  0.284  1.716
11   3.173  0.7873  0.9754   0.285  0.256  1.744     0.927  0.321  1.679
12   3.258  0.7785  0.9776   0.266  0.283  1.717     0.886  0.354  1.646
13   3.336  0.7704  0.9794   0.249  0.307  1.693     0.850  0.382  1.618
14   3.407  0.7630  0.9810   0.235  0.328  1.672     0.817  0.406  1.594
15   3.472  0.7562  0.9823   0.223  0.347  1.653     0.789  0.428  1.572
16   3.532  0.7499  0.9835   0.212  0.363  1.637     0.763  0.448  1.552
17   3.588  0.7441  0.9845   0.203  0.378  1.622     0.739  0.466  1.534
18   3.640  0.7386  0.9854   0.194  0.391  1.608     0.718  0.482  1.518
19   3.689  0.7335  0.9862   0.187  0.403  1.597     0.698  0.497  1.503
20   3.735  0.7287  0.9869   0.180  0.415  1.585     0.680  0.510  1.490
21   3.778  0.7242  0.9876   0.173  0.425  1.575     0.663  0.523  1.477
22   3.819  0.7199  0.9882   0.167  0.434  1.566     0.647  0.534  1.466
23   3.858  0.7159  0.9887   0.162  0.443  1.557     0.633  0.545  1.455
24   3.895  0.7121  0.9892   0.157  0.451  1.548     0.619  0.555  1.445
25   3.931  0.7084  0.9896   0.153  0.459  1.541     0.606  0.565  1.435
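Several columns of Table C are linked by standard relations, which gives a quick way to check an entry. The sketch below assumes the usual 3-sigma definitions: c4 = √(2/(n − 1)) · Γ(n/2)/Γ((n − 1)/2), A3 = 3/(c4√n), B3 and B4 = 1 ∓ 3√(1 − c4²)/c4, A2 = 3/(d2√n), and D3 and D4 = 1 ∓ 3·d3/d2, with negative lower factors replaced by 0 (the dashes in the table). The function names are illustrative, and d2 and d3 are taken from the table rather than computed.

```python
import math


def c4(n):
    """Bias-correction constant c4 for the sample standard deviation."""
    return math.sqrt(2.0 / (n - 1)) * math.exp(
        math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    )


def s_chart_constants(n):
    """Return (A3, B3, B4) for x-bar and S charts with subgroup size n."""
    c = c4(n)
    spread = 3.0 * math.sqrt(1.0 - c * c) / c
    return 3.0 / (c * math.sqrt(n)), max(0.0, 1.0 - spread), 1.0 + spread


def r_chart_constants(n, d2, d3):
    """Return (A2, D3, D4) for x-bar and R charts, given tabulated d2 and d3."""
    spread = 3.0 * d3 / d2
    return 3.0 / (d2 * math.sqrt(n)), max(0.0, 1.0 - spread), 1.0 + spread


# n = 10 row of the S-chart columns, and n = 4 row of the R-chart columns.
print(round(c4(10), 4), [round(v, 3) for v in s_chart_constants(10)])
print([round(v, 3) for v in r_chart_constants(4, d2=2.059, d3=0.8798)])
```

Small differences in the last decimal place are possible because the tabulated d2 and d3 are themselves rounded.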