Uploaded by maybelle del rosario

Engineering Data Analysis

COUNTING TECHNIQUES
Multiplication Rules - If an operation can be performed in n1 ways, and if for each of these a second
operation can be performed in n2 ways, and for each of the first two a third operation can be
performed in n3 ways, and so forth, then the sequence of k operations can be performed in n1 · n2 · · ·
nk ways.
Example: The design for a website is to use one of the four colors, a font from among three,
and three different positions for an image. Calculate the number of web designs possible.
Solution: From the multiplication rule, 4 × 3 × 3 = 36 web designs are possible.
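The count can be checked by listing every design explicitly. The following is a minimal Python sketch; the labels color1, font1, position1, and so on are placeholders, since the example gives only how many options there are.

```python
from itertools import product

# Placeholder labels (assumption): the example only states 4 colors, 3 fonts, 3 positions.
colors = [f"color{i}" for i in range(1, 5)]
fonts = [f"font{i}" for i in range(1, 4)]
positions = [f"position{i}" for i in range(1, 4)]

# Each design is one (color, font, position) triple.
designs = list(product(colors, fonts, positions))
design_count = len(designs)
print(design_count)
```

Enumerating the triples and multiplying the three counts give the same number, which is what the multiplication rule asserts.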
Permutation –
1. Permutation of Distinct Elements - The number of permutations of n different elements is n!,
where n! = n(n − 1)(n − 2)· · · 3 · 2 · 1
Example: A food company has four different recipes for a potential new product and wishes to
compare them through consumer taste tests. In these tests, a participant is given the four types of
food to taste in a random order and is asked to rank various aspects of their taste. How many
different rankings of the four types of food are possible?
Solution: The four types of food are to be ranked (ordered sequence) by a participant. A
permutation of the four types gives 4! = 4 × 3 × 2 × 1 = 24 different rankings.
2. Permutation of Subsets - The number of permutations of n distinct objects taken r at a time
is nPr = n! / (n − r)!
Example: In one year, three awards (research, teaching, and service) will be given to a class of 25
graduate students in a statistics department. If each student can receive at most one award, how
many possible selections are there?
Solution: Since the awards are distinguishable, it is a permutation problem. The total number of
sample points is 25P3 = 25! / (25 − 3)! = 25! / 22! = 25 · 24 · 23 = 13,800
3. Circular Permutation - The number of permutations of n objects arranged in a circle is (n −
1)!
Example: Eight people are to be seated around a dining table. How many different arrangements are
possible a) with no restrictions; b) if two people insist on sitting next to each other?
Solutions:
a) The total number of arrangements is (8 − 1)! = 7! = 5040
b) We solve this in several steps. The first step is to count the two as one group.
There are 7 units (6 individuals, 1 group) to be arranged around the table. According to the
circular permutation rule, there are (7 − 1)! = 6! = 720 different arrangements. The second
step is to count the number of arrangements of the people within the group. In this case, the
number of ways is 2! = 2. According to the multiplication rule, the total number of
arrangements is 720 × 2 = 1440
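Both answers can be confirmed by brute force. The sketch below is an illustration rather than part of the original solution: it labels the people 0 to 7, pins person 0 to one seat so that rotations of the same seating are not counted twice, and assumes the two people who want to sit together are persons 0 and 1.

```python
from itertools import permutations


def circular_arrangements(n, together=None):
    """Count seatings of people 0..n-1 at a round table, rotations identical.

    Person 0 is pinned to seat 0 to remove rotations. If `together` is a
    pair (a, b), only seatings where a and b occupy neighboring seats count.
    """
    count = 0
    for rest in permutations(range(1, n)):
        seating = (0,) + rest
        if together is not None:
            a, b = together
            gap = abs(seating.index(a) - seating.index(b))
            # Seats are neighbors if they differ by 1, or wrap around the table.
            if gap not in (1, n - 1):
                continue
        count += 1
    return count


unrestricted = circular_arrangements(8)
pair_adjacent = circular_arrangements(8, together=(0, 1))
print(unrestricted, pair_adjacent)
```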
4. Permutation of Similar Objects - The number of permutations of n = n1 + n2 + · · · + nr objects
of which n1 are of one type, n2 are of a second type, . . ., and nr are of an rth type is
n! / (n1! n2! · · · nr!)
Example: Code 39 is a common bar code system that consists of narrow and wide bars (black)
separated by either wide or narrow spaces (white). Each character contains nine elements (five
bars and four spaces). The code for a character starts and ends with a bar (either narrow or wide)
and a (white) space appears between each bar. The original specification (since revised) used
exactly two wide bars and one wide space in each character. For example, if b and B denote narrow
and wide (black) bars, respectively, and w and W denote narrow and wide (white) spaces, a valid
character is bwBwBWbwb (the number 6). One character is held back as a start and stop delimiter.
How many other characters can be coded by this system? Can you explain the name of the system?
Solution: The four white spaces occur between the five black bars. In the first step, focus on the
bars. The number of permutations of five black bars when two are B and three are b is 5! / (2! 3!) =
10. In the second step, consider the white spaces. A code has three narrow spaces w and one wide
space W, so there are 4! / (3! 1!) = 4 possible locations for the wide space. Therefore, the number of
possible codes is 10 × 4 = 40. If one code is held back as a start/stop delimiter, then 39 other
characters can be coded by this system (and the name comes from this result).
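The two-step count can be reproduced by generating the patterns. This is a minimal Python sketch under the original specification described above (two wide bars, one wide space); the helper name interleave is introduced here for illustration only.

```python
from itertools import permutations

# Distinct orderings of 2 wide (B) and 3 narrow (b) bars.
bar_patterns = set(permutations("BBbbb"))
# Distinct orderings of 1 wide (W) and 3 narrow (w) spaces.
space_patterns = set(permutations("Wwww"))


def interleave(bars, spaces):
    """Build a nine-element character: bar, space, bar, ..., bar."""
    out = []
    for i, bar in enumerate(bars):
        out.append(bar)
        if i < len(spaces):
            out.append(spaces[i])
    return "".join(out)


characters = {interleave(b, s) for b in bar_patterns for s in space_patterns}
print(len(bar_patterns), len(space_patterns), len(characters))
```

Using a set removes orderings that differ only by swapping identical symbols, which is the division by 2! 3! and 3! 1! in the formula.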
Partitions – The number of ways of partitioning a set of n objects into r cells with n1 elements in the
first cell, n2 elements in the second, and so forth, is
$\binom{n}{n_1, n_2, \ldots, n_r} = \frac{n!}{n_1!\, n_2! \cdots n_r!}$
Example: In how many ways can 7 graduate students be assigned to 1 triple and 2 double hotel
rooms during a conference?
Solution: The total number of possible partitions would be
$\binom{7}{3, 2, 2} = \frac{7!}{3!\, 2!\, 2!} = 210$
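The partition count is easy to compute directly. A short Python sketch follows; the function name multinomial is an illustrative choice, not something the notes define.

```python
from math import factorial


def multinomial(*cell_sizes):
    """Number of ways to split sum(cell_sizes) objects into cells of the given sizes."""
    result = factorial(sum(cell_sizes))
    for size in cell_sizes:
        # Each intermediate quotient is itself an integer, so // is exact here.
        result //= factorial(size)
    return result


room_assignments = multinomial(3, 2, 2)
print(room_assignments)
```

The same function covers permutations of similar objects: multinomial(2, 3) counts the orderings of two wide and three narrow bars.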
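Combination – The number of combinations of n distinct objects taken r at a time, where order does
not matter, is nCr = n! / (r! (n − r)!)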
Example: A mother-participant samples eight food products and is asked to pick the best, the
second best, and the third best. She buys the three products she likes best to take home as
pasalubong (take-home gifts) for the family. a) How many different rankings are possible? b) How
many different pasalubong are possible?
Solution:
a) Only three of the eight products were ranked. Thus, the number of ways to rank
them is 8P3 = 8! / 5! = 336
b) Since her three best choices are now reclassified as pasalubong, the other 5
products are considered not pasalubong. This is a partition into two cells, or a combination. The
number of ways of choosing a pasalubong of three items is 8C3 = 8! / 3!5! = 56
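Both counts are available in Python's standard library (math.perm and math.comb, Python 3.8 or later). The sketch below only restates the example's numbers; the variable names are illustrative.

```python
from math import comb, factorial, perm

rankings = perm(8, 3)        # ordered choice: best, second best, third best
take_home_sets = comb(8, 3)  # unordered choice of the same three products

# Every unordered set of 3 products can be ranked in 3! ways.
assert rankings == take_home_sets * factorial(3)
print(rankings, take_home_sets)
```

The same call handles the awards example: perm(25, 3) counts the ways to give three distinguishable awards to 25 students.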
DISCRETE: RANDOM VARIABLES AND THEIR PROBABILITY DISTRIBUTIONS
A random variable is a function that assigns a real number to each outcome in the
sample space of a random experiment.
Example: Two balls are drawn in succession without replacement from an urn containing 4 red balls
and 3 black balls. If the random variable Y is the number of red balls, then the possible outcomes
and the corresponding values y of Y are as follows.
Solution:
Outcome    y
RR         2
RB         1
BR         1
BB         0
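The example lists only the values of Y. As an illustrative extension, the probabilities of those values can be obtained by enumerating all equally likely ordered draws. This is a sketch, assuming each ball is equally likely to be drawn at every step.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

balls = ["R"] * 4 + ["B"] * 3

# All ordered pairs of distinct balls: drawing two in succession without replacement.
draws = list(permutations(range(len(balls)), 2))

# Y = number of red balls in the draw.
counts = Counter(sum(balls[i] == "R" for i in draw) for draw in draws)
pmf_y = {y: Fraction(c, len(draws)) for y, c in counts.items()}

for y in sorted(pmf_y):
    print(y, pmf_y[y])
```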
Probability Mass Functions – For a discrete random variable X with possible values x1, x2, . . . , xn, a
probability mass function is a function f(x) such that (1) $f(x_i) \ge 0$ (2) $\sum_{i=1}^{n} f(x_i) = 1$
(3) $f(x_i) = P[X = x_i]$
Example: Let the random variable Y denote the number of semiconductor wafers that need to be
analyzed in order to detect a large particle of contamination. Assume that the probability that a
wafer contains a large particle is 0.01 and that the wafers are independent. Determine the
probability distribution of Y .
Solution: Let c denote a wafer in which a large particle is present, and let a denote a wafer in which
it is absent. The sample space of the experiment is infinite, and it can be represented as all possible
sequences that start with a string of a’s and end with c. That is, S = {c, ac, aac, aaac, aaaac, . . .}
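Consider a few cases. We have P[Y = 1] = P[c] = 0.01. Also, using the independence assumption, P[Y
= 2] = P[{ac}] = 0.99(0.01) = 0.0099. In general, Y = y requires y − 1 wafers without a particle
followed by one wafer with a particle, so $P[Y = y] = (0.99)^{y-1}(0.01)$ for y = 1, 2, 3, . . .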
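The two cases follow one pattern: y − 1 clean wafers, then one contaminated wafer. A minimal Python sketch of that pattern is given here; the function name wafer_pmf and the default p = 0.01 are taken from the example's setup and are otherwise illustrative.

```python
def wafer_pmf(y, p=0.01):
    """P[Y = y]: the first contaminated wafer is the y-th one analyzed.

    Assumes independent wafers, each contaminated with probability p.
    """
    if y < 1:
        raise ValueError("y must be a positive integer")
    return (1 - p) ** (y - 1) * p


print(wafer_pmf(1), wafer_pmf(2))

# The probabilities over y = 1, 2, 3, ... should add up to 1; a long partial sum gets close.
partial_sum = sum(wafer_pmf(y) for y in range(1, 5001))
```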
CUMULATIVE DISTRIBUTION FUNCTIONS
The cumulative distribution function of a discrete random variable X, denoted as
F(x), is $F(x) = P[X \le x] = \sum_{x_i \le x} P[X = x_i]$. The cumulative distribution function F(x) of a
discrete random variable satisfies the following properties: (1) $F(x) = \sum_{x_i \le x} f(x_i)$
(2) 0 ≤ F(x) ≤ 1 (3) If a ≤ b then F(a) ≤ F(b).
Example: Determine the probability mass function of X from the cumulative distribution function:
$F(x) = \begin{cases} 0, & x < -2 \\ 0.2, & -2 \le x < 0 \\ 0.7, & 0 \le x < 2 \\ 1, & 2 \le x \end{cases}$
Solution: The domain of the probability mass function consists of the included endpoints of each
interval, xi = −2, 0, 2. The value of f(x) at each xi is determined by f(xi) = F(xi) − F(xi−1) for
i = 2, 3, and f(x1) is taken to be equal to F(x1).
f(x1) = f(−2) = F(x1) = F(−2) = 0.2
f(x2) = f(0) = F(x2) − F(x1) = F(0) − F(−2) = 0.7 − 0.2 = 0.5
f(x3) = f(2) = F(x3) − F(x2) = F(2) − F(0) = 1 − 0.7 = 0.3
Therefore, $f(x) = \begin{cases} 0.2, & x = -2 \\ 0.5, & x = 0 \\ 0.3, & x = 2 \end{cases}$
f(xi) is the difference between values of F(x) at consecutive subintervals,
and the xi's are the left endpoints of each subinterval.
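The differencing step can be written as a short loop over the jump points. This Python sketch represents the cumulative distribution function as a list of (left endpoint, value) pairs, a representation chosen here for illustration.

```python
# (left endpoint of each interval, value of F on that interval), in increasing order.
cdf_steps = [(-2, 0.2), (0, 0.7), (2, 1.0)]


def pmf_from_cdf(steps):
    """Recover f(x_i) as the jump of F at each x_i; F is 0 to the left of the first point."""
    pmf = {}
    previous = 0.0
    for x, value in steps:
        # Rounding hides floating-point noise such as 0.30000000000000004.
        pmf[x] = round(value - previous, 10)
        previous = value
    return pmf


pmf = pmf_from_cdf(cdf_steps)
print(pmf)
```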
EXPECTED VALUES OF RANDOM VARIABLES
The mean or expected value of the discrete random variable X with probability mass
function f(x), denoted as µX or E[X], is $\mu_X = E[X] = \sum_{\text{all } x} x f(x)$.
Example: A salesperson for a medical device company has two appointments on a given day. At the
first appointment, he believes that he has a 70% chance to make the deal, from which he
can earn $1000 commission if successful. On the other hand, he thinks he only has a 40%
chance to make the deal at the second appointment, from which, if successful, he can make
$1500. What is his expected commission based on his own probability belief? Assume that
the appointment results are independent of each other.
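Solution: Let Y denote the total commission of the salesperson from the two appointments. Because the
appointment results are independent, the probability of each combined result is the product of the
individual probabilities. The table below summarizes his total commission, with the associated
probabilities in parentheses.
                            Second: deal (0.4)    Second: no deal (0.6)
First: deal (0.7)           2500 (0.28)           1000 (0.42)
First: no deal (0.3)        1500 (0.12)           0 (0.18)
His expected commission is µ = E[Y] = 2500(0.28) + 1000(0.42) +
1500(0.12) + 0(0.18) = 1300 dollars.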
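The four probabilities come from multiplying the success or failure probabilities of the two independent appointments. A minimal Python sketch of that calculation follows; the list layout and variable names are illustrative.

```python
from itertools import product

# (commission in dollars, probability of making the deal) for each appointment.
appointments = [(1000, 0.7), (1500, 0.4)]

expected_commission = 0.0
for results in product([True, False], repeat=len(appointments)):
    probability = 1.0
    total = 0
    for success, (commission, p) in zip(results, appointments):
        # Independence: multiply the per-appointment probabilities.
        probability *= p if success else 1 - p
        total += commission if success else 0
    expected_commission += total * probability

print(round(expected_commission, 2))
```

The same value results from adding the two expected commissions separately, 1000(0.7) + 1500(0.4), because expectation is additive.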
THE BINOMIAL DISTRIBUTION
Many processes can be thought of as consisting of a sequence of Bernoulli trials, such as, for
example, the repeated tossing of a coin or the repeated examination of objects to determine
whether or not they are defective. In such cases, a random variable of interest is the number
of successes obtained within a fixed number of trials n, where a success is defined in an
appropriate manner. Such a random variable is called a binomial random variable. If the binomial
random variable X is the number of trials that result in a success in a Bernoulli process having n
trials, each with probability of success p, the probability mass function of X is
$f(x) = \binom{n}{x} p^x (1 - p)^{n - x}$, x = 0, 1, 2, . . . , n.
The binomial random variable is probably the most important of all discrete random
variables. Its probability distribution is called a binomial distribution. The mean µ and
variance σ² of the binomial random variable X with parameters n and p, the number of trials and
the probability of a success, respectively, are µ = E[X] = np and σ² = V[X] = np(1 − p).
Example: Each sample of water has a 10% chance of containing a particular organic pollutant.
Assume that the samples are independent with regard to the presence of the pollutant. a) Find the
probability that in the next 18 samples, exactly 2 contain the pollutant. b) Find the probability that 3
to 5 of the 20 samples contain the pollutant. c) Find the mean and standard deviation of the number
of samples that contain the pollutant among 16 samples.
Solutions:
a) Let X be the number of samples that contain the pollutant in the next 18 samples
analyzed. X is a binomial random variable with p = 0.1 and n = 18. Therefore,
$P[X = 2] = \binom{18}{2}(0.1)^2(0.9)^{16} = 0.2835$.
b) Here X is binomial with p = 0.1 and n = 20. The required probability is P[3 ≤ X ≤ 5].
$P[3 \le X \le 5] = P[X = 3] + P[X = 4] + P[X = 5]$
$= \binom{20}{3}(0.1)^3(0.9)^{17} + \binom{20}{4}(0.1)^4(0.9)^{16} + \binom{20}{5}(0.1)^5(0.9)^{15} = 0.3118$
c) Here n = 16, so µ = np = 16(0.1) = 1.6 and
$\sigma = \sqrt{\sigma^2} = \sqrt{np(1 - p)} = \sqrt{16(0.1)(0.9)} = 1.2$
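The three parts can be recomputed with a direct implementation of the binomial probability mass function. This is a minimal Python sketch using only the standard library; the function name binomial_pmf is illustrative.

```python
from math import comb, sqrt


def binomial_pmf(x, n, p):
    """P[X = x] for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


p = 0.1
part_a = binomial_pmf(2, 18, p)                            # exactly 2 of 18
part_b = sum(binomial_pmf(x, 20, p) for x in range(3, 6))  # 3 to 5 of 20
mean_c = 16 * p                                            # np
sd_c = sqrt(16 * p * (1 - p))                              # sqrt(np(1 - p))

print(round(part_a, 4), round(part_b, 4), round(mean_c, 1), round(sd_c, 1))
```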
THE POISSON DISTRIBUTION
The number X of outcomes occurring during a Poisson experiment is called a Poisson random
variable, and its probability distribution is called the Poisson distribution.
The probability mass function of the Poisson random variable X, representing the number of
outcomes occurring in a given time interval or specified region denoted by t, is
$f(x) = \frac{e^{-\lambda t}(\lambda t)^x}{x!}$, x = 0, 1, 2, . . . where λ is the average number of
outcomes per unit time, distance, area, or volume.
Example: Ten is the average number of oil tankers arriving each day at a certain port. The facilities
at the port can handle at most 15 tankers per day. a) What is the probability of finding 8 oil tankers
on a given day? b) What is the probability that on a given day tankers have to be turned away?
Solution:
a) We are given λ = 10 (oil tankers per day), so we take t = 1 (day).
$f(8) = \frac{e^{-10}\, 10^{8}}{8!} = 0.1126$
b) Tankers will be turned away if the number of tankers exceeds the port's capacity
of 15. Thus, the probability we seek is P[X > 15].
$P[X > 15] = 1 - P[X \le 15] = 1 - \sum_{x=0}^{15} \frac{e^{-10}\, 10^{x}}{x!} = 1 - 0.9513 = 0.0487$
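Both tanker probabilities can be recomputed from the Poisson probability mass function. A minimal Python sketch, with illustrative names and t defaulting to one day as in the solution:

```python
from math import exp, factorial


def poisson_pmf(x, rate, t=1.0):
    """P[X = x] when outcomes occur at an average of `rate` per unit over an interval t."""
    mean = rate * t
    return exp(-mean) * mean**x / factorial(x)


p_eight = poisson_pmf(8, rate=10)

# Tankers are turned away when more than 15 arrive: complement of 0..15 arrivals.
p_turned_away = 1 - sum(poisson_pmf(x, rate=10) for x in range(16))

print(round(p_eight, 4), round(p_turned_away, 4))
```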
CONTINUOUS RANDOM VARIABLES AND THEIR PROBABILITY DISTRIBUTION
Continuous Probability Distributions - Physical quantities such as time, length, area, temperature,
pressure, load, intensity, etc., when they need to be described probabilistically, are modeled by
continuous random variables.
A continuous random variable is a random variable whose range is an interval of real numbers. When a
sample space has as many sample points as there are points on a line segment, the associated random
variable is continuous, with its values distributed over one or more intervals on the real number line.
The function f(x) is a probability density function of the continuous random variable
X defined over the set of real numbers if a) $f(x) \ge 0$; b) $\int_{-\infty}^{\infty} f(x)\,dx = 1$;
c) $P[a \le X \le b] = \int_{a}^{b} f(x)\,dx$
Example: Consider the function
$f(x) = \begin{cases} \frac{1}{3}x^2, & -1 < x < 2 \\ 0, & \text{elsewhere} \end{cases}$
a) Show that it is a probability density function of some continuous random variable X.
b) Determine P[0 < X ≤ 1].
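Solution:
a) We show that Properties (a) and (b) are satisfied. 1) Clearly, $\frac{1}{3}x^2 \ge 0$ for all real
numbers x. 2) We must show that $\int_{-\infty}^{\infty} f(x)\,dx = 1$:
$\int_{-\infty}^{\infty} f(x)\,dx = \int_{-1}^{2} \frac{1}{3}x^2\,dx = \left[\frac{x^3}{9}\right]_{-1}^{2} = \frac{8}{9} + \frac{1}{9} = 1$
b) $P[0 < X \le 1] = \int_{0}^{1} \frac{1}{3}x^2\,dx = \frac{1}{9}$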
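The two integrals can be checked numerically. The sketch below uses a plain midpoint rule so that it needs no external library; the helper name integrate and the step count are arbitrary choices made for this illustration.

```python
def pdf(x):
    """f(x) = x^2 / 3 on (-1, 2), and 0 elsewhere."""
    return x * x / 3 if -1 < x < 2 else 0.0


def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


total_area = integrate(pdf, -1, 2)  # should be close to 1
p_0_to_1 = integrate(pdf, 0, 1)     # should be close to 1/9

print(round(total_area, 6), round(p_0_to_1, 6))
```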
If X is a continuous random variable with probability density function f(x), the cumulative
distribution function F(x) is defined as $F(x) = P[X \le x] = \int_{-\infty}^{x} f(t)\,dt$.
The cumulative distribution function has the following properties: 1) $\lim_{x \to -\infty} F(x) = 0$
2) $\lim_{x \to +\infty} F(x) = 1$ 3) $P[a < X \le b] = F(b) - F(a)$
Example: Suppose that for some continuous random variable X, $F(x) = \frac{1}{80}(x^4 - 1)$ for 1 ≤ x ≤ 3.
a) What is the probability that X assumes a value between 1.2 and 2.6? b) Find the density function
and use it to compute P[1.2 < X < 2.6].
Solution:
a) We apply Property 3 to compute P[1.2 < X < 2.6].
$P[1.2 < X < 2.6] = F(2.6) - F(1.2) = \frac{1}{80}(2.6^4 - 1) - \frac{1}{80}(1.2^4 - 1) = 0.5453$
b) $f(x) = F'(x) = \frac{1}{80}(4x^3) = \frac{1}{20}x^3$ for 1 ≤ x ≤ 3, so
$P[1.2 < X < 2.6] = \int_{1.2}^{2.6} \frac{1}{20}x^3\,dx = 0.5453$
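Parts a) and b) must agree, since the density is the derivative of the cumulative distribution function. The Python sketch below checks this numerically. It assumes F(x) = 0 below 1 and F(x) = 1 above 3, which the example does not state but which matches F(1) = 0 and F(3) = 1.

```python
def cdf(x):
    """F(x) = (x^4 - 1) / 80 on [1, 3]; extended by 0 and 1 outside (assumption)."""
    if x < 1:
        return 0.0
    if x > 3:
        return 1.0
    return (x**4 - 1) / 80


def pdf(x):
    """f(x) = F'(x) = x^3 / 20 on [1, 3], and 0 elsewhere."""
    return x**3 / 20 if 1 <= x <= 3 else 0.0


def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


via_cdf = cdf(2.6) - cdf(1.2)
via_pdf = integrate(pdf, 1.2, 2.6)
print(round(via_cdf, 4), round(via_pdf, 4))
```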
EXPECTED VALUES OF CONTINUOUS RANDOM VARIABLES
Let X be a continuous random variable with density f(x). The mean or expected value of X, denoted
µX or E[X], and variance of X, denoted σ²X or V[X], are defined as
$\mu_X = E[X] = \int_{-\infty}^{\infty} x f(x)\,dx$ and $\sigma_X^2 = V[X] = \int_{-\infty}^{\infty} (x - \mu_X)^2 f(x)\,dx$
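Example: Consider the function
$f(x) = \begin{cases} \frac{1}{3}x^2, & -1 < x < 2 \\ 0, & \text{elsewhere} \end{cases}$
Compute the mean and variance of the random variable.
Solution: $\mu = \int_{-1}^{2} x \cdot \frac{1}{3}x^2\,dx = \frac{1}{3}\int_{-1}^{2} x^3\,dx = \frac{5}{4}$,
$\sigma^2 = \int_{-1}^{2} x^2 \cdot \frac{1}{3}x^2\,dx - \mu^2 = \frac{51}{80}$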
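Because the density is a polynomial on (−1, 2), the mean and variance can be computed exactly with rational arithmetic. A minimal Python sketch; the helper integral_of_power is introduced here for illustration.

```python
from fractions import Fraction


def integral_of_power(k, a, b):
    """Exact integral of x**k over [a, b] for a non-negative integer k."""
    return (Fraction(b) ** (k + 1) - Fraction(a) ** (k + 1)) / (k + 1)


# f(x) = x^2 / 3 on (-1, 2), so x * f(x) = x^3 / 3 and x^2 * f(x) = x^4 / 3.
mean = Fraction(1, 3) * integral_of_power(3, -1, 2)
second_moment = Fraction(1, 3) * integral_of_power(4, -1, 2)
variance = second_moment - mean**2

print(mean, variance)
```

The variance is computed as E[X²] − µ², the same shortcut the solution uses.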
NORMAL DISTRIBUTION
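The most widely used model for a continuous measurement is the normal random variable, and its
distribution, the normal distribution, is the most important continuous probability distribution.
A normal random variable X with mean µ and standard deviation σ has the density
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$, −∞ < x < ∞.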
Example: The time X until recharge for a battery in a laptop computer under common conditions is
normally distributed with µ = 260 minutes and σ = 50 minutes. Find the probability that a fully
charged laptop lasts a) anywhere from 3 to 4 hours; b) longer than 3 hours; c) less than 270
minutes; d) longer than 300 minutes.
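Solution:
a) $P[180 < X < 240] = \int_{180}^{240} \frac{1}{50\sqrt{2\pi}}\, e^{-\frac{(x-260)^2}{2 \cdot 50^2}}\,dx = 0.2898$
b) We use the symmetric property of the curve, P[−∞ < X < µ] = P[µ < X < ∞] = 0.5.
$P[X > 180] = \int_{180}^{\infty} \frac{1}{50\sqrt{2\pi}}\, e^{-\frac{(x-260)^2}{2 \cdot 50^2}}\,dx = \int_{180}^{260} \frac{1}{50\sqrt{2\pi}}\, e^{-\frac{(x-260)^2}{2 \cdot 50^2}}\,dx + \int_{260}^{\infty} \frac{1}{50\sqrt{2\pi}}\, e^{-\frac{(x-260)^2}{2 \cdot 50^2}}\,dx = 0.9452$
c) $P[X < 270] = \int_{-\infty}^{270} \frac{1}{50\sqrt{2\pi}}\, e^{-\frac{(x-260)^2}{2 \cdot 50^2}}\,dx = \int_{-\infty}^{260} \frac{1}{50\sqrt{2\pi}}\, e^{-\frac{(x-260)^2}{2 \cdot 50^2}}\,dx + \int_{260}^{270} \frac{1}{50\sqrt{2\pi}}\, e^{-\frac{(x-260)^2}{2 \cdot 50^2}}\,dx = 0.5793$
d) $P[X > 300] = \int_{300}^{\infty} \frac{1}{50\sqrt{2\pi}}\, e^{-\frac{(x-260)^2}{2 \cdot 50^2}}\,dx = \int_{260}^{\infty} \frac{1}{50\sqrt{2\pi}}\, e^{-\frac{(x-260)^2}{2 \cdot 50^2}}\,dx - \int_{260}^{300} \frac{1}{50\sqrt{2\pi}}\, e^{-\frac{(x-260)^2}{2 \cdot 50^2}}\,dx = 0.2119$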
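These normal probabilities have no closed-form antiderivative, so in practice they are evaluated numerically. The Python sketch below uses the error function from the standard library, via the identity Φ(z) = (1 + erf(z/√2))/2; this is a swapped-in technique for checking the four answers, not the method used in the solution.

```python
from math import erf, sqrt

MU, SIGMA = 260, 50  # minutes, from the battery example


def normal_cdf(x, mu=MU, sigma=SIGMA):
    """P[X <= x] for a normal random variable, computed with the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))


part_a = normal_cdf(240) - normal_cdf(180)  # between 3 and 4 hours
part_b = 1 - normal_cdf(180)                # longer than 3 hours
part_c = normal_cdf(270)                    # less than 270 minutes
part_d = 1 - normal_cdf(300)                # longer than 300 minutes

print(round(part_a, 4), round(part_b, 4), round(part_c, 4), round(part_d, 4))
```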
EXPONENTIAL DISTRIBUTION
The random variable X, the distance between successive events from a Poisson process with mean
number of events λ > 0 per unit distance, is an exponential random variable with parameter λ. The
probability density function of X is $f(x) = \lambda e^{-\lambda x}$ for x ≥ 0.
The mean and variance of X are µ = 1/λ and σ² = 1/λ².
Example: The lifetime of a mechanical assembly in a vibration test is exponentially distributed with
a mean of 400 hours. a) What is the probability that an assembly on test fails in less than 100
hours? b) What is the probability that an assembly operates for more than 500 hours before
failure?
Solution:
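a) Let X be the time before a mechanical assembly in a vibration test fails, measured
in hours. The mean lifetime is µ = 400 hours, so λ = 1/µ = 1/400 per hour. With the cumulative
distribution function $F(x) = 1 - e^{-\lambda x}$, $P[X < 100] = F(100) = 1 - e^{-\frac{1}{400}(100)} = 0.2212$
b) $P[X > 500] = 1 - F(500) = e^{-\frac{1}{400}(500)} = 0.2865$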
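Both lifetime probabilities follow from the exponential cumulative distribution function F(x) = 1 − e^(−λx) with λ = 1/400 per hour. A minimal Python sketch, with illustrative names:

```python
from math import exp


def exponential_cdf(x, rate):
    """P[X <= x] for an exponential random variable with parameter `rate` (lambda)."""
    return 1 - exp(-rate * x) if x >= 0 else 0.0


rate = 1 / 400  # lambda = 1 / mean lifetime in hours

p_fail_before_100 = exponential_cdf(100, rate)
p_survive_500 = 1 - exponential_cdf(500, rate)

print(round(p_fail_before_100, 4), round(p_survive_500, 4))
```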
JOINT PROBABILITY DISTRIBUTION