Continuous Random
Variables
-3 -2 -
 + +2 +3
Lecture 4
© 2010, All Rights Reserved, Robi Polikar.
No part of this presentation may be used without explicit
written permission. Such permission will be given – upon
request – for noncommercial educational purposes only.
Limited permission is hereby granted, however, to post or
distribute this presentation if you agree to all of the
following:
1. you do so for noncommercial educational purposes;
2. the entire presentation is kept together as a whole,
including this entire notice.
3. you include the following link/reference on your site:
Robi Polikar, http://engineering.rowan.edu/~polikar.
ECE 09.360
Dr. P.’s
Clinic Consultant Module in
Probability & Statistics
in Engineering
Today in P&S
-3 -2 -
 + +2 +3
 Review of Discrete Random Variables
 Binomial distribution
 Hypergeometric and negative binomial distributions
 Poisson distribution
 Continuous Random Variables and Their Probability Distributions
 Probability density (distribution) function
 Cumulative distribution function
 Percentiles, Expected Values & Variances of Cont. Random Variables
 The Normal (Gaussian) Distribution
 Standard & non-standard normal distribution
 The normal approximation to the binomial distribution
 Other continuous distributions
 Gamma, Beta, Exponential, Chi-Squared and Weibull Distributions
© 2010 All Rights Reserved, Robi Polikar, Rowan University
Random Variables
-3 -2 -
 + +2 +3
 A random variable X is a function that maps every possible outcome in the sample space S of a random
experiment to a real number.
[Figure: bar charts of the pmf p(x) and cdf F(x) for a fish-catch example (1: Salmon, 2: Trout, 3: Seabass, 4: Sword, 5: Other)]
 A probability mass (distribution) function is a function that tells us the probability of each possible value of X, that is, of X assuming a specific value: p(x) = P(X = x).
 A cumulative distribution function tells us the probability of X assuming a value x or less: F(x) = P(X \le x)
 The expected value of a random variable is its mean, computed as

E(X) = \mu_X = \sum_{x \in D} x \cdot p(x)

 The variance of a r.v. is the average squared distance of the values from the mean:

\sigma^2 = E\big[(X - \mu)^2\big] = \sum_{x \in D} (x - \mu)^2 \cdot p(x)
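The two formulas above are easy to check numerically. A minimal Python sketch (the pmf values here are a made-up example, not the fish data from the figure):

```python
# Expected value and variance of a discrete r.v. directly from its pmf.
# The pmf below is hypothetical, chosen only for illustration.
x = [1, 2, 3]
p = [0.5, 0.3, 0.2]   # probabilities must sum to 1

# E(X) = sum of x * p(x)
mean = sum(xi * pi for xi, pi in zip(x, p))

# Var(X) = sum of (x - mu)^2 * p(x)
var = sum((xi - mean) ** 2 * pi for xi, pi in zip(x, p))

print(mean)   # E(X) = 1.7
print(var)    # sigma^2 = 0.61
```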
Bernoulli Experiments &
Associated Distributions
-3 -2 -
 + +2 +3
 Each individual trial in a series of n binary-outcome (S/F) experiments, where the probability of success p is constant and trials are independent, is a Bernoulli experiment (B.E.).
 The r.v. defined as the number of successes among n trials of a Bernoulli experiment is a binomial r.v. The probability of observing x successes in n trials is given by the binomial dist.

b(x; n, p) = \binom{n}{x} p^x (1 - p)^{n - x}

E(X) = \mu = n \cdot p
\sigma^2 = np(1 - p) = npq

 If the samples are selected without replacement, we obtain the hypergeometric distribution: the probability that X = the number of S's in a random sample of size n drawn from a population consisting of M S's and (N − M) F's is given by the HG dist.

h(x; n, M, N) = \binom{M}{x} \binom{N - M}{n - x} \Big/ \binom{N}{n}

 If a B.E. continues until a total of r successes are observed, the r.v. defined as the number of failures that precede the r-th success has the negative binomial dist.

nb(x; r, p) = \binom{x + r - 1}{r - 1} p^r (1 - p)^x

 If r = 1, the above r.v. counts the failures before the first success (one less than the number of trials required to observe it), which has a geometric distribution:

g(x; p) = p (1 - p)^x

 If a r.v. describes the number of times a rare event happens over a period of time / area with an average rate of λ, the r.v. has a Poisson distribution, which is the limiting case of a binomial distribution where n → ∞ and p → 0 while np → λ:

p(x; \lambda) = \frac{e^{-\lambda} \lambda^x}{x!}
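All of these pmfs are available in SciPy; a short sketch evaluating each (the parameter values are arbitrary examples). Note that SciPy's conventions differ slightly from the slide's: `nbinom` counts failures before the r-th success, matching the slide, while `geom` counts trials up to and including the first success:

```python
from scipy.stats import binom, hypergeom, nbinom, geom, poisson

# b(x; n, p): P(3 successes in 10 trials with p = 0.5) = C(10,3)/2^10
print(binom.pmf(3, 10, 0.5))        # 120/1024 ≈ 0.1172

# h(x; n, M, N): SciPy's argument order is (x, population, successes, draws),
# here 2 S's in a sample of 5 drawn from 7 S's among 20 items
print(hypergeom.pmf(2, 20, 7, 5))

# nb(x; r, p): 4 failures before the 3rd success
print(nbinom.pmf(4, 3, 0.5))        # C(6,2) * 0.5^7 = 15/128

# SciPy's geom counts trials (support 1, 2, ...), so the slide's
# "x failures before the first success" is geom.pmf(x + 1, p)
print(geom.pmf(3, 0.5))             # 0.5 * (1 - 0.5)^2 = 0.125

# Poisson with rate lambda = 2
print(poisson.pmf(0, 2))            # e^-2 ≈ 0.1353
```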
Continuous
Random Variables
-3 -2 -
 + +2 +3
 A r.v. X is continuous if its set of possible outcomes includes an entire
interval, say between a and b. Therefore, a continuous r.v. may assume an
infinite number of possible values
 The actual wattage of a 60W bulb, the actual speed of a 3GHz chip
 The weights / heights of students in a population
 The exact point on which a spinning pointer lands (but not the number on which
the roulette ball lands; why not?)
 Note that in each case, the r.v. may assume any one of an infinite number of values!
PDF for
Cont. Distributions
-3 -2 -
 + +2 +3
 Let X = the actual speed of a 3 GHz chip. For simplicity, let us assume that the actual
speed of a 3 GHz chip stays within the interval [2.5, 3.5] GHz.
 Then X may assume any number between 2,500,000,000 Hz and 3,500,000,000 Hz.
 Now assume we have a device that can measure the actual speed to the nearest 1 MHz. We
measure the speeds of a certain number of chips and plot the histogram (a).
 We then get a device that can measure the actual speed to the nearest 1 kHz. We again
measure the speeds and plot the histogram (b).
 Continuing like this, in the limiting case, if we had a device that could measure the speed
precisely (is it possible..?) and plotted the histogram, we would obtain the smooth histogram in
(c). Such a histogram, in the limit, defines a continuous probability distribution.
 Now, what is the probability that a chip's actual speed is 3 GHz?
How about P(speed = 3.023987423452974562347652937456293475629347524752693475 GHz) = ?
[Figure: histograms of the measured speeds over [2.5, 3.5] GHz at 1 MHz resolution (a), at 1 kHz resolution (b), and the smooth limiting curve (c)]
PDF & CDF
-3 -2 -
 + +2 +3
 Unlike the pmf in the discrete case, in the continuous domain the points on the pdf do not
represent probabilities, as the probability of any single value on the continuous
axis is zero; rather, the pdf represents densities.
 P(speed = 3.2398742345129745623476529374562934756293475 GHz) = 0
 or P(weight = 175.246541635465468748764546879876523543) = 0, or P(weight = 175) = 0
 Therefore, we can only compute the probability that the r.v. lies within an interval, say between a
and b. The actual probability that a r.v. lies between two values is given by the area
under the pdf curve between those two points:
P(a \le x \le b) = \int_{x=a}^{b} f(x)\,dx

[Figure: two example pdfs f(x), one for the actual speed of a 3GHz chip (GHz) over [2.5, 3.5], one for the weights of NJ residents (lbs)]
PDF for Continuous
Distributions
-3 -2 -
 + +2 +3
 Let X be a continuous r.v. Then a probability distribution or probability density
function (pdf) of X is a function f(x) such that for any two numbers a and b,

P(a \le X \le b) = \int_a^b f(x)\,dx

The graph of f(x) is then called the density curve.
 For f(x) to be a pdf
1. f(x) ≥ 0 for all values of x.
2. The area of the region between the graph of f and the x-axis is equal to 1:

P(-\infty < X < \infty) = \int_{-\infty}^{\infty} f(x)\,dx = 1

[Figure: density curve y = f(x) with total area = 1]
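Both pdf conditions can be verified numerically. A sketch using SciPy's `quad` on the standard normal pdf (any valid pdf would do here):

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

# Condition 1: f(x) >= 0 everywhere (spot-checked on a grid)
xs = np.linspace(-10, 10, 1001)
assert (norm.pdf(xs) >= 0).all()

# Condition 2: the total area under f equals 1
area, _ = quad(norm.pdf, -np.inf, np.inf)
print(area)   # ≈ 1.0
```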
The actual probability is the area under the PDF…
I repeat…The Actual Probability is …
-3 -2 -
 + +2 +3
P(a \le X \le b) is the area of the shaded region under the density curve y = f(x) between a and b.

[Figure: density curve with the region between a and b shaded]
Uniform Distribution
-3 -2 -
 + +2 +3
 If all outcomes of a random experiment are equally likely, the random
variable is then said to have a uniform distribution. More formally,
 A continuous r.v. X is said to have a uniform distribution on the interval [a, b] if
the pdf of X is
 1
a xb

f x; a, b    b  a
 0
otherwise
f(x)
1
ba
a
b
x
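A sketch of the uniform pdf using the chip-speed interval [2.5, 3.5] from the earlier slides: the probability of any sub-interval is simply its length divided by (b − a).

```python
from scipy.stats import uniform

a, b = 2.5, 3.5                    # uniform on [a, b]
X = uniform(loc=a, scale=b - a)    # SciPy parameterizes by loc and scale

# P(2.9 <= X <= 3.1) = (3.1 - 2.9) / (b - a) = 0.2
print(X.cdf(3.1) - X.cdf(2.9))     # 0.2

# Any single point has probability zero; the density there is 1/(b - a) = 1
print(X.pdf(3.0))                  # 1.0
```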
Cumulative Distribution
Function
-3 -2 -
 + +2 +3
 Similar to the discrete case, the cumulative distribution function F(x) for a
continuous r.v. X is defined for every number x as P(X ≤ x):

F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt

 Note that for each x, F(x) is the area under the density curve to the left of x. For example,

F(200) = P(X \le 200) = \int_{0}^{200} f(x)\,dx

[Figure: pdf f(x) with the area to the left of 200 shaded, and the corresponding cdf F(x)]
 Conversely, the pdf can be obtained from the cdf as

f(x) = F'(x) = \frac{dF(x)}{dx}
Cumulative Distribution
Function
-3 -2 -
 + +2 +3
P(100 \le X \le 200) = \int_{100}^{200} f(x)\,dx = F(200) - F(100)

[Figure: pdf f(x) with the area between 100 and 200 shaded, and the corresponding cdf F(x)]
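Both identities (F as the integral of f, and P(100 ≤ X ≤ 200) = F(200) − F(100)) can be checked numerically. This sketch assumes, purely for illustration, that weights follow the N(170, 20) distribution used later in the lecture:

```python
from scipy.integrate import quad
from scipy.stats import norm

X = norm(loc=170, scale=20)   # hypothetical weight distribution N(170, 20)

# P(100 <= X <= 200) as an integral of the pdf ...
p_int, _ = quad(X.pdf, 100, 200)
# ... equals the difference of cdf values
p_cdf = X.cdf(200) - X.cdf(100)
print(p_int, p_cdf)           # both ≈ 0.933

# f(x) = F'(x): a numerical derivative of the cdf matches the pdf
h = 1e-6
deriv = (X.cdf(175 + h) - X.cdf(175 - h)) / (2 * h)
print(deriv, X.pdf(175))      # approximately equal
```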
Percentiles in
Cont. Distributions
-3 -2 -
 + +2 +3
 Percentiles indicate relative standing in ordered data: for example, if we are talking about
SAT scores and you are in the 90th percentile, then 90% of all test takers did more poorly
than you did, and 10% did better. So the (100p)th percentile, 0 ≤ p ≤ 1, is the value that exceeds
100p% of all scores and is exceeded by 100(1 − p)% of all scores. More formally:
 Let p be a number between 0 and 1. The (100p)th percentile of the distribution of a
continuous r.v. X, denoted by η(p), is defined by

p = F(\eta(p)) = \int_{-\infty}^{\eta(p)} f(y)\,dy

 Thus on a pdf curve, η(p) is the value on the horizontal axis such that 100p% of the area
under f(x) lies to the left of η(p) and 100(1 − p)% lies to the right!
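In SciPy the percentile function η(p) is the `ppf` (percent-point function), the inverse of the cdf. A sketch on the standard normal:

```python
from scipy.stats import norm

# 90th percentile of the standard normal: eta(0.9)
eta = norm.ppf(0.9)
print(eta)              # ≈ 1.2816

# By definition, F(eta(p)) = p
print(norm.cdf(eta))    # 0.9
```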
Parameters of Cont. RVs:
Mean & Median
-3 -2 -
 + +2 +3
 The median of a continuous distribution, denoted by \tilde{\mu}, is the 50th
percentile. So \tilde{\mu} satisfies 0.5 = F(\tilde{\mu}); that is, half the area under the
density curve is to the left of \tilde{\mu}.
 The mean, or expected value, of a cont. r.v. is defined similarly to its discrete
counterpart, with the summation replaced by integration:

\mu_X = E(X) = \int_{-\infty}^{\infty} x \cdot f(x)\,dx

 Often we wish to compute the expected value of some function h(X) of the
r.v. X. Simply think of h(X) as a new r.v. with the pdf f(x), and hence

\mu_{h(X)} = E[h(X)] = \int_{-\infty}^{\infty} h(x) \cdot f(x)\,dx
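Both definitions can be evaluated by direct numerical integration; here a sketch computing E(X) and E[h(X)] with h(x) = x² for the standard normal, for which E(X) = 0 and E(X²) = 1:

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

# E(X) = integral of x * f(x)
mean, _ = quad(lambda x: x * norm.pdf(x), -np.inf, np.inf)
print(mean)   # ≈ 0

# E[h(X)] with h(x) = x^2: integral of x^2 * f(x)
ex2, _ = quad(lambda x: x**2 * norm.pdf(x), -np.inf, np.inf)
print(ex2)    # ≈ 1
```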
Variance of a
Continuous rv.
-3 -2 -
 + +2 +3
 The variance and standard deviation of a continuous r.v. are also defined
similarly to their discrete counterparts, with summations replaced by integrals:

\sigma_X^2 = E\big[(X - \mu)^2\big] = \int_{-\infty}^{\infty} (x - \mu)^2 \cdot f(x)\,dx = E(X^2) - [E(X)]^2

\sigma_X = \sqrt{\sigma_X^2}
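The shortcut σ² = E(X²) − [E(X)]² is easy to confirm numerically; a sketch for the uniform distribution on [2.5, 3.5], whose variance is (b − a)²/12 = 1/12:

```python
from scipy.integrate import quad
from scipy.stats import uniform

X = uniform(loc=2.5, scale=1.0)   # uniform on [2.5, 3.5]

# E(X) and E(X^2) by integrating against the pdf
mean, _ = quad(lambda x: x * X.pdf(x), 2.5, 3.5)
ex2, _ = quad(lambda x: x**2 * X.pdf(x), 2.5, 3.5)

var = ex2 - mean**2
print(var)        # ≈ 1/12 ≈ 0.0833
print(X.var())    # the same value from SciPy directly
```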
The Normal
(Gaussian) Distribution
-3 -2 -
 + +2 +3
 By far the most important distribution in all of probability and statistics, because it is
 the most commonly occurring in nature
• It provides a good explanation for many, but not all, continuously valued phenomena
• Physical measurements of length, weight, width, etc., measurement errors, exam scores,
quality control results, outcomes of medical diagnostic tests, many financial indicators…
• Even if the individual variables of an experiment are not normal, their sum is (CLT)
• Even if individual factors affecting an experiment outcome are not normal, the
combination that determines the actual experiment outcome is normal!!!
• Entirely determined by just two parameters. Knowing them means knowing everything!
• Well studied and well understood
• Has a nice bell shape to it…!
The Normal Distribution
-3 -2 -
 + +2 +3
 A continuous r.v. X is said to have a normal distribution with parameters μ and σ (or σ²),
where −∞ < μ < ∞ and σ > 0, if the pdf of X is

f(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}} = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}

[Figure: the bell curve; about 68.2% of the area lies within μ ± σ, 95.4% within μ ± 2σ, and 99.7% within μ ± 3σ]
Computing
Normal Distribution
-3 -2 -
 + +2 +3
 As in other distributions, to compute the probability of a random variable assuming
a particular range of values, we need to integrate the area under the normal curve.
 For example, if the weights of students in this class are normally distributed (probably true)
with a mean of, say, 170 lbs and a std. dev. of 20 lbs, then the probability of a randomly
selected student having a weight between 185 and 200 lbs would be:

P(185 \le x \le 200) = \int_{x=185}^{200} f(x)\,dx = \int_{x=185}^{200} \frac{1}{20\sqrt{2\pi}}\, e^{-\frac{(x - 170)^2}{2 \cdot 20^2}}\,dx \approx 0.1598 \approx 16\%

[Figure: N(170, 20) density with the area between 185 and 200 shaded]
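The integral above has no closed form, but SciPy evaluates the normal cdf numerically; a sketch reproducing the 0.1598 result:

```python
from scipy.stats import norm

# Weight model from the slide: N(mu = 170, sigma = 20)
X = norm(loc=170, scale=20)

p = X.cdf(200) - X.cdf(185)
print(p)   # ≈ 0.1598, i.e. about 16%
```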
Computing Normal
Distributions
-3 -2 -
 + +2 +3
 The problem, however, is that the integral of the function exp(−x²) cannot be computed
analytically! Instead, the integral is computed numerically for a range of values and
the results are tabulated.
 However, tabulating integrals for every possible value of μ and σ is impossible…!
Therefore we define the standard normal distribution:
 The normal distribution with parameter values μ = 0 and σ = 1 is called the
standard normal distribution. The random variable for this distribution is
typically denoted by Z. The pdf is therefore
f(z; 0, 1) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}

 The associated cdf for Z, typically denoted by Φ(z), is

\Phi(z) = P(Z \le z) = \int_{-\infty}^{z} f(y; 0, 1)\,dy = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}\,dy
 We therefore list the tables for std. norm. dist. only. Although the standard norm. dist.
does not occur too often, it is very commonly used as a reference distribution. It is
straightforward to convert a nonstandard normal dist. to and from standard dist.
Using Gaussian Tables
-3 -2 -
 + +2 +3
Area under the curve on each side of zero is 0.5. The curve is symmetric, so the total area is 1.

[Figure: standard normal curve with region A between 0 and 0.82, region B everything to the left of 0.82, and region C everything to the right of 0.82]

Example: if z = 0.82 
A = area under the curve for [0, 0.82]: 0.294
B = total area for [−∞, 0.82] = 0.5 + 0.294 = 0.794 = Φ(0.82); this value is the probability that z < 0.82
C = total area for [0.82, ∞] = 1 − B = 0.5 − A = 0.206 = 1 − Φ(0.82); this value is the probability that z > 0.82
Using Gaussian Tables
-3 -2 -
 + +2 +3
 In some books, the standard cdf Φ(z) is tabulated directly, rather than the area from 0 to z
P(−0.44 < Z < 0.82) = Φ(0.82) − Φ(−0.44) = 0.7939 − 0.3300 = 0.4639
[Figure: standard normal curve with the area between −0.44 and 0.82 shaded]
-3 -2 -
Yeah…but what if my μ and σ
are not 0 and 1 …?
 + +2 +3
 It is very simple to convert a nonstandard normal pdf to a standard pdf. The
procedure, which simply involves the following substitution, is called
standardization.
 If X has a normal distribution with mean μ and std. dev. σ (denoted as N(μ, σ)),
then the variable obtained by (x − μ)/σ has a standard normal distribution N(0, 1):

Z = \frac{X - \mu}{\sigma}

The z-value obtained from this normalization is often called the z-score. It represents
the number of std. dev.'s the value "x" is away from its mean.
 Once the substitution is made, the probabilities remain identical:
• Recall our example of the class weight distribution being N(170, 20), for which we
computed P(185 < X < 200) as 0.1598. You can also reach this solution by

z_{200} = \frac{x - \mu}{\sigma} = \frac{200 - 170}{20} = 1.5 \qquad z_{185} = \frac{x - \mu}{\sigma} = \frac{185 - 170}{20} = 0.75

Then P(185 < X < 200) = P(0.75 < Z < 1.5) = Φ(1.5) − Φ(0.75) = 0.9332 − 0.7734 = 0.1598
(200 is 1.5σ away from the mean 170)
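Standardization is exactly what happens under the hood; a sketch showing that the nonstandard and standardized computations agree:

```python
from scipy.stats import norm

mu, sigma = 170, 20

# z-scores: number of standard deviations from the mean
z185 = (185 - mu) / sigma   # 0.75
z200 = (200 - mu) / sigma   # 1.5

# Nonstandard computation ...
p_direct = norm.cdf(200, loc=mu, scale=sigma) - norm.cdf(185, loc=mu, scale=sigma)
# ... equals the standard-normal computation Phi(1.5) - Phi(0.75)
p_std = norm.cdf(z200) - norm.cdf(z185)

print(p_direct, p_std)   # both ≈ 0.1598
```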
Normal Distribution in
Discrete Populations
-3 -2 -
 + +2 +3
 Oftentimes we use the normal dist. (a continuous dist.) even for discrete
outcomes. Is this kosher…?
 Say we want to find out the probability of a Rowan student taking 14 credits or
less. We can reasonably assume that the distribution is normal, with a mean of, say, 12
(the average number of credits taken on campus) and a std. dev. of 2.5.
 We may be tempted to standardize this by (14 − 12)/2.5… but wait… the Gaussian is a
continuous distribution, whereas the outcomes of this experiment are always
discrete (unless you are taking 12.238423 credits!). But… let's give it a try:
• There are two ways to compute this, one more precise than the other…
Continuity Correction
-3 -2 -
 + +2 +3
Without any correction: P(X ≤ 14) ≈ P[Z ≤ (14 − 12)/2.5] = P(Z ≤ 0.8) = 0.7881 (Gaussian-estimated probability)

Continuity correction: since the discrete outcome 14 really covers the interval up to 14.5, a better estimate of the true probability is
P(X* ≤ 14.5) = P[Z ≤ (14.5 − 12)/2.5] = P(Z ≤ 1.0) = 0.8413

[Figure: normal curve superimposed on the discrete credit counts, with interval boundaries at 12.5 and 14.5]
Normal Approximation
to Binomial Dist.
-3 -2 -
 + +2 +3
 We note that the previous example is in fact a binomial distribution:
 For a population of n students, we describe the Bernoulli experiment as the
number of courses each student takes and define the r.v. X = the number of
students taking 14 credits.
 In a similar thought process, the number of credits taken by any student can be
defined as the number of successes and hence modeled as a binomial variable.
 The exact calculation for computing P(X < 14) was in fact made using the
binomial distribution, where the probability of success is 0.5.
 Of course one can argue that as we move away from the mean, p may not be 0.5. After
all, the probability of any student taking 12 or 13 credits may be 0.5, but taking 18 credits
would certainly not be.
In any case, however, if n·p > 5 (better yet, if n·p > 10) the
binomial distribution becomes mostly symmetric, and can be
approximated by the normal distribution with μ = np and σ = √(npq):
bin(n, p) ≈ N(np, √(npq))
Example
-3 -2 -
 + +2 +3
 At a particular small college the pass rate of Clinic Consultant is 72%. If
500 students enroll in a semester, determine the probability that at most 375
students pass.
 Note that there is no mention of the distribution being normal. However, this is a
binomial distribution with n·p = 500 × 0.72 >> 10, hence it can be approximated by a
normal distribution. Hence,

\mu = np = 500(0.72) = 360
\sigma = \sqrt{npq} = \sqrt{500(0.72)(0.28)} \approx 10

P(X \le 375) \approx \Phi\!\left(\frac{375.5 - 360}{10}\right) = \Phi(1.55) = 0.9394
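The approximation can be compared against the exact binomial answer; a sketch for this pass-rate example (n = 500, p = 0.72):

```python
import math
from scipy.stats import binom, norm

n, p = 500, 0.72
mu = n * p                           # 360
sigma = math.sqrt(n * p * (1 - p))   # ≈ 10.04

# Normal approximation with the continuity correction
approx = norm.cdf((375.5 - mu) / sigma)
# Exact binomial probability P(X <= 375)
exact = binom.cdf(375, n, p)

print(approx, exact)   # both close to 0.94
```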
Probability Plots
-3 -2 -
 + +2 +3
 In most disciplines of science and engineering, it is common practice to
assume that the data come from a Gaussian (normal) distribution,
which is in part supported by the Central Limit Theorem (we'll see this
later).
 But are the data really Gaussian? How can we tell?
 There are many ways; the most rigorous ones involve certain hypothesis tests
(which we will study later)
 A simpler way is to use normal probability plots, which show how well the
percentiles (or more accurately, the quantiles) of the observed data match the
expected percentiles of truly normally distributed data.
• Quantiles are points taken at regular intervals from the cumulative distribution
function of a random variable
-3 -2 -
 + +2 +3
Probability Plots
in Matlab
Normplot - Normal probability plot
h = normplot(X) displays a normal probability plot of the data in X. For matrix X, normplot displays a line for
each column of X. h is a handle to the plotted lines.
The plot shows the sample data with the plot symbol '+'. Superimposed on the plot is a line joining the
first and third quartiles (25th and 75th percentiles) of each column of X (a robust linear fit of the sample order
statistics). This line is extrapolated out to the ends of the sample to help evaluate the linearity of the data.
The purpose of a normal probability plot is to graphically assess whether the data in X could come from a
normal distribution. If the data are normal, the plot will be linear; other distribution types will introduce
curvature in the plot. normplot uses midpoint probability plotting positions. Use probplot when the data
include censored observations.
Examples
Generate a normal sample and a normal probability plot of the data.
x = normrnd(10,1,25,1);
normplot(x)
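For readers not using Matlab, SciPy offers an analogous plot via scipy.stats.probplot; a sketch (with fit=True it also returns the correlation r of the quantile points to the fitted line, so r near 1 suggests normality):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=10, scale=1, size=25)   # analogous to normrnd(10,1,25,1)

# Quantile pairs plus the least-squares line through them
(osm, osr), (slope, intercept, r) = stats.probplot(x, dist="norm", fit=True)
print(r)   # close to 1 for normal data

# To draw the plot itself (requires matplotlib):
# import matplotlib.pyplot as plt
# stats.probplot(x, dist="norm", plot=plt); plt.show()
```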
Probability Plots
-3 -2 -
 + +2 +3
y = rand(100,1);
normplot(y)
x = normrnd(10,1,100,1);
normplot(x)
[Figure: two normal probability plots. The normrnd(10,1,100,1) sample falls along a straight line over data values roughly 7 to 13, indicating normality; the rand(100,1) sample, with data values between 0 and 1, bends away from the line at both tails, indicating a non-normal (uniform) distribution.]
Other Continuous
(less worthy) Distributions
-3 -2 -
 + +2 +3
 The normal distribution is a symmetric distribution. Not all phenomena lend
themselves to symmetric distributions; for such phenomena that generate skewed
data, there are non-symmetric distributions as well.
 These include
 Gamma distribution
 Exponential distribution
 Chi-squared distribution  coming soon to this classroom!!!
 Beta distribution
 Log-normal distribution
 Weibull distribution
Homework
-3 -2 -
 + +2 +3
 Problems from Section 4.5
 8, 14, 20, 24
 Problems from Section 4.10
 2, 4, 8