Ch0

PowerPoint to accompany
Introduction to MATLAB 7
for Engineers
William J. Palm III
Chapter 7
Probability and Statistics
Copyright © 2005. The McGraw-Hill Companies, Inc. Permission required for reproduction or display.
Agenda
Histograms – distribution of data
Probability
Expected Value and Variance
Random number generation
Normal Distribution Curve and its meaning
Intro to Monte Carlo Methods
An example of a histogram: test scores for 20 students.
Figure 7.1–1 (figure labels: number in each bin; bins)
"Absolute Frequency" Histogram Plot
Shows the number of outcomes in each bin.
% Thread breaking strength data for 20 tests.
y = [92,94,93,96,93,94,95,96,91,93,95,95,95,92,93,94,...
    91,94,92,93];
% The six possible outcomes are 91,92,93,94,95,96.
x = 91:96;
hist(y,x), axis([90 97 0 6])
ylabel('Frequency')
xlabel('Thread Strength (N)')
title('Absolute Frequency Histogram for 20 Tests')
This creates the next figure.
Histograms for 20 tests of thread strength. Figure 7.1–2
Absolute frequency histogram for 100 thread tests.
Figure 7.1–3. This was created by the program on page 421.
Variations of hist()
hist(y) plots the data vector y in a histogram with 10 bins.
hist(y,n) plots the data vector y in a histogram with n bins.
hist(y,x) plots the data vector y in a histogram whose bin centers are given by the vector x.
hist() can also just sort your data into bins for further processing:
tests = 100;
y = [91*ones(1,13),92*ones(1,15),93*ones(1,22),...
    94*ones(1,19),95*ones(1,17),96*ones(1,14)];
x = 91:96;
% Sort the data into bins; z holds the count for each bin center in x.
[z,x] = hist(y,x);
% Divide the counts by the number of tests to get relative frequencies.
bar(x,z/tests)
ylabel('Relative Frequency')
xlabel('Thread Strength (N)')
title('Relative Frequency Histogram for 100 Tests')
This creates a relative frequency version of the previous figure: the same
shape, with each bar height divided by the number of tests.
Histogram functions Table 7.1–1
Command             Description
bar(x,y)            Creates a bar chart of y versus x.
hist(y)             Aggregates the data in the vector y into 10 bins evenly spaced between the minimum and maximum values in y.
hist(y,n)           Aggregates the data in the vector y into n bins evenly spaced between the minimum and maximum values in y.
hist(y,x)           Aggregates the data in the vector y into bins whose center locations are specified by the vector x. The bin widths are the distances between the centers.
[z,x] = hist(y)     Same as hist(y) but returns two vectors z and x that contain the frequency count and the bin locations.
[z,x] = hist(y,n)   Same as hist(y,n) but returns two vectors z and x that contain the frequency count and the bin locations.
[z,x] = hist(y,x)   Same as hist(y,x) but returns two vectors z and x that contain the frequency count and the bin locations. The returned vector x is the same as the user-supplied vector x.
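As a small illustration of the output forms in the table, the sketch below reuses the 100-test thread data from the earlier slide and calls hist() with output arguments, so no plot is drawn. The variable names centers, total, and rel are chosen here for illustration only.

```matlab
% Thread-strength data from the 100-test example.
y = [91*ones(1,13),92*ones(1,15),93*ones(1,22),...
     94*ones(1,19),95*ones(1,17),96*ones(1,14)];
centers = 91:96;              % bin centers (illustrative name)
[z,x] = hist(y,centers);      % counts only; no figure is produced
total = sum(z);               % each data point lands in exactly one bin
rel = z/total;                % relative frequencies, which sum to 1
disp([x' z' rel'])            % columns: bin center, count, relative frequency
```

Because the counts come back as an ordinary vector, they can be scaled, summed, or passed to bar() as needed.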
Normal Distribution Curve
The basic shape of the normal distribution curve.
Figure 7.2–3
More? See pages 429 – 431.
The effect on the normal distribution curve of increasing σ.
For this case μ = 10, and the three curves correspond to
σ = 1, σ = 2, and σ = 3. Figure 7.2–4
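A figure like this one can be approximated with a short script. The sketch below assumes the standard normal density p(x) = exp(-(x-μ)²/(2σ²))/(σ√(2π)); the plotting range, line styles, and variable names are illustrative choices, not taken from the original figure.

```matlab
mu = 10;                          % mean used in the figure
sigmas = [1 2 3];                 % the three standard deviations
x = linspace(0,20,401);           % plotting range (assumed)
p = zeros(numel(sigmas),numel(x));
for k = 1:numel(sigmas)
    s = sigmas(k);
    % Normal probability density with mean mu and standard deviation s.
    p(k,:) = exp(-(x-mu).^2/(2*s^2))/(s*sqrt(2*pi));
end
plot(x,p(1,:),x,p(2,:),'--',x,p(3,:),':')
xlabel('x'), ylabel('p(x)')
legend('\sigma = 1','\sigma = 2','\sigma = 3')
peaks = max(p,[],2)';             % peak height is 1/(sigma*sqrt(2*pi))
area1 = trapz(x,p(1,:));          % area under the sigma = 1 curve, close to 1
```

Increasing σ lowers and widens the curve, while the area under it stays at 1.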
Probability interpretation of the μ ± σ limits. Figure 7.2–5
Probability interpretation of the μ ± 2σ limits.
Figure 7.2–5 (continued)
More? See pages 431-432.
Error Function
The probability that the random variable x is no less
than a and no greater than b is written as P(a ≤ x ≤ b). It
can be computed as follows:
$$P(a \le x \le b) = \frac{1}{2}\left[\operatorname{erf}\left(\frac{b-\mu}{\sigma\sqrt{2}}\right) - \operatorname{erf}\left(\frac{a-\mu}{\sigma\sqrt{2}}\right)\right]$$
(7.2–5)
In MATLAB, the function erf() can be used for
the above calculation.
More? See pages 434–435.
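A minimal sketch of that calculation follows. The anonymous function P and the values of mu and sigma are illustrative assumptions; only erf() is a built-in.

```matlab
% P(a <= x <= b) for a normal variable with mean mu and standard deviation sigma:
% one half of [erf((b-mu)/(sigma*sqrt(2))) - erf((a-mu)/(sigma*sqrt(2)))].
P = @(a,b,mu,sigma) 0.5*(erf((b-mu)/(sigma*sqrt(2))) - erf((a-mu)/(sigma*sqrt(2))));

mu = 10; sigma = 2;                      % example values (assumed)
P1 = P(mu-sigma,  mu+sigma,  mu,sigma);  % probability within one sigma, about 0.6827
P2 = P(mu-2*sigma,mu+2*sigma,mu,sigma);  % probability within two sigma, about 0.9545
fprintf('P(mu +/- sigma)   = %.4f\n',P1)
fprintf('P(mu +/- 2*sigma) = %.4f\n',P2)
```

The two results match the roughly 68 percent and 95 percent probabilities associated with the μ ± σ and μ ± 2σ limits, whatever values of mu and sigma are chosen.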
Random Number Generation

Two basic PDFs are used throughout the technical computing world:
Normal: Gaussian "bell curve" with mean and standard deviation, as discussed
Uniform: flat distribution, all values equally likely to occur
Sums and Differences of Random Variables
It can be proved that the mean of the sum (or difference) of
two independent normally distributed random variables
equals the sum (or difference) of their means, but the
variance is always the sum of the two variances. That is, if x
and y are normally distributed with means μx and μy and
variances σx² and σy², and if u = x + y and v = x − y, then
$$\mu_u = \mu_x + \mu_y$$
(7.2–6)
$$\mu_v = \mu_x - \mu_y$$
(7.2–7)
$$\sigma_u^2 = \sigma_v^2 = \sigma_x^2 + \sigma_y^2$$
(7.2–8)
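These relations can be checked numerically with a small simulation, which is also a first taste of a Monte Carlo experiment. The sketch below writes v for the difference x − y; the means, standard deviations, and sample size are assumed values chosen for illustration.

```matlab
n  = 1e5;                     % sample size (assumed)
mx = 4;  sx = 2;              % mean and standard deviation of x (assumed)
my = 1;  sy = 3;              % mean and standard deviation of y (assumed)
x = sx*randn(1,n) + mx;       % normally distributed x
y = sy*randn(1,n) + my;       % normally distributed y, drawn independently of x
u = x + y;                    % sum
v = x - y;                    % difference
fprintf('mean(u) = %7.3f   theory %7.3f\n',mean(u),mx+my)
fprintf('mean(v) = %7.3f   theory %7.3f\n',mean(v),mx-my)
fprintf('var(u)  = %7.3f   theory %7.3f\n',var(u),sx^2+sy^2)
fprintf('var(v)  = %7.3f   theory %7.3f\n',var(v),sx^2+sy^2)
```

The sample values differ slightly from the theoretical ones on every run, and the gap shrinks as n grows.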
Random number functions Table 7.3–1
Command                        Description
rand                           Generates a single uniformly distributed random number between 0 and 1.
rand(n)                        Generates an n × n matrix containing uniformly distributed random numbers between 0 and 1.
rand(m,n)                      Generates an m × n matrix containing uniformly distributed random numbers between 0 and 1.
s = rand('state')              Returns a 35-element vector s containing the current state of the uniformly distributed generator.
rand('state',s)                Sets the state of the uniformly distributed generator to s.
rand('state',0)                Resets the uniformly distributed generator to its initial state.
rand('state',j)                Resets the uniformly distributed generator to state j, for integer j.
rand('state',sum(100*clock))   Resets the uniformly distributed generator to a different state each time it is executed.
Table 7.3–1 (continued)
randn                          Generates a single normally distributed random number having a mean of 0 and a standard deviation of 1.
randn(n)                       Generates an n × n matrix containing normally distributed random numbers having a mean of 0 and a standard deviation of 1.
randn(m,n)                     Generates an m × n matrix containing normally distributed random numbers having a mean of 0 and a standard deviation of 1.
s = randn('state')             Like rand('state') but for the normally distributed generator.
randn('state',s)               Like rand('state',s) but for the normally distributed generator.
randn('state',0)               Like rand('state',0) but for the normally distributed generator.
randn('state',j)               Like rand('state',j) but for the normally distributed generator.
randn('state',sum(100*clock))  Like rand('state',sum(100*clock)) but for the normally distributed generator.
randperm(n)                    Generates a random permutation of the integers from 1 to n.
Repeating Randomness (useful for simulation)
The following session shows how to obtain the same sequence
every time rand is called.
>>rand('state',0)
>>rand
ans =
0.9501
>>rand
ans =
0.2311
>>rand('state',0)
>>rand
ans =
0.9501
>>rand
ans =
0.2311
You need not start with the initial state in order to
generate the same sequence. To show this, continue
the above session as follows.
>>s = rand('state');
>>rand('state',s)
>>rand
ans =
0.6068
>>rand('state',s)
>>rand
ans =
0.6068
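The 'state' syntax above belongs to MATLAB 7. In releases R2011a and later, the rng function is the recommended way to do the same thing. The sketch below assumes such a release; the seed value 0 and the variable names are illustrative, and the numbers produced will not match the session above because a different generator is used.

```matlab
rng(0);              % seed the generator (assumes R2011a or later)
a = rand(1,3);       % three uniform random numbers
rng(0);              % reseed with the same value
b = rand(1,3);       % the same three numbers again
same1 = isequal(a,b);

s = rng;             % save the current generator settings
c = rand;
rng(s);              % restore the saved settings
d = rand;            % repeats the value stored in c
same2 = (c == d);
```

As with the 'state' session, the sequence can be repeated from any saved point, not only from the initial seed.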
Generating a Uniform Distribution
The general formula for generating a uniformly distributed
random number y in the interval [a, b] is
y = (b - a) x + a
(7.3–1)
where x is a random number uniformly distributed in the
interval [0, 1]. For example, to generate a vector y
containing 1000 uniformly distributed random numbers in
the interval [2, 10], you type y = 8*rand(1,1000) + 2.
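A quick check of Equation (7.3–1) is sketched below. The variable names ymin, ymax, and ymean are illustrative; the sample mean should land near the midpoint (a + b)/2 = 6, although it varies from run to run.

```matlab
a = 2;  b = 10;                  % interval endpoints from the example
y = (b - a)*rand(1,1000) + a;    % 1000 uniform random numbers in [2, 10]
ymin  = min(y);                  % never below a
ymax  = max(y);                  % never above b
ymean = mean(y);                 % close to (a + b)/2
fprintf('min = %.3f, max = %.3f, mean = %.3f\n',ymin,ymax,ymean)
```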
Generating a Normal Distribution
If x is a random number with a mean of 0 and a
standard deviation of 1, use the following equation to
generate a new random number y having a standard
deviation of σ and a mean of μ.
y = σx + μ
(7.3–2)
For example, to generate a vector y containing 2000
random numbers normally distributed with a mean of 5
and a standard deviation of 3, you type
y = 3*randn(1,2000) + 5.
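The sketch below applies Equation (7.3–2) and compares the sample statistics with the requested values. The names m, s, ybar, and ystd, and the choice of 20 histogram bins, are illustrative assumptions.

```matlab
m = 5;  s = 3;                 % desired mean and standard deviation
y = s*randn(1,2000) + m;       % scale and shift standard normal numbers
ybar = mean(y);                % sample mean, close to 5
ystd = std(y);                 % sample standard deviation, close to 3
fprintf('mean = %.3f, std = %.3f\n',ybar,ystd)
hist(y,20)                     % histogram with 20 bins shows the bell shape
xlabel('y'), ylabel('Absolute Frequency')
```

The sample mean and standard deviation do not equal 5 and 3 exactly; they approach those values as the number of samples increases.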
Statistical analysis and manufacturing tolerances: Example
7.3-2. Dimensions of a triangular cut. Figure 7.3–2
Scaled histogram of the angle θ. Figure 7.3–3
More? See pages 443 – 444.
Grading on the (Bell) curve
What is it? Here’s the way I understand the “bell curve”: make the
mean a C, then the mean plus/minus a half standard deviation
would be the C-/C/C+ scores, one more standard deviation out
would give the B’s and D’s, and the tails would give the A’s and F’s.
This could be tweaked in any number of ways—change the mean,
fatten or slim the distribution.
I don’t know if this is used by any professors anymore (in small
classes, at least).
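One way to read that description is as a set of cutoffs on the standard score z = (score − mean)/standard deviation. The sketch below is only an interpretation of the scheme: the scores are hypothetical, the cutoffs at ±0.5 and ±1.5 are inferred from the text, and the C−/C/C+ subdivision is omitted.

```matlab
scores = [55 62 68 70 71 73 75 78 84 95];   % hypothetical class scores
z = (scores - mean(scores))/std(scores);    % standard scores
grades = repmat('C',size(scores));          % within half a standard deviation of the mean
grades(z >  0.5) = 'B';                     % next standard deviation above
grades(z >  1.5) = 'A';                     % upper tail
grades(z < -0.5) = 'D';                     % next standard deviation below
grades(z < -1.5) = 'F';                     % lower tail
for k = 1:numel(scores)
    fprintf('%3d  z = %6.2f  %c\n',scores(k),z(k),grades(k))
end
```

Moving the cutoffs or shifting the mean corresponds to the tweaks mentioned in the description.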
Pros: grades end up with a very predictable distribution
Cons: ruthless, students competing against classmates
Use when: standardized tests in which only a certain number of
students can pass; large classes or multiple sections in which there
must be a fixed distribution