Chapter 4: Elements of Statistics

4-1 Introduction
    The Sampling Problem
    Unbiased Estimators
4-2&3 Sampling Theory -- The Sample Mean and Variance
    Sampling Theorem
4-4 Sampling Distributions and Confidence Intervals
    Student's t-Distribution
4-5 Hypothesis Testing
4-6 Curve Fitting and Linear Regression
4-7 Correlation Between Two Sets of Data
Concepts
• Sample means and sample variance relation to pdf mean and variance
• Biased estimates of means and variances
• How close are the sample values to the underlying pdf values?
• Practical curve fitting, using an NTC resistor to measure temperature.
Statistics Definition: The science of assembling, classifying, tabulating, and analyzing data or
facts:
Descriptive statistics – the collecting, grouping, and presenting of data in a way that can be easily
understood or assimilated.
Inductive statistics or statistical inference – the use of data to draw conclusions about, or estimate
parameters of, the environment from which the data came.
Notes and figures are based on or taken from materials in the course textbook: Probabilistic Methods of Signal and System
Analysis (3rd ed.) by George R. Cooper and Clare D. McGillem; Oxford Press, 1999. ISBN: 0-19-512354-9.
B.J. Bazuin, Spring 2015
ECE 3800
Sampling Theory – The Sample Mean
Sample Mean

$$\hat{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$

where the $X_i$ are random variables with a pdf.

$$E[\hat{X}] = \frac{1}{n} \sum_{i=1}^{n} E[X_i] = \frac{1}{n} \sum_{i=1}^{n} \bar{X} = \bar{X}$$
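Variance of the sample mean (for independent samples)

$$\mathrm{Var}(\hat{X}) = E[\hat{X}^2] - \bar{X}^2 = \frac{1}{n^2}\left[ n\,\overline{X^2} + (n^2 - n)\,\bar{X}^2 \right] - \bar{X}^2 = \frac{\overline{X^2} - \bar{X}^2}{n} = \frac{\sigma^2}{n}$$

where $\sigma^2$ is the true variance of the random variable, X.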
Destructive testing or sampling without replacement in a finite population of size N results in another
expression:

$$\mathrm{Var}(\hat{X}) = \frac{\sigma^2}{n} \left( \frac{N - n}{N - 1} \right)$$
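The two expressions for the variance of the sample mean can be compared numerically. The following is a minimal Python sketch; the function name `var_of_sample_mean` and the numbers used (a true variance of 100, n = 25, N = 100) are illustrative assumptions, not values from the notes.

```python
def var_of_sample_mean(sigma2, n, population_size=None):
    """Variance of the sample mean for n samples with true variance sigma2.

    With population_size=None the population is treated as very large.
    Otherwise the finite-population factor (N - n) / (N - 1) is applied,
    which models sampling without replacement.
    """
    if population_size is None:
        return sigma2 / n
    N = population_size
    return (sigma2 / n) * (N - n) / (N - 1)


# Hypothetical numbers: sigma^2 = 100, n = 25, finite population of N = 100.
large_population = var_of_sample_mean(100.0, 25)
finite_population = var_of_sample_mean(100.0, 25, population_size=100)
print(large_population, finite_population)
```

Sampling without replacement always gives the smaller variance, and the variance drops to zero when the whole population is sampled (n = N).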
Sampling Theory – The Sample Variance

$$S^2 = \frac{1}{n} \sum_{i=1}^{n} \left( X_i - \hat{X} \right)^2$$

$$E[S^2] = \frac{n-1}{n}\,\sigma^2$$

where $\sigma^2$ is the true variance of the random variable.
To create an unbiased estimator, scale by the biasing factor to compute:

$$\tilde{S}^2 = \frac{n}{n-1}\,S^2 = \frac{n}{n-1} \cdot \frac{1}{n} \sum_{i=1}^{n} \left( X_i - \hat{X} \right)^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( X_i - \hat{X} \right)^2$$
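The bias factor (n-1)/n can be checked exactly on a tiny example. The sketch below is an illustration under stated assumptions: the population {1, 2, 3} and the sample size n = 2 are made up, and every ordered sample drawn with replacement is treated as equally likely.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3]   # hypothetical population, each value equally likely
n = 2                    # hypothetical sample size

# True mean and variance of the population.
mu = Fraction(sum(population), len(population))
sigma2 = sum((Fraction(x) - mu) ** 2 for x in population) / len(population)


def biased_variance(sample):
    """S^2: sum of squared deviations from the sample mean, divided by n."""
    m = Fraction(sum(sample), len(sample))
    return sum((Fraction(x) - m) ** 2 for x in sample) / len(sample)


# Average S^2 over every ordered sample of size n drawn with replacement.
samples = list(product(population, repeat=n))
expected_biased = sum(biased_variance(s) for s in samples) / len(samples)
expected_unbiased = expected_biased * Fraction(n, n - 1)

print(sigma2, expected_biased, expected_unbiased)
```

Exact fractions are used so that the comparison with (n-1)/n times the true variance is not blurred by floating-point rounding.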
When the population is not large, the expected value of the biased estimate becomes

$$E[S^2] = \frac{N}{N-1} \cdot \frac{n-1}{n}\,\sigma^2$$
and removing the bias results in

$$\tilde{S}^2 = \frac{N-1}{N} \cdot \frac{n}{n-1}\,S^2, \qquad E[\tilde{S}^2] = \frac{N-1}{N} \cdot \frac{n}{n-1}\,E[S^2] = \sigma^2$$
Additional notes: MATLAB and MS Excel
Simulation and statistical software packages allow for either biased or unbiased computations. In
MS Excel there are two distinct functions stdev and stdevp.


stdev uses (n-1) - http://office.microsoft.com/en-us/excel-help/stdev-function-HP010335660.aspx
stdevp uses (n) - http://office.microsoft.com/en-us/excel-help/stdevp-HP005209281.aspx
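In MATLAB, there is an additional flag associated with the std function.

$$\mathrm{std}(X) = \sqrt{\mathrm{var}(X)} = \sqrt{ \frac{1}{n-1} \sum_{j=1}^{n} \left( x_j - \mu \right)^2 }$$, flag implied as 0

$$\mathrm{std}(X,1) = \sqrt{\mathrm{var}(X,1)} = \sqrt{ \frac{1}{n} \sum_{j=1}^{n} \left( x_j - \mu \right)^2 }$$, flag specified as 1

where $\mu$ is the mean of the $x_j$.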
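The same (n-1) versus (n) choice appears in Python's standard library, which makes a convenient cross-check. This is a sketch with a made-up data set; `statistics.stdev` and `statistics.pstdev` are the standard-library functions, while the variable names are illustrative.

```python
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # hypothetical sample
n = len(data)

# Divides by (n - 1), like Excel stdev or MATLAB std(X).
s_n_minus_1 = statistics.stdev(data)

# Divides by n, like Excel stdevp or MATLAB std(X,1).
s_n = statistics.pstdev(data)

print(s_n_minus_1, s_n)
```

The two results differ only by the factor sqrt(n/(n-1)), so the gap closes as the sample size grows.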
Variance of the variance
As before, the variance of the variance can be computed. It is given by

$$\mathrm{Var}(S^2) = \frac{\mu_4 - \sigma^4}{n}$$

where $\mu_4$ is the fourth central moment of the population and is defined by

$$\mu_4 = E\left[ \left( X - \bar{X} \right)^4 \right]$$

Another proof for extra credit …
For the unbiased variance, the result is

$$\mathrm{Var}(\tilde{S}^2) = \frac{n^2}{(n-1)^2} \cdot \frac{\mu_4 - \sigma^4}{n} = \frac{n\,\mu_4 - n\,\sigma^4}{(n-1)^2}$$
4-4
Sampling Distribution and Confidence Intervals
Now that we have developed sample values, what are they good for …
What is the probability that our estimates are within specified bounds … by measuring samples,
can you prove that what you built or did is what was specified or promised?
When in doubt … assume Gaussian. Then, the normalized random variable becomes
(the sample mean with the mean removed, divided by the standard deviation of the sample mean)

$$Z = \frac{\hat{X} - \bar{X}}{\sigma / \sqrt{n}}$$

If the true population variance is not known, it can be replaced by the sample variance,

$$T = \frac{\hat{X} - \bar{X}}{S / \sqrt{n-1}} = \frac{\hat{X} - \bar{X}}{\tilde{S} / \sqrt{n}}$$

but this is actually a different distribution defined as a Student's t distribution with n-1 degrees
of freedom.
The Student's t probability density function (letting $v = n - 1$, the degrees of freedom) is defined
as

$$f_T(t) = \frac{\Gamma\left( \frac{v+1}{2} \right)}{\sqrt{\pi v}\;\Gamma\left( \frac{v}{2} \right)} \left( 1 + \frac{t^2}{v} \right)^{-\frac{v+1}{2}}$$

where $\Gamma(\cdot)$ is the gamma function.
The gamma function can be computed as

$$\Gamma(k+1) = k\,\Gamma(k)$$ for any k
$$\Gamma(k+1) = k!$$ for k an integer
and
$$\Gamma\left( \tfrac{1}{2} \right) = \sqrt{\pi}$$
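The density can be evaluated directly from this definition. The following is a minimal Python sketch using `math.gamma` from the standard library; the function names `student_t_pdf` and `gaussian_pdf` are illustrative and are not the course's students_t.m.

```python
import math


def student_t_pdf(t, v):
    """Student's t density with v degrees of freedom, straight from the definition."""
    coeff = math.gamma((v + 1) / 2) / (math.sqrt(math.pi * v) * math.gamma(v / 2))
    return coeff * (1.0 + t * t / v) ** (-(v + 1) / 2)


def gaussian_pdf(z):
    """Zero-mean, unit-variance Gaussian density for comparison."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)


# Compare the peak and a tail value for a few degrees of freedom.
for v in (1, 2, 8):
    print(v, student_t_pdf(0.0, v), student_t_pdf(3.0, v))
print("Gaussian", gaussian_pdf(0.0), gaussian_pdf(3.0))
```

For small v the peak is lower and the tails are heavier than the Gaussian; as v grows the two densities approach each other.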

http://en.wikipedia.org/wiki/Student's_t-distribution
Student's distribution arises when (as in nearly all practical statistical work) the population
standard deviation is unknown and has to be estimated from the data.
Textbook problems treating the standard deviation as if it were known are of two kinds:
(1)
those in which the sample size is so large that one may treat a data-based estimate
of the variance as if it were certain, and
(2)
those that illustrate mathematical reasoning, in which the problem of estimating
the standard deviation is temporarily ignored because that is not the point that the
author or instructor is then explaining.
Note that: The distribution depends on ν, but not μ or σ; the lack of dependence on μ and σ is
what makes the t-distribution important in both theory and practice.
Comparing the density functions: Student’s t and Gaussian
[Figure: Student's t and Gaussian densities. Density function plotted from -4 to 4 for the Gaussian and for T with v = 1, v = 2, and v = 8.]
See Fig_4_2.m and function students_t.m
Confidence Intervals and the Gaussian and t distributions
The sample mean is a point-estimate (assigns a single value).
An alternative to a point-estimate is an interval-estimate where the parameter being estimated is
declared to lie within a certain interval with a certain probability. The interval estimate is the
confidence interval.
We can then define a q% confidence interval as the interval in which the estimate will lie with a
probability of q/100. The limits of the interval are defined as the confidence limits and q is also
defined to be the confidence level.
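Thus we are interested in

$$\bar{X} - \frac{k\,\sigma}{\sqrt{n}} \le \hat{X} \le \bar{X} + \frac{k\,\sigma}{\sqrt{n}}$$

or

$$\bar{X} - \frac{k\,\sigma}{\sqrt{n}} \le \hat{X}$$

where k is a constant (notice that it multiplies the "measurement's standard
deviation", $\sigma_{\hat{X}} = \sigma / \sqrt{n}$). The confidence interval is then defined by

$$q\% = 100 \int_{\bar{X} - k\sigma_{\hat{X}}}^{\bar{X} + k\sigma_{\hat{X}}} f_{\hat{X}}(x)\,dx = 100 \left[ F_{\hat{X}}\left( \bar{X} + k\sigma_{\hat{X}} \right) - F_{\hat{X}}\left( \bar{X} - k\sigma_{\hat{X}} \right) \right]$$

or

$$q\% = 100 \int_{\bar{X} - k\sigma_{\hat{X}}}^{\infty} f_{\hat{X}}(x)\,dx = 100 \left[ 1 - F_{\hat{X}}\left( \bar{X} - k\sigma_{\hat{X}} \right) \right]$$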
When the sample size is sufficient to meet the Central Limit Theorem, a Gaussian (normal)
distribution can be used.

$$Z = \frac{\hat{X} - \bar{X}}{\sigma / \sqrt{n}}$$

$$q = \Phi(z_c) - \Phi(-z_c)$$ for $-z_c \le z \le z_c$
or
$$q = \Phi(z_c)$$ for $z \le z_c$

Gaussian distribution function

$$F_X(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{\left( v - \bar{X} \right)^2}{2\sigma^2} \right) dv$$
To find the confidence intervals:

Two Tail Bounds       Confidence Interval (in %)    k or z_c (-z_c <= z <= z_c)
0.005% to 99.995%     99.99%                        3.8906
0.05% to 99.95%       99.9%                         3.2905
0.5% to 99.5%         99%                           2.5758
2.5% to 97.5%         95%                           1.9600
5% to 95%             90%                           1.6449
10% to 90%            80%                           1.2816
25% to 75%            50%                           0.6745
(1) Determine the percentage value required for the bound (e.g. 75% for a 50% 2-sided interval)
(2) Find that value in the Normal table (unit variance).
The value of k or z c is just the row plus column value that would create the probability!
$$Z = \frac{\hat{X} - \bar{X}}{\sigma / \sqrt{n}}$$

$$q = \Phi(z_c) - \Phi(-z_c)$$ for $-z_c \le z \le z_c$
$$q = \Phi(z_c)$$ for $z \le z_c$
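Instead of reading the Normal table, the value of k (or z_c) can be obtained by inverting the Gaussian distribution function. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper names `z_two_sided` and `z_one_sided` are illustrative.

```python
from statistics import NormalDist

UNIT_NORMAL = NormalDist()   # zero mean, unit variance


def z_two_sided(q):
    """z_c such that Phi(z_c) - Phi(-z_c) = q, with q given as a fraction."""
    return UNIT_NORMAL.inv_cdf(0.5 + q / 2.0)


def z_one_sided(q):
    """z_c such that Phi(z_c) = q, with q given as a fraction."""
    return UNIT_NORMAL.inv_cdf(q)


for q in (0.50, 0.80, 0.90, 0.95, 0.99):
    print(f"{100 * q:5.1f}%  two-sided {z_two_sided(q):.4f}  one-sided {z_one_sided(q):.4f}")
```

A two-sided interval at confidence q looks up the bound 0.5 + q/2, which is step (1) of the table procedure (75% for a 50% two-sided interval).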
[Figure: Gaussian q values. Gaussian density f(x) plotted from -5 to 5 with the two-sided regions marked: q = 50.00%, k = 0.674; q = 90.00%, k = 1.645; q = 95.00%, k = 1.960; q = 99.00%, k = 2.576.]
see Fig_4_6.m
HW 4-4.2 A very large population of bipolar transistors has a current gain with a mean value of
120 and a standard deviation of 10. The value of current gain may be assumed to be independent
Gaussian random variables.
a) Find the confidence limits for a confidence level of 90% on the sample mean if it is computed
from a sample size of 150.

$$\bar{X} - \frac{k\,\sigma}{\sqrt{n}} \le \hat{X} \le \bar{X} + \frac{k\,\sigma}{\sqrt{n}}$$

Two sided test at 90% means that k = 1.645.

$$\frac{k\,\sigma}{\sqrt{n}} = 1.645 \cdot \frac{10}{\sqrt{150}} = 1.343$$

$$120 - 1.343 \le \hat{X} \le 120 + 1.343$$

b) Repeat part (a) if the sample size is 21.
Two sided test at 90% means that k = 1.645.

$$\frac{k\,\sigma}{\sqrt{n}} = 1.645 \cdot \frac{10}{\sqrt{21}} = 3.590$$

$$120 - 3.590 \le \hat{X} \le 120 + 3.590$$
A noticeable concern with "confidence level":
As the confidence level increases toward 1.0, the range of allowable/acceptable values increases,
so a higher confidence level gives a wider, less selective interval.
Caution: all that can be stated is that the measured value is inside or outside the confidence
interval for the chosen confidence level.
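The arithmetic of HW 4-4.2 can be reproduced in a few lines. This is a sketch under the problem's assumptions (known mean 120, known standard deviation 10, Gaussian statistics); the function name `two_sided_limits` is illustrative.

```python
import math
from statistics import NormalDist


def two_sided_limits(mean, sigma, n, q):
    """Two-sided confidence limits on the sample mean for confidence q (a fraction)."""
    k = NormalDist().inv_cdf(0.5 + q / 2.0)   # e.g. about 1.645 for q = 0.90
    half_width = k * sigma / math.sqrt(n)
    return mean - half_width, mean + half_width


for n in (150, 21):
    lower, upper = two_sided_limits(120.0, 10.0, n, 0.90)
    print(n, round(lower, 3), round(upper, 3))
```

Raising q in the call widens the interval, which is the concern noted above about high confidence levels.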
HW 4-4.3 Repeat Problem 4-4.2 for a one-sided confidence interval. Restating the problem …
Find the value of the current gain above which 90% of the sample means would lie.

$$\bar{X} - \frac{k\,\sigma}{\sqrt{n}} \le \hat{X}$$

(a) 150 sample size
One sided test at 90% means that

$$\Phi(k) = 0.9 \quad \text{or} \quad Q(k) = 1 - 0.9$$

Therefore, k = 1.2816 and

$$\frac{k\,\sigma}{\sqrt{n}} = 1.2816 \cdot \frac{10}{\sqrt{150}} = 1.046$$

$$\bar{X} - \frac{k\,\sigma}{\sqrt{n}} = 118.95 \le \hat{X}$$

(b) 21 sample size
One sided test at 90% means, k = 1.2816 and

$$\frac{k\,\sigma}{\sqrt{n}} = 1.2816 \cdot \frac{10}{\sqrt{21}} = 2.797$$

$$\bar{X} - \frac{k\,\sigma}{\sqrt{n}} = 117.20 \le \hat{X}$$
One Tail Bounds    Confidence Interval (in %)    k or z_c (z <= z_c)
99.99%             99.99%                        3.7190
99.9%              99.9%                         3.0902
99%                99%                           2.3263
95%                95%                           1.6449
90%                90%                           1.2816
80%                80%                           0.8416
75%                75%                           0.6745
50%                50%                           0
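The one-sided bound of HW 4-4.3 follows the same pattern with the one-tail value of k. A minimal sketch, assuming the problem's known mean of 120 and standard deviation of 10; the function name `one_sided_lower_limit` is illustrative.

```python
import math
from statistics import NormalDist


def one_sided_lower_limit(mean, sigma, n, q):
    """Value above which a fraction q of the sample means would lie."""
    k = NormalDist().inv_cdf(q)   # one-tail value, about 1.2816 for q = 0.90
    return mean - k * sigma / math.sqrt(n)


for n in (150, 21):
    print(n, round(one_sided_lower_limit(120.0, 10.0, n, 0.90), 2))
```

Because all of the excluded probability sits in one tail, k is smaller than the two-sided value at the same confidence level, and the bound sits closer to the mean.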
If the sample size is not sufficient or if the "probabilistic variance" is unknown, the Student
t-distribution must be used.
Appendix F provides tables of t for given v and F based on:

$$F_T(t) = \int_{-\infty}^{t} \frac{\Gamma\left( \frac{v+1}{2} \right)}{\sqrt{\pi v}\;\Gamma\left( \frac{v}{2} \right)} \left( 1 + \frac{x^2}{v} \right)^{-\frac{v+1}{2}} dx$$
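Using the estimated sample mean and the variance of the sample mean:

$$\bar{X} - \frac{t\,S}{\sqrt{n-1}} \le \hat{X} \le \bar{X} + \frac{t\,S}{\sqrt{n-1}} \quad \text{or} \quad \bar{X} - \frac{t\,\tilde{S}}{\sqrt{n}} \le \hat{X} \le \bar{X} + \frac{t\,\tilde{S}}{\sqrt{n}}$$

or

$$\bar{X} - \frac{t\,S}{\sqrt{n-1}} \le \hat{X} \quad \text{or} \quad \bar{X} - \frac{t\,\tilde{S}}{\sqrt{n}} \le \hat{X}$$

where

$$t = \frac{\hat{X} - \bar{X}}{S / \sqrt{n-1}} = \frac{\hat{X} - \bar{X}}{\tilde{S} / \sqrt{n}}$$

$$q = 100 \int_{-t_c}^{t_c} f_T(t)\,dt = 100 \left[ F_T(t_c) - F_T(-t_c) \right]$$ for $-t_c \le t \le t_c$, 2-sided
or
$$q = 100 \int_{-\infty}^{t_c} f_T(t)\,dt = 100\,F_T(t_c)$$ for $t \le t_c$, "right-tail"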
Reading the T-Distribution tables
On-line tables available at http://www.statsoft.com/textbook/sttable.html
The degrees of freedom are in the left column, $v = n - 1$, where n is the sample size.
Columns are labeled by "level of significance" (L of S), or 1 minus the confidence value (C Int). Table F is in percentage.
The value of $t_c$ for different degrees of freedom, $v = n - 1$, when the area for $F_T(t_c) = \%\text{con int} / 100$:

L of S   0.40       0.25       0.10       0.05       0.025      0.01       0.005      0.0005
C Int    0.60       0.75       0.90       0.95       0.975      0.99       0.995      0.9995
v
1        0.324920   1.000000   3.077684   6.313752   12.70620   31.82052   63.65674   636.6192
2        0.288675   0.816497   1.885618   2.919986   4.30265    6.96456    9.92484    31.5991
3        0.276671   0.764892   1.637744   2.353363   3.18245    4.54070    5.84091    12.9240
4        0.270722   0.740697   1.533206   2.131847   2.77645    3.74695    4.60409    8.6103
5        0.267181   0.726687   1.475884   2.015048   2.57058    3.36493    4.03214    6.8688
6        0.264835   0.717558   1.439756   1.943180   2.44691    3.14267    3.70743    5.9588
7        0.263167   0.711142   1.414924   1.894579   2.36462    2.99795    3.49948    5.4079
8        0.261921   0.706387   1.396815   1.859548   2.30600    2.89646    3.35539    5.0413
9        0.260955   0.702722   1.383029   1.833113   2.26216    2.82144    3.24984    4.7809
10       0.260185   0.699812   1.372184   1.812461   2.22814    2.76377    3.16927    4.5869
11       0.259556   0.697445   1.363430   1.795885   2.20099    2.71808    3.10581    4.4370
12       0.259033   0.695483   1.356217   1.782288   2.17881    2.68100    3.05454    4.3178
13       0.258591   0.693829   1.350171   1.770933   2.16037    2.65031    3.01228    4.2208
14       0.258213   0.692417   1.345030   1.761310   2.14479    2.62449    2.97684    4.1405
15       0.257885   0.691197   1.340606   1.753050   2.13145    2.60248    2.94671    4.0728
16       0.257599   0.690132   1.336757   1.745884   2.11991    2.58349    2.92078    4.0150
17       0.257347   0.689195   1.333379   1.739607   2.10982    2.56693    2.89823    3.9651
18       0.257123   0.688364   1.330391   1.734064   2.10092    2.55238    2.87844    3.9216
19       0.256923   0.687621   1.327728   1.729133   2.09302    2.53948    2.86093    3.8834
20       0.256743   0.686954   1.325341   1.724718   2.08596    2.52798    2.84534    3.8495
21       0.256580   0.686352   1.323188   1.720743   2.07961    2.51765    2.83136    3.8193
22       0.256432   0.685805   1.321237   1.717144   2.07387    2.50832    2.81876    3.7921
23       0.256297   0.685306   1.319460   1.713872   2.06866    2.49987    2.80734    3.7676
24       0.256173   0.684850   1.317836   1.710882   2.06390    2.49216    2.79694    3.7454
25       0.256060   0.684430   1.316345   1.708141   2.05954    2.48511    2.78744    3.7251
26       0.255955   0.684043   1.314972   1.705618   2.05553    2.47863    2.77871    3.7066
27       0.255858   0.683685   1.313703   1.703288   2.05183    2.47266    2.77068    3.6896
28       0.255768   0.683353   1.312527   1.701131   2.04841    2.46714    2.76326    3.6739
29       0.255684   0.683044   1.311434   1.699127   2.04523    2.46202    2.75639    3.6594
30       0.255605   0.682756   1.310415   1.697261   2.04227    2.45726    2.75000    3.6460
inf      0.253347   0.674490   1.281552   1.644854   1.95996    2.32635    2.57583    3.2905
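When a table is not at hand, entries like these can be approximated by integrating the Student's t density numerically and searching for the t value that gives the required area. The sketch below is one possible approach under stated assumptions, not the method used to build the table: it uses composite Simpson's rule and bisection, it is only intended for F at or above 0.5, and the names `t_pdf`, `t_cdf`, and `t_critical` are illustrative.

```python
import math


def t_pdf(t, v):
    """Student's t density with v degrees of freedom."""
    coeff = math.gamma((v + 1) / 2) / (math.sqrt(math.pi * v) * math.gamma(v / 2))
    return coeff * (1.0 + t * t / v) ** (-(v + 1) / 2)


def t_cdf(t, v, steps=2000):
    """F_T(t) by composite Simpson's rule on [0, |t|], using symmetry about zero."""
    if t < 0:
        return 1.0 - t_cdf(-t, v, steps)
    if t == 0:
        return 0.5
    h = t / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, v) + t_pdf(t, v)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, v)
    return 0.5 + total * h / 3.0


def t_critical(F, v, upper=200.0):
    """Bisection search for t_c with F_T(t_c) = F, assuming 0.5 <= F < 1."""
    lower = 0.0
    for _ in range(60):
        mid = (lower + upper) / 2.0
        if t_cdf(mid, v) < F:
            lower = mid
        else:
            upper = mid
    return (lower + upper) / 2.0


print(round(t_critical(0.975, 9), 3))
print(round(t_critical(0.95, 20), 3))
```

The search bracket of 200 is an arbitrary choice for this sketch; the extreme entries for v = 1 lie outside it and would need a larger bracket.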
Examples of use:
Exercise 4-4.2
A very large population of resistor values has a true mean of 100 ohms and a sample standard
deviation of 4 ohms. Find the confidence interval on the sample mean for a confidence level of
95% if it is computed from:
a) a sample size of 100.
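v = 99
Using v=60 (no 100 given) and F=0.975 (2 sided test) on p. 436, t=2.00.
Therefore

$$\bar{X} - \frac{t\,\tilde{S}}{\sqrt{n}} \le \hat{X} \le \bar{X} + \frac{t\,\tilde{S}}{\sqrt{n}}$$

$$\frac{t\,\tilde{S}}{\sqrt{n}} = 2.00 \cdot \frac{4}{\sqrt{100}} = 0.8$$

$$100 - 0.8 \le \hat{X} \le 100 + 0.8$$
$$99.2 \le \hat{X} \le 100.8$$

Using v=120 (no 100 given) and F=0.975 (2 sided test) on p. 436, t=1.98.

$$\frac{t\,\tilde{S}}{\sqrt{n}} = 1.98 \cdot \frac{4}{\sqrt{100}} = 0.792$$

$$99.208 \le \hat{X} \le 100.792$$

b) a sample size of 9.
v=8
Using v=8 and F=0.975 (2 sided test) on p. 436, t=2.306.
Therefore

$$\bar{X} - \frac{t\,\tilde{S}}{\sqrt{n}} \le \hat{X} \le \bar{X} + \frac{t\,\tilde{S}}{\sqrt{n}}$$

$$\frac{t\,\tilde{S}}{\sqrt{n}} = 2.306 \cdot \frac{4}{\sqrt{9}} = 3.075$$

$$100 - 3.075 \le \hat{X} \le 100 + 3.075$$

These answers differ from the text because they use the two-sided value (F = 0.975) rather than the single-sided value (F = 0.95).
If you use the 1-sided value, t = 1.86 at F = 0.95 (a 90% 2-sided interval), then one of the textbook solutions can
be recognized!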
HW 4-4.2 A very large population of bipolar transistors has a current gain with a mean value of
120 and a standard deviation of 10. The value of current gain may be assumed to be independent
Gaussian random variables.
b) Repeat part (a) if the sample size is 21.
Two sided test at 90% means that k = 1.645.

$$\frac{k\,\sigma}{\sqrt{n}} = 1.645 \cdot \frac{10}{\sqrt{21}} = 3.590$$

$$120 - 3.590 \le \hat{X} \le 120 + 3.590$$

If the variance was an estimated variance … instead of a known variance.
v = 20
Using v=20 and F=0.95 (90% 2 sided test) on p. 436, t=1.725.
Therefore

$$\bar{X} - \frac{t\,\tilde{S}}{\sqrt{n}} \le \hat{X} \le \bar{X} + \frac{t\,\tilde{S}}{\sqrt{n}}$$

$$\frac{t\,\tilde{S}}{\sqrt{n}} = 1.725 \cdot \frac{10}{\sqrt{21}} = 3.764$$

$$120 - 3.764 \le \hat{X} \le 120 + 3.764$$

Notice that using an estimated variance results in a greater range of values (differences in the
density functions).
Skill 17-2 A cereal vendor’s quality control department has just tested a random sample of 10
“20 ounce” boxes of Oat Flakes by weighing them in order to see if their 20 ounce claim is to be
believed. Their report, to be forwarded to management, must include a 95% confidence interval
as to the population mean.
a) Find the unbiased mean and standard deviation
b) Determine the 95% confidence interval of the mean (by using the Student’s-t table).
c) In general, if the confidence interval becomes tighter (smaller), would the confidence level
increase or decrease?
Measurement Data: 19, 18, 21, 21, 18, 22, 17, 19, 20, and 17.
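a) Sample Mean

$$\hat{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$

where the $X_i$ are random variables with a pdf.

$$\hat{X} = \frac{1}{10} \left( 19 + 18 + 21 + 21 + 18 + 22 + 17 + 19 + 20 + 17 \right) = \frac{192}{10} = 19.2$$

Unbiased variance

$$\tilde{S}^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( X_i - \hat{X} \right)^2$$

$$\tilde{S}^2 = \frac{1}{9} \left( 0.2^2 + 1.2^2 + 1.8^2 + 1.8^2 + 1.2^2 + 2.8^2 + 2.2^2 + 0.2^2 + 0.8^2 + 2.2^2 \right) = \frac{27.6}{9} = 3.067$$

$$\tilde{S} = \sqrt{\frac{27.6}{9}} = \sqrt{3.067} = 1.751$$

b) v=9
Using v=9 and F=0.975 (2 sided test) on p. 436, t=2.262.
Therefore

$$\bar{X} - \frac{t\,\tilde{S}}{\sqrt{n}} \le \hat{X} \le \bar{X} + \frac{t\,\tilde{S}}{\sqrt{n}}$$

$$\frac{t\,\tilde{S}}{\sqrt{n}} = 2.262 \cdot \frac{1.751}{\sqrt{10}} = 1.253$$

Here the population mean is the unknown, so the same bound is rearranged to center on the measured sample mean:

$$19.2 - 1.253 \le \bar{X} \le 19.2 + 1.253$$
$$17.947 \le \bar{X} \le 20.453$$

(c) As the confidence interval becomes tighter (smaller) [p% going down!], the confidence
level decreases.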
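The cereal-box numbers can be checked with the Python standard library. This sketch uses the measurement data from the problem and takes t = 2.262 from the Student's t table (v = 9, F = 0.975); the variable names are illustrative.

```python
import math
import statistics

weights = [19, 18, 21, 21, 18, 22, 17, 19, 20, 17]   # measurement data from the problem
n = len(weights)

sample_mean = statistics.mean(weights)          # 192 / 10
unbiased_var = statistics.variance(weights)     # divides by n - 1
unbiased_std = math.sqrt(unbiased_var)

t_c = 2.262                                     # table value for v = 9, F = 0.975
half_width = t_c * unbiased_std / math.sqrt(n)

print(sample_mean, round(unbiased_var, 3), round(unbiased_std, 3))
print(round(sample_mean - half_width, 3), round(sample_mean + half_width, 3))
```

The claimed 20 ounces lies inside the resulting interval, so this sample alone does not contradict the claim at the 95% level.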
4-5
Hypothesis Testing
Now that we have concepts of acceptable intervals based on collected statistics, we can relate
this to whether a statement based on the statistics is acceptable or not …
Statistical Decision Making.
A statement is made in terms of a Hypothesis. The goal of interpreting the measured values is to
accept or reject the Hypothesis.
Examples:
• the coin being flipped is fair
• two random noise signals (processes) have the same mean
• the lifetime stated on a light bulb is an appropriate description of the mean
• the lifetime stated on the light bulb is an appropriate description of a minimum
• the sample mean measured for a set of components is within the 95% confidence interval
of the desired mean value (1% resistors are within 1% of value with 95% confidence)
When there is only one Hypothesis, it is referred to as the null Hypothesis (H0).
When there are multiple Hypotheses, we can generate criteria to accept one over another
(establishing thresholds for decision making).
Null Hypothesis Testing
A significance test based on a decision rule must be determined.
The significance test establishes a level, potentially the confidence level or confidence interval,
to determine whether to accept or reject the hypothesis. This is stated as a decision rule where
Accept H0:
if the computed value “passes” the significance test.
Reject H0:
if the computed value “fails” the significance test.
In general this involves a significance test that defines an equation or function that can be
computed based on the measured data (statistics). A performance threshold is then defined that
sets an Accept/Reject or pass/fail boundary.
• Inside or outside the confidence interval.
• Acceptably meet desired criteria or not.
Example: (p. 174) A capacitor manufacturer claims that the capacitors have a mean breakdown
voltage of 300V or greater. We establish a significance test of a 99% confidence level. (In this
case, we are looking for values above the minimum confidence level as acceptable, a one sided
test. )
In testing, 100 capacitors are tested (note this is a destructive test), giving a mean value of 290V
with an unbiased sample standard deviation of 40 V.
Is the Hypothesis accepted or rejected?
The significance test at the 99% confidence level requires

$$\bar{X} - \frac{t\,\tilde{S}}{\sqrt{n}} \le \hat{X}$$

Using v=99 and F=0.99 (1 sided test) on p. 436, t=2.358 (using 120).
Therefore

$$\frac{t\,\tilde{S}}{\sqrt{n}} = 2.358 \cdot \frac{40}{\sqrt{100}} = 9.432$$

$$300 - 9.432 \le \hat{X} \quad \text{or} \quad 290.568 \le \hat{X}$$

The measured mean of 290 V falls below 290.568 V, so the measurement results cause us to reject the Hypothesis! Therefore, we would say that the
mean claimed is not valid.
Textbook example: the textbook did this for a Gaussian distribution, assuming that 100 is getting
close enough to not worry about the T distribution.
Final note: a 99% confidence level was selected for the significance test in this example; if a
99.5% level were selected, the Hypothesis would have been accepted! (300-10.468=289.532)
To alleviate (?) this confusion, a "level of significance" can be defined that is 1.0 - confidence
level. This would say that the above test was to a 1% level of significance. Therefore the Hypothesis
is rejected at a 1% level of significance but would be accepted at a 0.5% level of significance.
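The accept/reject decision in the capacitor example reduces to a single comparison. The following sketch encodes that decision rule; the function name `one_sided_lower_test` is illustrative, and the t values 2.358 and 2.617 are the ones implied by the example (9.432 / 4 and 10.468 / 4).

```python
import math


def one_sided_lower_test(claimed_mean, sample_mean, s_unbiased, n, t_c):
    """Accept H0 (true mean >= claimed_mean) if the sample mean clears the threshold.

    Returns (accepted, threshold), where
    threshold = claimed_mean - t_c * s_unbiased / sqrt(n).
    """
    threshold = claimed_mean - t_c * s_unbiased / math.sqrt(n)
    return sample_mean >= threshold, threshold


# 99% confidence level (1% level of significance), t = 2.358.
print(one_sided_lower_test(300.0, 290.0, 40.0, 100, 2.358))

# 99.5% confidence level (0.5% level of significance), t = 2.617.
print(one_sided_lower_test(300.0, 290.0, 40.0, 100, 2.617))
```

The same measurement is rejected at one level and accepted at the other, which is why the level of significance has to be fixed before the data are examined.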
Example: Testing a fair Coin
The binomial random variable allows us to develop a test to see if a coin is fair when flipped. We
need to count the number of heads that occur and test at a level of significance of 5% or a 95%
confidence level.
Accept H0:
if the number of heads is inside the 95% confidence interval, it is fair.
Reject H0:
if the number of heads is outside the 95% confidence interval, it is not fair.
We assume that the number of trials has led to Gaussian statistics, with the mean and variance
known in advance for a fair coin (p=0.5) from the binomial random variable. For
this distribution, the mean and variance are

$$E[X] = \bar{X} = n \cdot p \quad \text{and} \quad \mathrm{Var}(X) = \sigma^2 = n \cdot p \cdot (1 - p)$$
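Assume 100 trials and that the statistics have become Gaussian. Here X is the number of heads, so the bounds use its standard deviation directly:

$$\bar{X} - k\,\sigma \le X \le \bar{X} + k\,\sigma, \qquad \bar{X} = 100 \cdot 0.5 = 50, \qquad \sigma = \sqrt{100 \cdot 0.5 \cdot 0.5} = 5$$

We use a two-sided test, therefore k=1.96 and we have

$$k\,\sigma = 1.96 \cdot 5 = 9.8$$

Therefore the test region for the Hypothesis is

$$50 - 9.8 \le X \le 50 + 9.8$$
or
$$40.2 \le X \le 59.8$$

Now you have a criterion to establish if a coin is fair ….
A narrower region such as 50 ± 4.9 (45.1 to 54.9), which results if the variance 25 is used in place of the standard deviation and divided by $\sqrt{n} = 10$, falls well short of 95%. From Matlab:
Pr(46<=x<=54) = 0.631798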
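An exact binomial sum is a quick way to check how much probability a proposed acceptance region actually holds. The sketch below evaluates the region 46 to 54 quoted from MATLAB and, for comparison, the integer head counts within 1.96 standard deviations of the mean, taking the standard deviation of the head count as sqrt(n p (1-p)) = 5. The function name `binomial_prob_between` is illustrative, and `math.comb` requires Python 3.8 or later.

```python
import math


def binomial_prob_between(n, p, low, high):
    """Pr(low <= X <= high) for X ~ Binomial(n, p), summed term by term."""
    return sum(
        math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)
        for k in range(low, high + 1)
    )


n, p = 100, 0.5
mean = n * p
sigma = math.sqrt(n * p * (1.0 - p))

# Region quoted from MATLAB in the notes.
p_46_54 = binomial_prob_between(n, p, 46, 54)

# Integer counts inside mean +/- 1.96 * sigma (40.2 to 59.8, so 41 through 59).
low = math.ceil(mean - 1.96 * sigma)
high = math.floor(mean + 1.96 * sigma)
p_wide = binomial_prob_between(n, p, low, high)

print(round(p_46_54, 6), low, high, round(p_wide, 4))
```

Because the head count is discrete, the achieved probability for the wider region lands slightly below the nominal 95% rather than exactly on it.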
Example: Hypothesis testing in communications
Signal plus noise is input to the receiver. A "digital symbol" receiver outputs a value corresponding
to a symbol plus noise. Hypothesis testing establishes the rules to select one symbol as compared
to another. Incorrect selections result in symbol errors, and in digital bit errors once the symbols
are translated into bits.

$$r(t) = s_i(t) + n(t), \quad \text{for } i \cdot T \le t \le (i+1) \cdot T$$

Bernard Sklar, "Digital Communications, Fundamentals and Applications,"
Prentice Hall PTR, Second Edition, 2001. Appendix B.
For each symbol a detection statistic is generated for the symbol period T.

$$z(T) = a_i(T) + n_0(T)$$

Hypothesis testing then determines the estimated symbol value from the detection statistic.
The number of Hypotheses is equivalent to the number of possible symbols transmitted.
4-6
Curve Fitting and Linear Regression
Fitting lines to scatter plots.
Data provided as (x,y) pairs. Is there a function that goes through all the points? Yes …
if you want to use a polynomial of degree n-1 for n pairs! But we usually want simple curves to
represent the data, like lines or parabolas, etc. where

$$y = a + b\,x \quad \text{or} \quad y = a + b\,x + c\,x^2$$

To fit the curve we want to minimize the following function of the polynomial
(thus minimizing the squared error):

$$\sum_{i=1}^{n} \left( y_i - a - b\,x_i - c\,x_i^2 \right)^2$$

For a linear regression (a line), we have

$$err = \sum_{i=1}^{n} \left( y_i - a - b\,x_i \right)^2$$

To minimize for the values a and b, take the derivatives and set them equal to zero. Then solve
for a and b:

$$\frac{d(err)}{da} = \sum_{i=1}^{n} -2 \left( y_i - a - b\,x_i \right) = 0 \quad \Rightarrow \quad \sum_{i=1}^{n} y_i = n\,a + b \sum_{i=1}^{n} x_i$$

$$\frac{d(err)}{db} = \sum_{i=1}^{n} -2 \left( y_i - a - b\,x_i \right) x_i = 0 \quad \Rightarrow \quad \sum_{i=1}^{n} y_i\,x_i = a \sum_{i=1}^{n} x_i + b \sum_{i=1}^{n} x_i^2$$

Solving for the minimum (from d/da)

$$a = \frac{1}{n} \left( \sum_{i=1}^{n} y_i - b \sum_{i=1}^{n} x_i \right)$$

and (substituting a into d/db)

$$b = \frac{ n \sum_{i=1}^{n} y_i\,x_i - \left( \sum_{i=1}^{n} x_i \right) \left( \sum_{i=1}^{n} y_i \right) }{ n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2 }$$
Proof:
(All sums run over i = 1 to n.)
Working with d/da

$$\sum y_i = \sum \left( a + b\,x_i \right) = n\,a + b \sum x_i$$

$$a = \frac{ \sum y_i - b \sum x_i }{n} = \frac{1}{n} \left( \sum y_i - b \sum x_i \right)$$

Substituting a into d/db and solving for b

$$\sum y_i\,x_i = a \sum x_i + b \sum x_i^2$$

$$\sum y_i\,x_i = \frac{1}{n} \left( \sum y_i - b \sum x_i \right) \sum x_i + b \sum x_i^2$$

$$\sum y_i\,x_i - \frac{1}{n} \sum y_i \sum x_i = b \left[ \sum x_i^2 - \frac{1}{n} \left( \sum x_i \right)^2 \right]$$

$$b = \frac{ \sum y_i\,x_i - \frac{1}{n} \sum y_i \sum x_i }{ \sum x_i^2 - \frac{1}{n} \left( \sum x_i \right)^2 } = \frac{ n \sum y_i\,x_i - \sum y_i \sum x_i }{ n \sum x_i^2 - \left( \sum x_i \right)^2 }$$

and using b in the d/da equation solution

$$a = \frac{1}{n} \left( \sum y_i - b \sum x_i \right) = \frac{1}{n} \sum y_i - \frac{1}{n} \sum x_i \cdot \frac{ n \sum y_i\,x_i - \sum y_i \sum x_i }{ n \sum x_i^2 - \left( \sum x_i \right)^2 }$$

$$a = \frac{ \sum y_i \sum x_i^2 - \frac{1}{n} \sum y_i \left( \sum x_i \right)^2 - \sum y_i\,x_i \sum x_i + \frac{1}{n} \sum y_i \left( \sum x_i \right)^2 }{ n \sum x_i^2 - \left( \sum x_i \right)^2 }$$

$$a = \frac{ \sum y_i \sum x_i^2 - \sum x_i \sum y_i\,x_i }{ n \sum x_i^2 - \left( \sum x_i \right)^2 }$$
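The closed-form results for a and b translate directly into code. The following is a minimal Python sketch of the line fit; the function name `fit_line` and the four data points are illustrative assumptions, not data from the notes or from HW_4_6_1.m.

```python
def fit_line(xs, ys):
    """Least-squares line y = a + b*x from the closed-form sums."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)

    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    b = (n * sum_xy - sum_x * sum_y) / denom
    a = (sum_y - b * sum_x) / n          # from the d/da equation
    return a, b


# Hypothetical scatter data.
a, b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 2.0, 4.0])
print(a, b)
```

The intercept is computed from the d/da equation once b is known, which avoids repeating the longer expression for a.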
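Alternate formulation (using the statistical mean of x and y)

$$b = \frac{ \frac{1}{n} \sum_{i=1}^{n} y_i\,x_i - \hat{Y}\,\hat{X} }{ \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \hat{X}^2 }$$

$$a = \frac{ \hat{Y} \cdot \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \hat{X} \cdot \frac{1}{n} \sum_{i=1}^{n} y_i\,x_i }{ \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \hat{X}^2 }$$

Also to form estimates of the correlation and covariance:

$$R_{XY} = \frac{1}{n} \sum_{i=1}^{n} y_i\,x_i \qquad C_{XY} = R_{XY} - \hat{Y}\,\hat{X}$$

$$S_X^2 = \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \hat{X}^2 \qquad S_Y^2 = \frac{1}{n} \sum_{i=1}^{n} y_i^2 - \hat{Y}^2$$

$$r = \frac{ C_{XY} }{ \sqrt{ S_X^2 \cdot S_Y^2 } }$$

$$b = \frac{ C_{XY} }{ S_X^2 }$$

$$a = \hat{Y} - b\,\hat{X} = \frac{ \hat{Y} \cdot S_X^2 - \hat{X} \cdot C_{XY} }{ S_X^2 }$$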
See HW_4_6_1.m
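The moment-based form of the fit can be sketched in Python as well. This is an illustration under assumptions, not a port of HW_4_6_1.m: the function name `fit_line_from_moments` and the data points are made up, and the intercept is computed as the mean of y minus b times the mean of x.

```python
def fit_line_from_moments(xs, ys):
    """Line fit y = a + b*x using sample means, R_XY, C_XY, and S_X^2."""
    n = len(xs)
    x_hat = sum(xs) / n
    y_hat = sum(ys) / n

    r_xy = sum(x * y for x, y in zip(xs, ys)) / n      # correlation estimate
    c_xy = r_xy - y_hat * x_hat                        # covariance estimate
    s2_x = sum(x * x for x in xs) / n - x_hat ** 2     # biased variance of x

    b = c_xy / s2_x
    a = y_hat - b * x_hat
    return a, b


# Hypothetical scatter data.
a_m, b_m = fit_line_from_moments([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 2.0, 4.0])
print(a_m, b_m)
```

This form makes it visible that the fitted line always passes through the point formed by the two sample means.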
Correlation of a discrete random variable
If we assume that every x is equally likely, the pmf has the same value, 1/n, for every x.
Repeated values simply sum the probability at the point. So,

Mean or 1st Moment

$$E[X] = \int_{-\infty}^{\infty} x\,f_X(x)\,dx = \int_{-\infty}^{\infty} x \sum_{i=1}^{n} \frac{\delta(x - x_i)}{n}\,dx$$

$$\mu_X = E[X] = \frac{1}{n} \sum_{i=1}^{n} x_i$$

2nd Moment

$$E[X^2] = \int_{-\infty}^{\infty} x^2\,f_X(x)\,dx = \int_{-\infty}^{\infty} x^2 \sum_{i=1}^{n} \frac{\delta(x - x_i)}{n}\,dx$$

$$E[X^2] = R_{XX} = \frac{1}{n} \sum_{i=1}^{n} x_i^2$$

2nd Central Moment

$$E\left[ \left( X - \bar{X} \right)^2 \right] = \int_{-\infty}^{\infty} \left( x - \bar{X} \right)^2 f_X(x)\,dx = \int_{-\infty}^{\infty} \left( x - \bar{X} \right)^2 \sum_{i=1}^{n} \frac{\delta(x - x_i)}{n}\,dx$$

$$E\left[ \left( X - \bar{X} \right)^2 \right] = \frac{1}{n} \sum_{i=1}^{n} \left( x_i - \bar{X} \right)^2 = \frac{1}{n} \sum_{i=1}^{n} \left( x_i^2 - 2\,x_i\,\bar{X} + \bar{X}^2 \right)$$

$$E\left[ \left( X - \bar{X} \right)^2 \right] = \frac{1}{n} \sum_{i=1}^{n} x_i^2 - 2\,\bar{X} \cdot \bar{X} + \bar{X}^2 = \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \bar{X}^2$$

$$\sigma_X^2 = C_{XX} = \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \left( \frac{1}{n} \sum_{i=1}^{n} x_i \right)^2 = R_{XX} - \mu_X^2$$
Correlation between discrete random variables
For two sequences or paired groupings (x,y).
If we assume that every (x,y) pair is equally likely, the pmf of the functions has the same value
for every pair, 1/n. Repeated pairs simply sum the probability at the point. So, for correlation,
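$$E[X \cdot Y] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x \cdot y \cdot f(x, y)\,dx\,dy$$

for $(x_i, y_i)$ and $f(x_i, y_i) = \frac{1}{n}$ for all pairs, i = 1 to n

$$E[X \cdot Y] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x \cdot y \sum_{i=1}^{n} \frac{ \delta(x - x_i)\,\delta(y - y_i) }{n}\,dx\,dy$$

$$R_{XY} = E[X \cdot Y] = \frac{1}{n} \sum_{i=1}^{n} x_i \cdot y_i$$

Defining the covariance (the cross correlation with the means removed)

$$E\left[ \left( X - \bar{X} \right) \left( Y - \bar{Y} \right) \right] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \left( x - \bar{X} \right) \left( y - \bar{Y} \right) f(x, y)\,dx\,dy$$

$$E\left[ \left( X - \bar{X} \right) \left( Y - \bar{Y} \right) \right] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \left( x - \bar{X} \right) \left( y - \bar{Y} \right) \sum_{i=1}^{n} \frac{ \delta(x - x_i)\,\delta(y - y_i) }{n}\,dx\,dy$$

$$E\left[ \left( X - \bar{X} \right) \left( Y - \bar{Y} \right) \right] = \frac{1}{n} \sum_{i=1}^{n} \left( x_i - \bar{X} \right) \left( y_i - \bar{Y} \right)$$

$$E\left[ \left( X - \bar{X} \right) \left( Y - \bar{Y} \right) \right] = \frac{1}{n} \sum_{i=1}^{n} \left( x_i\,y_i - x_i\,\bar{Y} - y_i\,\bar{X} + \bar{X}\,\bar{Y} \right)$$

$$E\left[ \left( X - \bar{X} \right) \left( Y - \bar{Y} \right) \right] = \frac{1}{n} \sum_{i=1}^{n} x_i\,y_i - \frac{1}{n}\,n\,\bar{X}\,\bar{Y} - \frac{1}{n}\,n\,\bar{X}\,\bar{Y} + \frac{1}{n}\,n\,\bar{X}\,\bar{Y}$$

$$C_{XY} = E\left[ \left( X - \bar{X} \right) \left( Y - \bar{Y} \right) \right] = \frac{1}{n} \sum_{i=1}^{n} x_i\,y_i - \bar{X}\,\bar{Y}$$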
The Discrete Correlation coefficient
For two sequences or paired groupings (x,y).
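If we assume that every (x,y) pair is equally likely, the pmf of the functions has the same value
for every pair, 1/n. Repeated pairs simply sum the probability at the point. So,

$$E\left[ \frac{ X - \bar{X} }{ \sigma_X } \cdot \frac{ Y - \bar{Y} }{ \sigma_Y } \right] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{ x - \bar{X} }{ \sigma_X } \cdot \frac{ y - \bar{Y} }{ \sigma_Y }\,f(x, y)\,dx\,dy$$

$$E\left[ \frac{ X - \bar{X} }{ \sigma_X } \cdot \frac{ Y - \bar{Y} }{ \sigma_Y } \right] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{ x - \bar{X} }{ \sigma_X } \cdot \frac{ y - \bar{Y} }{ \sigma_Y } \sum_{i=1}^{n} \frac{ \delta(x - x_i)\,\delta(y - y_i) }{n}\,dx\,dy$$

$$r = E\left[ \frac{ X - \bar{X} }{ \sigma_X } \cdot \frac{ Y - \bar{Y} }{ \sigma_Y } \right] = \frac{1}{n} \sum_{i=1}^{n} \frac{ x_i - \bar{X} }{ \sigma_X } \cdot \frac{ y_i - \bar{Y} }{ \sigma_Y }$$

$$r = \frac{1}{ \sigma_X\,\sigma_Y } \cdot \frac{1}{n} \sum_{i=1}^{n} \left( x_i\,y_i - x_i\,\bar{Y} - y_i\,\bar{X} + \bar{X}\,\bar{Y} \right)$$

$$r = \frac{1}{ \sigma_X\,\sigma_Y } \left[ \frac{1}{n} \sum_{i=1}^{n} x_i\,y_i - \frac{1}{n}\,n\,\bar{X}\,\bar{Y} - \frac{1}{n}\,n\,\bar{X}\,\bar{Y} + \frac{1}{n}\,n\,\bar{X}\,\bar{Y} \right]$$

$$r = \rho_{XY} = E\left[ \frac{ X - \bar{X} }{ \sigma_X } \cdot \frac{ Y - \bar{Y} }{ \sigma_Y } \right] = \frac{ \frac{1}{n} \sum_{i=1}^{n} x_i\,y_i - \bar{X}\,\bar{Y} }{ \sigma_X\,\sigma_Y } = \frac{ C_{XY} }{ \sigma_X\,\sigma_Y }$$

$$r = \rho_{XY} = \frac{ \frac{1}{n} \sum_{i=1}^{n} x_i\,y_i - \left( \frac{1}{n} \sum_{i=1}^{n} x_i \right) \left( \frac{1}{n} \sum_{i=1}^{n} y_i \right) }{ \sqrt{ \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \left( \frac{1}{n} \sum_{i=1}^{n} x_i \right)^2 } \cdot \sqrt{ \frac{1}{n} \sum_{i=1}^{n} y_i^2 - \left( \frac{1}{n} \sum_{i=1}^{n} y_i \right)^2 } }$$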
The text defines this as Pearson’s r, the linear correlation coefficient between two sets of data!
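Pearson's r can be computed from the same discrete moments. The following is a minimal Python sketch using only the standard library; the function name `pearson_r` and the data points are illustrative assumptions.

```python
import math


def pearson_r(xs, ys):
    """Linear correlation coefficient from the discrete (1/n) moments."""
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n

    c_xy = sum(x * y for x, y in zip(xs, ys)) / n - mu_x * mu_y   # covariance
    var_x = sum(x * x for x in xs) / n - mu_x ** 2                # R_XX - mu_X^2
    var_y = sum(y * y for y in ys) / n - mu_y ** 2                # R_YY - mu_Y^2

    return c_xy / math.sqrt(var_x * var_y)


# Hypothetical paired data.
r = pearson_r([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 2.0, 4.0])
print(r)
```

Because the same 1/n normalization appears in the covariance and in both variances, the result does not depend on whether the biased or the unbiased estimates are used, as long as the choice is consistent.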
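Based on the discrete terms, linear estimation becomes

$$a = \frac{ \hat{Y} \cdot R_{XX} - \hat{X} \cdot R_{XY} }{ R_{XX} - \hat{X}^2 } = \frac{ \hat{Y} \cdot R_{XX} - \hat{X} \cdot R_{XY} }{ C_{XX} }$$

and

$$b = \frac{ R_{XY} - \hat{Y} \cdot \hat{X} }{ R_{XX} - \hat{X}^2 } = \frac{ C_{XY} }{ C_{XX} }$$

Pavlovian conditioning for sampled data … compute
x: Mean, 2nd moment, variance ($\mu_X$, $R_{XX}$, and $\sigma_X^2$)
y: Mean, 2nd moment, variance ($\mu_Y$, $R_{YY}$, and $\sigma_Y^2$)
x and y: $R_{XY}$, $C_{XY}$, and $\rho_{XY}$

$$\mu_X = E[X] = \frac{1}{n} \sum_{i=1}^{n} x_i$$

$$E[X^2] = R_{XX} = \frac{1}{n} \sum_{i=1}^{n} x_i^2$$

$$\sigma_X^2 = C_{XX} = \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \left( \frac{1}{n} \sum_{i=1}^{n} x_i \right)^2 = R_{XX} - \mu_X^2$$

$$R_{XY} = E[X \cdot Y] = \frac{1}{n} \sum_{i=1}^{n} x_i \cdot y_i$$

$$C_{XY} = E\left[ \left( X - \bar{X} \right) \left( Y - \bar{Y} \right) \right] = \frac{1}{n} \sum_{i=1}^{n} x_i \cdot y_i - \mu_X \cdot \mu_Y$$

$$\rho_{XY} = \frac{ C_{XY} }{ \sigma_X \cdot \sigma_Y }$$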