Chapter 4: Elements of Statistics

4-1 Introduction
    The Sampling Problem
    Unbiased Estimators
4-2&3 Sampling Theory -- The Sample Mean and Variance
    Sampling Theorem
4-4 Sampling Distributions and Confidence Intervals
    Student's t-Distribution
4-5 Hypothesis Testing
4-6 Curve Fitting and Linear Regression
4-7 Correlation Between Two Sets of Data

Concepts
- Sample mean and sample variance, and their relation to the pdf mean and variance
- Biased estimates of means and variances
- How close are the sample values to the underlying pdf values?
- Practical curve fitting, using an NTC resistor to measure temperature

Statistics
Definition: The science of assembling, classifying, tabulating, and analyzing data or facts:
- Descriptive statistics: collecting, grouping, and presenting data in a way that can be easily understood or assimilated.
- Inductive statistics or statistical inference: using data to draw conclusions about, or estimate parameters of, the environment from which the data came.

Notes and figures are based on or taken from materials in the course textbook: Probabilistic Methods of Signal and System Analysis (3rd ed.) by George R. Cooper and Clare D. McGillem; Oxford Press, 1999. ISBN: 0-19-512354-9. B.J. Bazuin, Spring 2015, ECE 3800.

Sampling Theory – The Sample Mean

Sample mean:
$$\hat{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$
where the $X_i$ are random variables with a pdf.

$$E[\hat{X}] = \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \frac{1}{n}\sum_{i=1}^{n} \bar{X} = \bar{X}$$

Variance of the sample mean:
$$\mathrm{Var}(\hat{X}) = \frac{\overline{X^2} - \bar{X}^2}{n} = \frac{\sigma^2}{n}$$
where $\sigma^2$ is the true variance of the random variable $X$.

Destructive testing, or sampling without replacement in a finite population of size $N$, results in another expression:
$$\mathrm{Var}(\hat{X}) = \frac{\sigma^2}{n}\cdot\frac{N-n}{N-1}$$

Sampling Theory – The Sample Variance

$$S^2 = \frac{1}{n}\sum_{i=1}^{n}\left(X_i - \hat{X}\right)^2, \qquad E[S^2] = \frac{n-1}{n}\,\sigma^2$$
where $\sigma^2$ is the true variance of the random variable.
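The two results above, $\mathrm{Var}(\hat{X}) = \sigma^2/n$ and $E[S^2] = \frac{n-1}{n}\sigma^2$, can be checked exactly on a small example. The sketch below is only an illustration and not part of the course materials: it assumes a fair six-sided die as the population and a sample size of n = 3, and the function name `moments_by_enumeration` is hypothetical. It enumerates every equally likely sample and uses exact fractions, so no random simulation is involved.

```python
from fractions import Fraction
from itertools import product


def moments_by_enumeration(values, n):
    """Enumerate every equally likely n-sample (drawn with replacement) and
    return exact population and sampling moments as Fractions."""
    size = len(values)
    pop_mean = Fraction(sum(values), size)
    pop_var = sum((Fraction(v) - pop_mean) ** 2 for v in values) / size

    means = []        # sample mean of each possible sample
    biased_vars = []  # biased sample variance S^2 (divide by n) of each sample
    for sample in product(values, repeat=n):
        m = Fraction(sum(sample), n)
        means.append(m)
        biased_vars.append(sum((x - m) ** 2 for x in sample) / n)

    count = len(means)
    e_mean = sum(means) / count                              # E[sample mean]
    var_mean = sum((m - e_mean) ** 2 for m in means) / count  # Var(sample mean)
    e_s2 = sum(biased_vars) / count                          # E[S^2]
    return pop_mean, pop_var, e_mean, var_mean, e_s2


# Assumed example: fair die (values 1..6), samples of size n = 3.
n = 3
pop_mean, pop_var, e_mean, var_mean, e_s2 = moments_by_enumeration(range(1, 7), n)
print("E[mean] =", e_mean, " true mean =", pop_mean)
print("Var(mean) =", var_mean, " sigma^2/n =", pop_var / n)
print("E[S^2] =", e_s2, " (n-1)/n * sigma^2 =", Fraction(n - 1, n) * pop_var)
```

Because every sample is enumerated, the printed pairs agree exactly rather than approximately; the shortfall of $E[S^2]$ relative to $\sigma^2$ is the bias of the sample variance.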
To create an unbiased estimator, scale by the biasing factor to compute:
$$\tilde{S}^2 = \frac{n}{n-1}\,S^2 = \frac{n}{n-1}\cdot\frac{1}{n}\sum_{i=1}^{n}\left(X_i - \hat{X}\right)^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \hat{X}\right)^2$$

When the population (of size $N$) is not large, the biased estimate becomes
$$E[S^2] = \frac{N}{N-1}\cdot\frac{n-1}{n}\,\sigma^2$$
and removing the bias results in
$$E[\tilde{S}^2] = \frac{N-1}{N}\cdot\frac{n}{n-1}\,E[S^2] = \sigma^2$$

Additional notes: MATLAB and MS Excel
Simulation and statistical software packages allow for either biased or unbiased computations.

In MS Excel there are two distinct functions, stdev and stdevp.
stdev uses (n-1) - http://office.microsoft.com/en-us/excel-help/stdev-function-HP010335660.aspx
stdevp uses (n) - http://office.microsoft.com/en-us/excel-help/stdevp-HP005209281.aspx

In MATLAB, there is an additional flag associated with the std function.
$$\mathrm{std}(X) = \sqrt{\mathrm{var}(X)} = \sqrt{\frac{1}{n-1}\sum_{j=1}^{n}\left(x_j - \bar{x}\right)^2} \quad \text{(flag implied as 0)}$$
$$\mathrm{std}(X,1) = \sqrt{\mathrm{var}(X,1)} = \sqrt{\frac{1}{n}\sum_{j=1}^{n}\left(x_j - \bar{x}\right)^2} \quad \text{(flag specified as 1)}$$

Variance of the variance
As before, the variance of the sample variance can be computed. It is given by
$$\mathrm{Var}(S^2) = \frac{\mu_4 - \sigma^4}{n}$$
where $\mu_4$ is the fourth central moment of the population, defined by
$$\mu_4 = E\left[(X - \bar{X})^4\right]$$
Another proof for extra credit …

For the unbiased variance, the result is
$$\mathrm{Var}(\tilde{S}^2) = \frac{n^2}{(n-1)^2}\cdot\frac{\mu_4 - \sigma^4}{n} = \frac{n\left(\mu_4 - \sigma^4\right)}{(n-1)^2}$$

4-4 Sampling Distribution and Confidence Intervals

Now that we have developed sample values, what are they good for?
- What is the probability that our estimates are within specified bounds?
- By measuring samples, can you prove that what you built or did is what was specified or promised?
When in doubt … assume Gaussian.

Then the normalized random variable (the sample mean with the true mean removed, divided by the standard deviation of the sample mean) becomes
$$Z = \frac{\hat{X} - \bar{X}}{\sigma/\sqrt{n}}$$

If the true population standard deviation is not known, it can be replaced by the estimate formed from the sample variance,
$$T = \frac{\hat{X} - \bar{X}}{S/\sqrt{n-1}} = \frac{\hat{X} - \bar{X}}{\tilde{S}/\sqrt{n}}$$
but this random variable has a different distribution: the Student's t distribution with n-1 degrees of freedom.

The Student's t probability density function (letting $v = n-1$, the degrees of freedom) is defined as
$$f_T(t) = \frac{\Gamma\!\left(\frac{v+1}{2}\right)}{\sqrt{\pi v}\;\Gamma\!\left(\frac{v}{2}\right)}\left(1 + \frac{t^2}{v}\right)^{-\frac{v+1}{2}}$$
where $\Gamma(\cdot)$ is the gamma function. The gamma function can be computed from
$$\Gamma(k+1) = k\,\Gamma(k) \ \text{for any } k, \qquad \Gamma(k+1) = k! \ \text{for } k \text{ an integer}, \qquad \Gamma\!\left(\tfrac{1}{2}\right) = \sqrt{\pi}$$

From http://en.wikipedia.org/wiki/Student's_t-distribution:
Student's distribution arises when (as in nearly all practical statistical work) the population standard deviation is unknown and has to be estimated from the data. Textbook problems treating the standard deviation as if it were known are of two kinds: (1) those in which the sample size is so large that one may treat a data-based estimate of the variance as if it were certain, and (2) those that illustrate mathematical reasoning, in which the problem of estimating the standard deviation is temporarily ignored because that is not the point that the author or instructor is then explaining.

Note that the distribution depends on ν, but not on μ or σ; the lack of dependence on μ and σ is what makes the t-distribution important in both theory and practice.
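The density above can be evaluated directly from the gamma function. The following is a minimal sketch in Python (the course files are MATLAB; the function names `students_t_pdf` and `gaussian_pdf` are hypothetical and are not the course's students_t.m). It evaluates the Student's t density for a few degrees of freedom and compares it with the unit-variance Gaussian density.

```python
import math


def students_t_pdf(t, v):
    """Student's t density with v degrees of freedom, from the formula above."""
    coeff = math.gamma((v + 1) / 2) / (math.sqrt(math.pi * v) * math.gamma(v / 2))
    return coeff * (1 + t * t / v) ** (-(v + 1) / 2)


def gaussian_pdf(z):
    """Zero-mean, unit-variance Gaussian density."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)


# Compare the peak (t = 0) and a tail point (t = 3) for several v.
for v in (1, 2, 8, 200):
    print(f"v={v:3d}  f_T(0)={students_t_pdf(0.0, v):.4f}  f_T(3)={students_t_pdf(3.0, v):.4f}")
print(f"Gaussian f(0)={gaussian_pdf(0.0):.4f}  f(3)={gaussian_pdf(3.0):.4f}")
```

For small v the t density has a lower peak and heavier tails than the Gaussian; as v grows the two become nearly indistinguishable, which is why the Gaussian tables are often used for large samples.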
Comparing the density functions: Student's t and Gaussian

[Figure: "Student's t and Gaussian Densities". Curves: Gaussian, and t with v = 1, v = 2, and v = 8. x-axis: -4 to 4; y-axis: density function.]
See Fig_4_2.m and function students_t.m

Confidence Intervals and the Gaussian and t distributions

The sample mean is a point estimate (it assigns a single value). An alternative to a point estimate is an interval estimate, where the parameter being estimated is declared to lie within a certain interval with a certain probability. The interval estimate is the confidence interval.

We can then define a q% confidence interval as the interval in which the estimate will lie with a probability of q/100. The limits of the interval are defined as the confidence limits, and q is defined to be the confidence level.

Thus we are interested in
$$\bar{X} - \frac{k\sigma}{\sqrt{n}} \le \hat{X} \le \bar{X} + \frac{k\sigma}{\sqrt{n}} \qquad \text{or} \qquad \bar{X} - \frac{k\sigma}{\sqrt{n}} \le \hat{X}$$
where k is a constant (notice that it multiplies the standard deviation of the measurement, $\sigma_{\hat{X}} = \sigma/\sqrt{n}$).

The confidence interval is defined by
$$q\% = 100\int_{\bar{X}-k\sigma_{\hat{X}}}^{\bar{X}+k\sigma_{\hat{X}}} f_{\hat{X}}(x)\,dx = 100\left[F_{\hat{X}}\!\left(\bar{X}+k\sigma_{\hat{X}}\right) - F_{\hat{X}}\!\left(\bar{X}-k\sigma_{\hat{X}}\right)\right]$$
or, for a one-sided interval,
$$q\% = 100\int_{\bar{X}-k\sigma_{\hat{X}}}^{\infty} f_{\hat{X}}(x)\,dx = 100\left[1 - F_{\hat{X}}\!\left(\bar{X}-k\sigma_{\hat{X}}\right)\right]$$

When the sample size is sufficient for the Central Limit Theorem to apply, a Gaussian (normal) distribution can be used:
$$Z = \frac{\hat{X} - \bar{X}}{\sigma/\sqrt{n}}$$
$$q = \Phi(z_c) - \Phi(-z_c) \ \text{ for } -z_c \le z \le z_c \qquad \text{or} \qquad q = \Phi(z_c) \ \text{ for } -z_c \le z$$

Gaussian distribution function:
$$F_X(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{(v-\bar{X})^2}{2\sigma^2}\right)dv$$

To find the confidence intervals:

Two-tail bounds        Confidence interval   k or z_c (for -z_c <= z <= z_c)
0.005% to 99.995%      99.99%                3.8906
0.05% to 99.95%        99.9%                 3.2905
0.5% to 99.5%          99%                   2.5758
2.5% to 97.5%          95%                   1.9600
5% to 95%              90%                   1.6449
10% to 90%             80%                   1.2816
25% to 75%             50%                   0.6745

(1) Determine the percentage value required for the bound (e.g. 75% for a 50% two-sided interval).
(2) Find that value in the Normal table (unit variance). The value of k or z_c is the row value plus the column value at which that probability appears.

[Figure: "Gaussian q values". Gaussian density f(x) for x from -5 to 5 with two-sided intervals marked: q = 50.00%, k = 0.674; q = 90.00%, k = 1.645; q = 95.00%, k = 1.960; q = 99.00%, k = 2.576.]
See Fig_4_6.m

HW 4-4.2
A very large population of bipolar transistors has a current gain with a mean value of 120 and a standard deviation of 10. The values of current gain may be assumed to be independent Gaussian random variables.

a) Find the confidence limits for a confidence level of 90% on the sample mean if it is computed from a sample size of 150.
$$\bar{X} - \frac{k\sigma}{\sqrt{n}} \le \hat{X} \le \bar{X} + \frac{k\sigma}{\sqrt{n}}$$
A two-sided test at 90% means that k = 1.645.
$$\frac{k\sigma}{\sqrt{n}} = \frac{1.645\cdot 10}{\sqrt{150}} = 1.343$$
$$120 - 1.343 \le \hat{X} \le 120 + 1.343$$

b) Repeat part (a) if the sample size is 21.
A two-sided test at 90% means that k = 1.645.
$$\frac{k\sigma}{\sqrt{n}} = \frac{1.645\cdot 10}{\sqrt{21}} = 3.590$$
$$120 - 3.590 \le \hat{X} \le 120 + 3.590$$

A noticeable concern with "confidence level": as the confidence level increases toward 1.0, the range of allowable/acceptable values increases.

Caution: all that can be stated is that the measured value is inside or outside a desired confidence interval at the chosen confidence level.
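The table lookup and the arithmetic in HW 4-4.2 can be reproduced with a short script. This is a sketch only, using Python's standard-library `statistics.NormalDist` in place of the printed Normal table; the helper name `two_sided_limits` is hypothetical. Small differences in the last digit relative to the hand calculation come from using k = 1.6449 rather than the rounded 1.645.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_limits(mean, sigma, n, confidence):
    """Return (k, lower, upper) for a two-sided Gaussian confidence interval
    on the sample mean, given the true mean and standard deviation."""
    # For a two-sided interval the upper bound sits at probability 0.5 + q/2,
    # e.g. 0.95 for a 90% interval.
    k = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = k * sigma / sqrt(n)
    return k, mean - half_width, mean + half_width


# HW 4-4.2: mean gain 120, standard deviation 10, 90% confidence level.
k_a, lo_a, hi_a = two_sided_limits(120.0, 10.0, 150, 0.90)
k_b, lo_b, hi_b = two_sided_limits(120.0, 10.0, 21, 0.90)
print(f"k = {k_a:.4f}")
print(f"n = 150: {lo_a:.3f} <= sample mean <= {hi_a:.3f}")
print(f"n =  21: {lo_b:.3f} <= sample mean <= {hi_b:.3f}")
```

Changing the `confidence` argument shows the concern noted above: raising the confidence level widens the interval for a fixed sample size.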
HW 4-4.3
Repeat Problem 4-4.2 for a one-sided confidence interval.

Restating the problem: find the value of the current gain above which 90% of the sample means would lie.
$$\bar{X} - \frac{k\sigma}{\sqrt{n}} \le \hat{X}$$

(a) Sample size of 150
A one-sided test at 90% means that $\Phi(k) = 0.9$ or $Q(k) = 1 - 0.9$. Therefore k = 1.2816 and
$$\frac{k\sigma}{\sqrt{n}} = \frac{1.2816\cdot 10}{\sqrt{150}} = 1.046, \qquad \bar{X} - \frac{k\sigma}{\sqrt{n}} = 118.95 \le \hat{X}$$

(b) Sample size of 21
A one-sided test at 90% means k = 1.2816 and
$$\frac{k\sigma}{\sqrt{n}} = \frac{1.2816\cdot 10}{\sqrt{21}} = 2.797, \qquad \bar{X} - \frac{k\sigma}{\sqrt{n}} = 117.20 \le \hat{X}$$

One-tail bound   Confidence interval   k or z_c (for -z_c <= z)
99.99%           99.99%                3.7190
99.9%            99.9%                 3.0902
99%              99%                   2.3263
95%              95%                   1.6449
90%              90%                   1.2816
80%              80%                   0.8416
75%              75%                   0.6745
50%              50%                   0

If the sample size is not sufficient, or if the "probabilistic variance" is unknown, the Student's t-distribution must be used. Appendix F provides tables of t for given v and F based on the Student's t distribution function:
$$F_T(t) = \int_{-\infty}^{t} \frac{\Gamma\!\left(\frac{v+1}{2}\right)}{\sqrt{\pi v}\;\Gamma\!\left(\frac{v}{2}\right)}\left(1 + \frac{x^2}{v}\right)^{-\frac{v+1}{2}} dx$$

Using the estimated variance (biased $S$ or unbiased $\tilde{S}$), the two-sided interval is
$$\bar{X} - \frac{tS}{\sqrt{n-1}} \le \hat{X} \le \bar{X} + \frac{tS}{\sqrt{n-1}} \qquad \text{or} \qquad \bar{X} - \frac{t\tilde{S}}{\sqrt{n}} \le \hat{X} \le \bar{X} + \frac{t\tilde{S}}{\sqrt{n}}$$
and the one-sided interval is
$$\bar{X} - \frac{tS}{\sqrt{n-1}} \le \hat{X} \qquad \text{or} \qquad \bar{X} - \frac{t\tilde{S}}{\sqrt{n}} \le \hat{X}$$
where
$$T = \frac{\hat{X} - \bar{X}}{S/\sqrt{n-1}} = \frac{\hat{X} - \bar{X}}{\tilde{S}/\sqrt{n}}$$
$$q = 100\int_{-t_c}^{t_c} f_T(t)\,dt = 100\left[F_T(t_c) - F_T(-t_c)\right] \ \text{ for } -t_c \le t \le t_c \text{ (two-sided)}$$
or
$$q = 100\int_{-t_c}^{\infty} f_T(t)\,dt = 100\,F_T(t_c) \ \text{ for } -t_c \le t \text{ ("right-tail")}$$

Reading the t-distribution tables
On-line tables are available at http://www.statsoft.com/textbook/sttable.html
The left column holds the degrees of freedom, v = n - 1, where n is the sample size.
The columns are labeled by "level of significance" (L of S), which is 1 minus the confidence value (C Int). Table F is in percentage.
The entries are the values of t_c, for different degrees of freedom v = n - 1, at which F_T(t_c) = C Int.

L of S   0.40      0.25      0.10      0.05      0.025     0.01      0.005     0.0005
C Int    0.60      0.75      0.90      0.95      0.975     0.99      0.995     0.9995
v
1        0.324920  1.000000  3.077684  6.313752  12.70620  31.82052  63.65674  636.6192
2        0.288675  0.816497  1.885618  2.919986  4.30265   6.96456   9.92484   31.5991
3        0.276671  0.764892  1.637744  2.353363  3.18245   4.54070   5.84091   12.9240
4        0.270722  0.740697  1.533206  2.131847  2.77645   3.74695   4.60409   8.6103
5        0.267181  0.726687  1.475884  2.015048  2.57058   3.36493   4.03214   6.8688
6        0.264835  0.717558  1.439756  1.943180  2.44691   3.14267   3.70743   5.9588
7        0.263167  0.711142  1.414924  1.894579  2.36462   2.99795   3.49948   5.4079
8        0.261921  0.706387  1.396815  1.859548  2.30600   2.89646   3.35539   5.0413
9        0.260955  0.702722  1.383029  1.833113  2.26216   2.82144   3.24984   4.7809
10       0.260185  0.699812  1.372184  1.812461  2.22814   2.76377   3.16927   4.5869
11       0.259556  0.697445  1.363430  1.795885  2.20099   2.71808   3.10581   4.4370
12       0.259033  0.695483  1.356217  1.782288  2.17881   2.68100   3.05454   4.3178
13       0.258591  0.693829  1.350171  1.770933  2.16037   2.65031   3.01228   4.2208
14       0.258213  0.692417  1.345030  1.761310  2.14479   2.62449   2.97684   4.1405
15       0.257885  0.691197  1.340606  1.753050  2.13145   2.60248   2.94671   4.0728
16       0.257599  0.690132  1.336757  1.745884  2.11991   2.58349   2.92078   4.0150
17       0.257347  0.689195  1.333379  1.739607  2.10982   2.56693   2.89823   3.9651
18       0.257123  0.688364  1.330391  1.734064  2.10092   2.55238   2.87844   3.9216
19       0.256923  0.687621  1.327728  1.729133  2.09302   2.53948   2.86093   3.8834
20       0.256743  0.686954  1.325341  1.724718  2.08596   2.52798   2.84534   3.8495
21       0.256580  0.686352  1.323188  1.720743  2.07961   2.51765   2.83136   3.8193
22       0.256432  0.685805  1.321237  1.717144  2.07387   2.50832   2.81876   3.7921
23       0.256297  0.685306  1.319460  1.713872  2.06866   2.49987   2.80734   3.7676
24       0.256173  0.684850  1.317836  1.710882  2.06390   2.49216   2.79694   3.7454
25       0.256060  0.684430  1.316345  1.708141  2.05954   2.48511   2.78744   3.7251
26       0.255955  0.684043  1.314972  1.705618  2.05553   2.47863   2.77871   3.7066
27       0.255858  0.683685  1.313703  1.703288  2.05183   2.47266   2.77068   3.6896
28       0.255768  0.683353  1.312527  1.701131  2.04841   2.46714   2.76326   3.6739
29       0.255684  0.683044  1.311434  1.699127  2.04523   2.46202   2.75639   3.6594
30       0.255605  0.682756  1.310415  1.697261  2.04227   2.45726   2.75000   3.6460
inf      0.253347  0.674490  1.281552  1.644854  1.95996   2.32635   2.57583   3.2905

Examples of use:

Exercise 4-4.2
A very large population of resistor values has a true mean of 100 ohms and a sample standard deviation of 4 ohms. Find the confidence interval on the sample mean for a confidence level of 95% if it is computed from:

a) a sample size of 100.
v = 99. Using v = 60 (no 100 given) and F = 0.975 (two-sided test) on p. 436, t = 2.00. Therefore
$$\bar{X} - \frac{t\tilde{S}}{\sqrt{n}} \le \hat{X} \le \bar{X} + \frac{t\tilde{S}}{\sqrt{n}}, \qquad \frac{t\tilde{S}}{\sqrt{n}} = \frac{2.00\cdot 4}{\sqrt{100}} = 0.8$$
$$100 - 0.8 \le \hat{X} \le 100 + 0.8 \qquad \text{or} \qquad 99.2 \le \hat{X} \le 100.8$$
Using v = 120 (no 100 given) and F = 0.975 (two-sided test) on p. 436, t = 1.98.
$$\frac{t\tilde{S}}{\sqrt{n}} = \frac{1.98\cdot 4}{\sqrt{100}} = 0.792, \qquad 99.208 \le \hat{X} \le 100.792$$

b) a sample size of 9.
v = 8. Using v = 8 and F = 0.975 (two-sided test) on p. 436, t = 2.306. Therefore
$$\frac{t\tilde{S}}{\sqrt{n}} = \frac{2.306\cdot 4}{\sqrt{9}} = 3.075, \qquad 100 - 3.075 \le \hat{X} \le 100 + 3.075$$

These answers differ from the text because they use the two-sided value (F = 0.975) rather than the single-sided value (F = 0.95). If you use the one-sided value, t = 1.86 at F = 0.95 (a 90% two-sided interval), then one of the textbook solutions can be recognized!

HW 4-4.2 (revisited with an estimated variance)
A very large population of bipolar transistors has a current gain with a mean value of 120 and a standard deviation of 10. The values of current gain may be assumed to be independent Gaussian random variables.

b) Repeat part (a) if the sample size is 21.
With a known variance, a two-sided test at 90% means that k = 1.645.
$$\frac{k\sigma}{\sqrt{n}} = \frac{1.645\cdot 10}{\sqrt{21}} = 3.590, \qquad 120 - 3.590 \le \hat{X} \le 120 + 3.590$$
If the variance were an estimated variance instead of a known variance: v = 20. Using v = 20 and F = 0.95 (two-sided test at 90%) on p. 436, t = 1.725. Therefore
$$\frac{t\tilde{S}}{\sqrt{n}} = \frac{1.725\cdot 10}{\sqrt{21}} = 3.764, \qquad 120 - 3.764 \le \hat{X} \le 120 + 3.764$$
Notice that using an estimated variance results in a greater range of values (because of the differences in the density functions).

Skill 17-2
A cereal vendor's quality control department has just tested a random sample of 10 "20 ounce" boxes of Oat Flakes by weighing them, in order to see if their 20 ounce claim is to be believed. Their report, to be forwarded to management, must include a 95% confidence interval for the population mean.
a) Find the unbiased mean and standard deviation.
b) Determine the 95% confidence interval of the mean (by using the Student's t table).
c) In general, if the confidence interval becomes tighter (smaller), would the confidence level increase or decrease?

Measurement data: 19, 18, 21, 21, 18, 22, 17, 19, 20, and 17.

a) Sample mean:
$$\hat{X} = \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{1}{10}\left(19+18+21+21+18+22+17+19+20+17\right) = \frac{192}{10} = 19.2$$
Unbiased variance:
$$\tilde{S}^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \hat{X}\right)^2$$
$$\tilde{S}^2 = \frac{1}{9}\left(0.2^2+1.2^2+1.8^2+1.8^2+1.2^2+2.8^2+2.2^2+0.2^2+0.8^2+2.2^2\right) = \frac{27.6}{9} = 3.067$$
$$\tilde{S} = \sqrt{\frac{27.6}{9}} = \sqrt{3.067} = 1.751$$

b) v = 9. Using v = 9 and F = 0.975 (two-sided test) on p. 436, t = 2.262. Since here the sample mean is measured and the population mean is the unknown, the interval is written about the sample mean:
$$\hat{X} - \frac{t\tilde{S}}{\sqrt{n}} \le \bar{X} \le \hat{X} + \frac{t\tilde{S}}{\sqrt{n}}, \qquad \frac{t\tilde{S}}{\sqrt{n}} = \frac{2.262\cdot 1.751}{\sqrt{10}} = 1.252$$
$$19.2 - 1.252 \le \bar{X} \le 19.2 + 1.252 \qquad \text{or} \qquad 17.948 \le \bar{X} \le 20.452$$

c) As the confidence interval becomes tighter (smaller), with q% going down, the confidence level decreases.

4-5 Hypothesis Testing

Now that we have concepts of acceptable intervals based on collected statistics, we can relate them to whether a statement based on the statistics is acceptable or not: statistical decision making.

A statement is made in terms of a Hypothesis. The goal of interpreting the measured values is to accept or reject the Hypothesis.

Examples:
- the coin being flipped is fair
- two random noise signals (processes) have the same mean
- the lifetime stated on a light bulb is an appropriate description of the mean
- the lifetime stated on the light bulb is an appropriate description of a minimum
- the sample mean measured for a set of components is within the 95% confidence interval of the desired mean value (1% resistors are within 1% of value with 95% confidence)

When there is only one Hypothesis, it is referred to as the null Hypothesis (H0).
When there are multiple Hypotheses, we can generate criteria to accept one over another (establishing thresholds for decision making).

Null Hypothesis Testing
A significance test based on a decision rule must be determined.
The significance test establishes a level, potentially the confidence level or confidence interval, used to determine whether to accept or reject the hypothesis. This is stated as a decision rule where
Accept H0: if the computed value "passes" the significance test.
Reject H0: if the computed value "fails" the significance test.

In general this involves a significance test that defines an equation or function that can be computed from the measured data (statistics). A performance threshold is then defined that sets an accept/reject or pass/fail boundary:
- inside or outside the confidence interval
- acceptably meets the desired criteria or not

Example: (p. 174)
A capacitor manufacturer claims that the capacitors have a mean breakdown voltage of 300 V or greater. We establish a significance test at a 99% confidence level. (In this case, values above the lower confidence limit are acceptable, so this is a one-sided test.)

In testing, 100 capacitors are tested (note this is a destructive test), giving a mean value of 290 V with an unbiased sample standard deviation of 40 V. Is the Hypothesis accepted or rejected?

The significance test at the 99% confidence level requires
$$\bar{X} - \frac{t\tilde{S}}{\sqrt{n}} \le \hat{X}$$
Using v = 99 and F = 0.99 (one-sided test) on p. 436, t = 2.358 (using v = 120). Therefore
$$\frac{t\tilde{S}}{\sqrt{n}} = \frac{2.358\cdot 40}{\sqrt{100}} = 9.432$$
$$300 - 9.432 \le \hat{X} \qquad \text{or} \qquad 290.568 \le \hat{X}$$

The measured mean of 290 V is below 290.568 V, so the measurement results cause us to reject the Hypothesis! Therefore, we would say that the claimed mean is not valid.

Textbook example: the textbook did this for a Gaussian distribution, assuming that a sample size of 100 is large enough not to worry about the t distribution.
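The decision rule for the capacitor example can be written out as a few lines of code. This is a sketch under stated assumptions: the critical values are typed in from the tables in these notes (t = 2.358 for v = 120 at F = 0.99, and the one-tail Gaussian value 2.3263 at 99%) rather than computed, and the helper name `one_sided_lower_threshold` is hypothetical.

```python
from math import sqrt


def one_sided_lower_threshold(claimed_mean, s_unbiased, n, critical_value):
    """Lowest sample mean that still passes a one-sided significance test."""
    return claimed_mean - critical_value * s_unbiased / sqrt(n)


claimed_mean = 300.0   # manufacturer's claimed mean breakdown voltage (V)
measured_mean = 290.0  # sample mean from 100 tested capacitors (V)
s_unbiased = 40.0      # unbiased sample standard deviation (V)
n = 100

# Critical values copied from the tables in the notes (assumed, not computed).
t_threshold = one_sided_lower_threshold(claimed_mean, s_unbiased, n, 2.358)
z_threshold = one_sided_lower_threshold(claimed_mean, s_unbiased, n, 2.3263)

accept_t = measured_mean >= t_threshold
accept_z = measured_mean >= z_threshold
print(f"Student's t: threshold {t_threshold:.3f} V, accept H0: {accept_t}")
print(f"Gaussian:    threshold {z_threshold:.3f} V, accept H0: {accept_z}")
```

Both thresholds lie above the measured 290 V, so the Hypothesis is rejected whether the t value or the Gaussian value used by the textbook is applied.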
Final note: a 99% confidence level was selected for the significance test in this example. If a 99.5% level had been selected, the Hypothesis would have been accepted! (300 - 10.468 = 289.532)

To alleviate (?) this confusion, a "level of significance" can be defined as 1.0 minus the confidence level. The above test was therefore at a 1% level of significance. The Hypothesis is rejected at a 1% level of significance but would be accepted at a 0.5% level of significance.

Example: Testing a fair coin
The binomial random variable allows us to develop a test to see if a coin is fair when flipped. We count the number of heads that occur and test at a 5% level of significance (a 95% confidence level).
Accept H0: if the number of heads is inside the 95% confidence interval, the coin is fair.
Reject H0: if the number of heads is outside the 95% confidence interval, the coin is not fair.

We assume that the number of trials is large enough for Gaussian statistics to apply, with the mean and variance known in advance for a fair coin (p = 0.5) from the binomial random variable. For the number of heads X in n flips, the mean and variance are
$$E[X] = \bar{X} = n\,p \qquad \text{and} \qquad \mathrm{Var}(X) = \sigma_X^2 = n\,p\,(1-p)$$
Assuming 100 trials, the mean is 50 and the standard deviation is $\sigma_X = \sqrt{100\cdot 0.5\cdot 0.5} = 5$. We use a two-sided test, therefore k = 1.96 and
$$\bar{X} - k\sigma_X \le X \le \bar{X} + k\sigma_X, \qquad k\sigma_X = 1.96\cdot 5 = 9.8$$
Therefore the test region for the Hypothesis is
$$50 - 9.8 \le X \le 50 + 9.8 \qquad \text{or} \qquad 40.2 \le X \le 59.8$$
Now you have a criterion to establish whether a coin is fair.

From MATLAB: Pr(46<=x<=54) = 0.631798. A narrower region of 50 ± 4.9 (45.1 to 54.9), which results if the variance np(1-p) = 25 is used in place of the standard deviation and divided by $\sqrt{n} = 10$, therefore holds only about 63% of the outcomes for a fair coin and is not a 95% region.

Example: Hypothesis testing in communications
Signal plus noise is input to the receiver. The "digital symbol" receiver outputs a value corresponding to a symbol plus noise. Hypothesis testing establishes the rules used to select one symbol over another. Incorrect selections result in symbol errors, and in digital bit errors once the symbols are translated into bits.
$$r(t) = s_i(t) + n(t), \quad \text{for } i\,T \le t \le (i+1)\,T$$
Bernard Sklar, "Digital Communications, Fundamentals and Applications," Prentice Hall PTR, Second Edition, 2001. Appendix B.

For each symbol, a detection statistic is generated for the symbol period T:
$$z(T) = a_i(T) + n_0(T)$$
Hypothesis testing then determines the estimated symbol value from the detection statistic. The number of Hypotheses is equal to the number of possible symbols transmitted.

4-6 Curve Fitting and Linear Regression

Fitting lines to scatter plots. Data are provided as (x, y) pairs.

Is there a function that goes through all the points? Yes … if you want to use a polynomial of degree n-1 for n pairs!

But we usually want simple curves to represent the data, like lines or parabolas, where
$$y = a + b\,x \qquad \text{or} \qquad y = a + b\,x + c\,x^2$$

To fit the curve we minimize the following function of the polynomial (thus minimizing the squared error):
$$\sum_{i=1}^{n}\left[y_i - \left(a + b\,x_i + c\,x_i^2\right)\right]^2$$
For a linear regression (a line), we have
$$err = \sum_{i=1}^{n}\left[y_i - \left(a + b\,x_i\right)\right]^2$$
To minimize for the values a and b, take the derivatives and set them equal to zero.
Then solve for a and b (all sums run over i = 1 to n):
$$\frac{d\,err}{da} = -2\sum\left[y_i - (a + b\,x_i)\right] = 0 \quad\Rightarrow\quad \sum y_i = n\,a + b\sum x_i$$
$$\frac{d\,err}{db} = -2\sum\left[y_i - (a + b\,x_i)\right]x_i = 0 \quad\Rightarrow\quad \sum y_i x_i = a\sum x_i + b\sum x_i^2$$
Solving for the minimum (from d/da):
$$a = \frac{1}{n}\left[\sum y_i - b\sum x_i\right]$$
and (substituting a into d/db):
$$b = \frac{n\sum y_i x_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}$$

Proof:
Working with d/da:
$$\sum y_i = n\,a + b\sum x_i \quad\Rightarrow\quad n\,a = \sum y_i - b\sum x_i \quad\Rightarrow\quad a = \frac{1}{n}\left[\sum y_i - b\sum x_i\right]$$
Substituting a into d/db and solving for b:
$$\sum y_i x_i = a\sum x_i + b\sum x_i^2 = \frac{1}{n}\left[\sum y_i - b\sum x_i\right]\sum x_i + b\sum x_i^2$$
$$\sum y_i x_i - \frac{1}{n}\sum y_i\sum x_i = b\left[\sum x_i^2 - \frac{1}{n}\left(\sum x_i\right)^2\right]$$
$$b = \frac{\sum y_i x_i - \frac{1}{n}\sum y_i\sum x_i}{\sum x_i^2 - \frac{1}{n}\left(\sum x_i\right)^2} = \frac{n\sum y_i x_i - \sum y_i\sum x_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}$$
and using b in the d/da equation solution:
$$a = \frac{1}{n}\sum y_i - \frac{1}{n}\sum x_i\cdot\frac{n\sum y_i x_i - \sum y_i\sum x_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}$$
$$a = \frac{\frac{1}{n}\sum y_i\left[n\sum x_i^2 - \left(\sum x_i\right)^2\right] - \frac{1}{n}\sum x_i\left[n\sum y_i x_i - \sum y_i\sum x_i\right]}{n\sum x_i^2 - \left(\sum x_i\right)^2}$$
$$a = \frac{\sum y_i\sum x_i^2 - \sum x_i\sum y_i x_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}$$

Alternate formulation (using the statistical means of x and y, $\hat{X}$ and $\hat{Y}$):
$$b = \frac{\frac{1}{n}\sum y_i x_i - \hat{Y}\hat{X}}{\frac{1}{n}\sum x_i^2 - \hat{X}^2}, \qquad a = \frac{\hat{Y}\,\frac{1}{n}\sum x_i^2 - \hat{X}\,\frac{1}{n}\sum y_i x_i}{\frac{1}{n}\sum x_i^2 - \hat{X}^2}$$

Also, to form estimates of the correlation and covariance:
$$R_{XY} = \frac{1}{n}\sum y_i x_i, \qquad C_{XY} = R_{XY} - \hat{Y}\hat{X}$$
$$S_X^2 = \frac{1}{n}\sum x_i^2 - \hat{X}^2, \qquad S_Y^2 = \frac{1}{n}\sum y_i^2 - \hat{Y}^2$$
$$\rho_{XY} = \frac{C_{XY}}{\sqrt{S_X^2\,S_Y^2}}, \qquad b = \frac{C_{XY}}{S_X^2}, \qquad a = \hat{Y} - b\,\hat{X} = \frac{\hat{Y}\,S_X^2 - \hat{X}\,C_{XY}}{S_X^2}$$
See HW_4_6_1.m

Correlation of a discrete random variable
If we assume that every x is equally likely, the pmf has the same value, 1/n, for every x. Repeated values simply sum the probability at the point. So, with $f(x) = \sum_{i=1}^{n}\frac{1}{n}\,\delta(x - x_i)$:

Mean or 1st moment:
$$E[X] = \int x\,f(x)\,dx = \int x\sum_{i=1}^{n}\frac{1}{n}\,\delta(x - x_i)\,dx = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{X}$$
2nd moment:
$$E[X^2] = \int x^2 f(x)\,dx = \frac{1}{n}\sum_{i=1}^{n} x_i^2 = R_{XX}$$
2nd central moment:
$$E\left[(X - \bar{X})^2\right] = \int (x - \bar{X})^2 f(x)\,dx = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{X}\right)^2 = \frac{1}{n}\sum_{i=1}^{n}\left(x_i^2 - 2x_i\bar{X} + \bar{X}^2\right)$$
$$E\left[(X - \bar{X})^2\right] = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - 2\bar{X}^2 + \bar{X}^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \bar{X}^2$$
$$C_{XX} = \sigma_X^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \left(\frac{1}{n}\sum_{i=1}^{n} x_i\right)^2 = R_{XX} - \bar{X}^2$$

Correlation between discrete random variables
For two sequences or paired groupings (x, y): if we assume that every (x, y) pair is equally likely, the pmf has the same value, 1/n, for every pair. Repeated pairs simply sum the probability at the point.
So, for correlation (all sums run over i = 1 to n),
$$E[XY] = \iint x\,y\,f(x,y)\,dx\,dy, \qquad f(x_i, y_i) = \frac{1}{n} \ \text{for all pairs } (x_i, y_i),\ i = 1 \text{ to } n$$
$$E[XY] = \iint x\,y\sum\frac{1}{n}\,\delta(x - x_i)\,\delta(y - y_i)\,dx\,dy$$
$$R_{XY} = E[XY] = \frac{1}{n}\sum x_i y_i$$

Defining the covariance:
$$E\left[(X-\bar{X})(Y-\bar{Y})\right] = \iint (x-\bar{X})(y-\bar{Y})\,f(x,y)\,dx\,dy = \frac{1}{n}\sum\left(x_i - \bar{X}\right)\left(y_i - \bar{Y}\right)$$
$$E\left[(X-\bar{X})(Y-\bar{Y})\right] = \frac{1}{n}\sum\left(x_i y_i - x_i\bar{Y} - y_i\bar{X} + \bar{X}\bar{Y}\right) = \frac{1}{n}\sum x_i y_i - \bar{X}\bar{Y} - \bar{X}\bar{Y} + \bar{X}\bar{Y}$$
$$C_{XY} = E\left[(X-\bar{X})(Y-\bar{Y})\right] = \frac{1}{n}\sum x_i y_i - \bar{X}\bar{Y}$$

The Discrete Correlation Coefficient
For two sequences or paired groupings (x, y): if we assume that every (x, y) pair is equally likely, the pmf has the same value, 1/n, for every pair. Repeated pairs simply sum the probability at the point. So,
$$\rho_{XY} = E\left[\frac{X-\bar{X}}{\sigma_X}\cdot\frac{Y-\bar{Y}}{\sigma_Y}\right] = \iint \frac{(x-\bar{X})(y-\bar{Y})}{\sigma_X\,\sigma_Y}\,f(x,y)\,dx\,dy = \frac{1}{n}\sum\frac{\left(x_i-\bar{X}\right)\left(y_i-\bar{Y}\right)}{\sigma_X\,\sigma_Y}$$
$$\rho_{XY} = \frac{1}{\sigma_X\,\sigma_Y}\cdot\frac{1}{n}\sum\left(x_i y_i - x_i\bar{Y} - y_i\bar{X} + \bar{X}\bar{Y}\right) = \frac{\frac{1}{n}\sum x_i y_i - \bar{X}\bar{Y}}{\sigma_X\,\sigma_Y} = \frac{C_{XY}}{\sigma_X\,\sigma_Y}$$
$$\rho_{XY} = \frac{\frac{1}{n}\sum x_i y_i - \frac{1}{n}\sum x_i\cdot\frac{1}{n}\sum y_i}{\sqrt{\frac{1}{n}\sum x_i^2 - \left(\frac{1}{n}\sum x_i\right)^2}\;\sqrt{\frac{1}{n}\sum y_i^2 - \left(\frac{1}{n}\sum y_i\right)^2}}$$
The text defines this as Pearson's r, the linear correlation coefficient between two sets of data!

Based on the discrete terms, linear estimation becomes
$$a = \frac{\hat{Y}\,R_{XX} - \hat{X}\,R_{XY}}{R_{XX} - \hat{X}^2} = \frac{\hat{Y}\,R_{XX} - \hat{X}\,R_{XY}}{C_{XX}} \qquad \text{and} \qquad b = \frac{R_{XY} - \hat{Y}\hat{X}}{R_{XX} - \hat{X}^2} = \frac{C_{XY}}{C_{XX}}$$

Pavlovian conditioning for sampled data … compute
x: mean, 2nd moment, variance ($\bar{X}$, $R_{XX}$, and $\sigma_X^2$)
y: mean, 2nd moment, variance ($\bar{Y}$, $R_{YY}$, and $\sigma_Y^2$)
x and y: $R_{XY}$, $C_{XY}$, and $\rho_{XY}$

$$\bar{X} = E[X] = \frac{1}{n}\sum x_i, \qquad R_{XX} = E[X^2] = \frac{1}{n}\sum x_i^2$$
$$\sigma_X^2 = C_{XX} = \frac{1}{n}\sum x_i^2 - \left(\frac{1}{n}\sum x_i\right)^2 = R_{XX} - \bar{X}^2$$
$$R_{XY} = E[XY] = \frac{1}{n}\sum x_i y_i, \qquad C_{XY} = E\left[(X-\bar{X})(Y-\bar{Y})\right] = \frac{1}{n}\sum x_i y_i - \bar{X}\bar{Y}$$
$$\rho_{XY} = \frac{C_{XY}}{\sigma_X\,\sigma_Y}$$
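The "compute the moments, then form a, b, and ρ" recipe above translates almost line for line into code. The sketch below is an illustration only: it is written in Python rather than the MATLAB of HW_4_6_1.m, the function name `regression_and_correlation` is hypothetical, and the five (x, y) pairs are made-up data chosen to lie near the line y = 1 + 2x.

```python
from math import sqrt


def regression_and_correlation(xs, ys):
    """Fit y = a + b*x by least squares using the discrete moments, and
    return (a, b, rho) where rho is Pearson's r."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    r_xx = sum(x * x for x in xs) / n               # 2nd moment of x
    r_yy = sum(y * y for y in ys) / n               # 2nd moment of y
    r_xy = sum(x * y for x, y in zip(xs, ys)) / n   # correlation
    c_xx = r_xx - x_mean ** 2                       # variance of x
    c_yy = r_yy - y_mean ** 2                       # variance of y
    c_xy = r_xy - x_mean * y_mean                   # covariance
    b = c_xy / c_xx
    a = (y_mean * r_xx - x_mean * r_xy) / c_xx
    rho = c_xy / sqrt(c_xx * c_yy)
    return a, b, rho


# Made-up example data (an assumption for illustration only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]
a, b, rho = regression_and_correlation(xs, ys)
print(f"a = {a:.3f}, b = {b:.3f}, rho = {rho:.4f}")

# Points that lie exactly on y = 3 + 2x should give a = 3, b = 2, rho = 1.
a_line, b_line, rho_line = regression_and_correlation(xs, [3 + 2 * x for x in xs])
```

A value of ρ close to 1 indicates that a straight line describes the data well; the exact-line case is a quick sanity check that the moment formulas reproduce the known intercept and slope.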