DAO1704/DSC1007 Lecture 4: Discrete Probability Distributions

Agenda
• Binomial Distribution
• Poisson Distribution
• Covariance and correlation
• Joint probability distributions and independence
• Linear Functions of a Random Variable
• Sums of Random Variables

Binomial Distribution

Summary Measures of Probability Distributions

Where do probability distributions come from?
I. Empirically (from data)

Example: KFC sells chicken in "buckets" of 2, 3, 4, 8, 12, 16 or 20 pieces. Over the last week, orders for fried chicken had the following data:

Pieces in order   2      3      4      8      12     16     20     Total
Orders            170    200    260    165    120    50     35     1,000

Let X = number of pieces of chicken in an order. Develop a probability distribution for X.

x_i               2      3      4      8      12     16     20
Probability       0.170  0.200  0.260  0.165  0.120  0.050  0.035

• If there are 10 tutors participating in this experiment, what is the probability that 3 international students are chosen?

Binomial Distribution
• n independent trials
• Each trial has exactly two outcomes: "success" or "failure"
• Each trial has the same probabilities: success p, failure 1 − p

If X is the number of successes in the n trials, we say that X obeys a binomial distribution with parameters n and p: Binomial(n, p) or B(n, p).

\[ P(X = x) = \frac{n!}{x!\,(n-x)!}\, p^{x} (1-p)^{n-x} \quad \text{for } x = 0, 1, \ldots, n \]

Expected Value and Variance
If X obeys a binomial distribution with parameters n and p, then the mean, variance and standard deviation of X are:

Mean            \( E(X) = \mu_X = np \)
Variance        \( \mathrm{Var}(X) = \sigma_X^2 = np(1-p) \)
Std deviation   \( \sigma_X = \sqrt{np(1-p)} \)

Excel function: BINOMDIST(x, n, p, cumulative)
cumulative = 0 (or FALSE) → P(X = x)
cumulative = 1 (or TRUE) → P(X ≤ x)

Example summary: X = number of lasers (out of 15) that will pass the test, X ~ Binomial(15, 0.75)
P(X = 15) = BINOMDIST(15, 15, 0.75, 0) = 0.013363
P(X ≥ 14) = 1 − P(X ≤ 13) = 1 − BINOMDIST(13, 15, 0.75, 1) = 0.0802

Application of Binomial Distribution

It all starts with a mysterious phone call…

"Hi Dear friend. This is Mike. I am an investment broker from the L.Q.Z. Company. … Yes! L.Q.Z. Now you remember ah. The L.Q.Z. loh. I have some very good news to share with you. We have done very thorough research on 400 mutual funds. You know what? We found a star fund that has beaten a standard market index in 37 out of 52 weeks. YES! 37 out of 52 weeks! If you invested in this fund, simple math can tell you how much you can earn. Your math must be good, right? Sure one… So… why bother studying so hard? Invest with me and you will be rich soon."

Question 1
• We say a fund beats the market purely by chance if each week the fund has a fifty-fifty chance of beating the market index, independently of its performance in other weeks.
• What is the probability that such a fund beats the market in at least 37 out of 52 weeks?

Question 2
• Suppose that all 400 funds beat the market purely by chance. What is the probability that the best of them beats the market in at least 37 out of 52 weeks?
• Conclusion?
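The binomial formula and the two fund questions above can be checked numerically. The sketch below is one possible way to do it in plain Python rather than Excel; the helper names `binom_pmf` and `binom_cdf` are made up for illustration, and Question 2 additionally assumes that the 400 funds perform independently of one another, which the slides do not state explicitly.

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Binomial(n, p); mirrors BINOMDIST(x, n, p, FALSE)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x) for X ~ Binomial(n, p); mirrors BINOMDIST(x, n, p, TRUE)."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))


# Laser example from the slides: X ~ Binomial(15, 0.75).
p_all_pass = binom_pmf(15, 15, 0.75)         # P(X = 15)
p_at_least_14 = 1 - binom_cdf(13, 15, 0.75)  # P(X >= 14) = 1 - P(X <= 13)

# Question 1: a single fund with a fifty-fifty chance each week, 52 weeks.
p_one_fund = 1 - binom_cdf(36, 52, 0.5)      # P(X >= 37)

# Question 2: ASSUMPTION (not stated in the slides): the 400 funds are
# independent, so "the best fund reaches 37" means "not all funds fall short".
p_best_of_400 = 1 - (1 - p_one_fund) ** 400

print(f"P(X = 15)              = {p_all_pass:.6f}")
print(f"P(X >= 14)             = {p_at_least_14:.4f}")
print(f"one fund, >= 37 of 52  = {p_one_fund:.6f}")
print(f"best of 400, >= 37     = {p_best_of_400:.4f}")
```

Under these assumptions a single chance-driven fund reaches 37 winning weeks only rarely (roughly 0.0016), while the best of 400 such funds does so close to half the time (roughly 0.47). This suggests one reading of the "Conclusion?" prompt: a record picked as the best out of many is much less impressive than the same record from a fund chosen in advance.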
Poisson Distribution

Bortkiewicz (1898), The Law of Small Numbers: soldiers killed by horse kicks, by year and corps

Year  GC  C1  C2  C3  C4  C5  C6  C7  C8  C9  C10 C11 C14 C15
1875  0   0   0   0   0   0   0   1   1   0   0   0   1   0
1876  2   0   0   0   1   0   0   0   0   0   0   0   1   1
1877  2   0   0   0   0   0   1   1   0   0   1   0   2   0
1878  1   2   2   1   1   0   0   0   0   0   1   0   1   0
1879  0   0   0   1   1   2   2   0   1   0   0   2   1   0
1880  0   3   2   1   1   1   0   0   0   2   1   4   3   0
1881  1   0   0   2   1   0   0   1   0   1   0   0   0   0
1882  1   2   0   0   0   0   1   0   1   1   2   1   4   1
1883  0   0   1   2   0   1   2   1   0   1   0   3   0   0
1884  3   0   1   0   0   0   0   1   0   0   2   0   1   1
1885  0   0   0   0   0   0   1   0   0   2   0   1   0   1
1886  2   1   0   0   1   1   1   0   0   1   0   1   3   0
1887  1   1   2   1   0   0   3   2   1   1   0   1   2   0
1888  0   1   1   0   0   1   1   0   0   0   0   1   1   0
1889  0   0   1   1   0   1   1   0   0   1   2   2   0   2
1890  1   2   0   2   0   1   1   2   0   2   1   1   2   2
1891  0   0   0   1   1   1   0   1   1   0   3   3   1   0
1892  1   3   2   0   1   1   3   0   1   1   0   1   1   0
1893  0   1   0   0   0   1   0   2   0   0   1   3   0   0
1894  1   0   0   0   0   0   0   0   1   0   1   1   0   0

• The Poisson distribution was derived by the French mathematician Siméon Poisson in 1837.
• His name is one of the 72 names inscribed on the Eiffel Tower.

The Poisson distribution is useful for modelling the number of occurrences of an event over a specified interval of time or space.

Examples:
• number of customer orders received in one hour
• number of failures in a large computer system per month

Properties:
• The probability of an occurrence is the same for any two intervals of equal length.
• Occurrences in nonoverlapping intervals are independent of one another.

Examples and Applications:
– The number of phone calls arriving at a call centre per minute.
– The number of goals in sports involving two competing teams.
– The number of infant deaths per year.
– The number of defects found per newly TOP flat.
– …

A random variable X is said to be a Poisson r.v. with parameter λ (> 0) if it has the probability function

\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!} \quad \text{for } x = 0, 1, 2, \ldots \]

Note: X is a discrete r.v. that takes on values 0, 1, 2, . . .

\[ P(X = 0) = \frac{e^{-\lambda} \lambda^{0}}{0!} = e^{-\lambda}, \qquad P(X = 1) = \frac{e^{-\lambda} \lambda^{1}}{1!} = \lambda e^{-\lambda}, \qquad \ldots \]

It can be shown that
Mean       E(X) = λ
Variance   Var(X) = λ

Thus, the parameter λ can be interpreted as the average number of occurrences per unit of time or space.

Example
Patients arrive at the A & E of a hospital at an average rate of 6 per hour on weekend evenings. What is the probability of 4 arrivals in 30 minutes on a weekend evening?

We can expect patient arrivals to be approximately Poisson. The average arrival rate is 6 per hour, so 3 per 30 minutes. Let X be the number of patient arrivals in 30 minutes. X is Poisson with parameter λ = 3.

\[ P(X = 4) = \frac{e^{-3}\, 3^{4}}{4!} = \frac{0.0497871 \times 81}{24} \approx 0.1680 \]

Excel function: POISSON(x, λ, cumulative)
cumulative = 0 (or FALSE) → P(X = x)
cumulative = 1 (or TRUE) → P(X ≤ x)

Example summary: Let X be the number of patient arrivals in 30 minutes. X is Poisson with parameter λ = 3.
P(X = 4) = POISSON(4, 3, 0) = 0.1680

Example
• In 1898 L. J. Bortkiewicz published a book entitled The Law of Small Numbers. He used data collected over 20 years to show that the number of soldiers killed by horse kicks each year in each corps in the Prussian cavalry followed a Poisson distribution with a mean of 0.63.
– What is the probability that at least one death is caused by horse kick in a corps in a year?
– What is the probability that there is no death caused by horse kick in a corps over 6 years?
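The Poisson probabilities in the A & E example and the two horse-kick questions can be worked out with a few lines of Python. This is a sketch, not the lecture's own solution: the helper name `poisson_pmf` is made up for illustration, and the six-year question relies on the stated property that occurrences in nonoverlapping intervals are independent, so that the count over 6 years is treated as Poisson with mean 6 × 0.63.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Poisson(lam); mirrors POISSON(x, lam, FALSE)."""
    return exp(-lam) * lam**x / factorial(x)


# A & E example: 6 arrivals per hour -> lam = 3 per 30 minutes.
p_four_arrivals = poisson_pmf(4, 3.0)

# Horse-kick example: lam = 0.63 deaths per corps per year.
lam_year = 0.63

# At least one death in a corps in a year: 1 - P(X = 0).
p_at_least_one = 1 - poisson_pmf(0, lam_year)

# No death in a corps over 6 years.
# ASSUMPTION: years are nonoverlapping, independent intervals with the same
# rate, so the 6-year count is Poisson with mean 6 * 0.63 = 3.78.
p_none_in_six_years = poisson_pmf(0, 6 * lam_year)

print(f"P(4 arrivals in 30 min)   = {p_four_arrivals:.4f}")
print(f"P(at least one death)     = {p_at_least_one:.4f}")
print(f"P(no death in six years)  = {p_none_in_six_years:.4f}")
```

The same six-year answer can be reached by raising the one-year probability of no death to the sixth power, since exp(−0.63)⁶ = exp(−3.78).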
Linear Functions of a Random Variable

Example
Suppose daily demand for croissants at a bakery shop is given by

Daily Demand   Probability
60             0.05
64             0.15
68             0.20
72             0.25
75             0.15
77             0.10
80             0.10

Let X = daily demand for croissants. We can easily compute
E(X) = 71.15
Var(X) = 29.5275

Suppose it costs $135 per day to run the croissant operation, and that the cost of producing one croissant is $0.75.

Daily cost of croissant operations: Y = 0.75 X + 135

X     Probability   Y = 0.75X + 135
60    0.05          180.00
64    0.15          183.00
68    0.20          186.00
72    0.25          189.00
75    0.15          191.25
77    0.10          192.75
80    0.10          195.00

E(Y) = ? Var(Y) = ? How are the means and variances related?

E(Y) = 0.75 E(X) + 135 = 0.75 × 71.15 + 135 = 188.3625
Var(Y) = 0.75² Var(X) = 0.5625 × 29.5275 ≈ 16.6092

Note: If Y = aX + b, then
E(Y) = a E(X) + b
Var(Y) = a² Var(X)
The formulas apply to continuous r.v.'s as well.

Covariance and Correlation

How do we summarize the relationship between two variables? Specifically: how do we summarize what we observe in a scatter plot?

Examples:
• Unemployment rate vs crime rate
• Stock market vs property market
• Time spent on DSC1007 vs DSC1007 exam marks

Q: How do we describe the relationship between two r.v.'s?

Example: A chain of upscale cafés sells gourmet hot coffees and cold beverages. From past sales data, daily sales at one of their cafés obey the following probability distribution for (X, Y), where X = # hot coffees and Y = # cold beverages sold per day.

Probability   No. of Hot Coffees Sold   No. of Cold Drinks Sold
p_i           x_i                       y_i
0.10          360                       360
0.10          790                       110
0.15          840                       30
0.05          260                       90
0.15          190                       450
0.10          300                       230
0.10          490                       60
0.10          150                       290
0.10          550                       140
0.05          510                       290

Mean                 457.00                    210.00
Standard deviation   244.28                    145.64

From the scatter plot, we might conclude that the sales of hot coffees and the sales of cold beverages are negatively related. Is that correct?
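One way to examine the claim about the café data is to compute the summary measures directly from the joint table. The sketch below computes the means, the standard deviations, and the probability-weighted covariance \( \sum_i p_i (x_i - \mu_X)(y_i - \mu_Y) \), which the lecture defines in the slides that follow; the variable names are illustrative, not part of the course material.

```python
from math import sqrt

# Joint distribution of (X, Y) from the cafe table: each row i has
# probability p[i], hot coffees x[i] and cold drinks y[i].
p = [0.10, 0.10, 0.15, 0.05, 0.15, 0.10, 0.10, 0.10, 0.10, 0.05]
x = [360, 790, 840, 260, 190, 300, 490, 150, 550, 510]
y = [360, 110, 30, 90, 450, 230, 60, 290, 140, 290]

# Means: probability-weighted averages.
mean_x = sum(pi * xi for pi, xi in zip(p, x))
mean_y = sum(pi * yi for pi, yi in zip(p, y))

# Variances and standard deviations.
var_x = sum(pi * (xi - mean_x) ** 2 for pi, xi in zip(p, x))
var_y = sum(pi * (yi - mean_y) ** 2 for pi, yi in zip(p, y))
sd_x, sd_y = sqrt(var_x), sqrt(var_y)

# Covariance: probability-weighted product of deviations from the means.
cov_xy = sum(pi * (xi - mean_x) * (yi - mean_y) for pi, xi, yi in zip(p, x, y))

print(f"E(X) = {mean_x:.2f}, SD(X) = {sd_x:.2f}")
print(f"E(Y) = {mean_y:.2f}, SD(Y) = {sd_y:.2f}")
print(f"Cov(X, Y) = {cov_xy:.0f}")
```

The covariance comes out negative and agrees with the value of −27,260 quoted later in the lecture, which is consistent with the negative relationship read off the scatter plot.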
Covariance and Correlation

We now define the covariance of two random variables X and Y with means \( \mu_X \) and \( \mu_Y \):

Probability              X      Y
P(X = x_1, Y = y_1)      x_1    y_1
P(X = x_2, Y = y_2)      x_2    y_2
…                        …      …
P(X = x_N, Y = y_N)      x_N    y_N

Covariance

\[ \mathrm{Cov}(X, Y) = E\big[(X - \mu_X)(Y - \mu_Y)\big] = \sum_{i} P(X = x_i, Y = y_i)\,(x_i - \mu_X)(y_i - \mu_Y) \]

Observe from the above that:

\[ \mathrm{Cov}(X, X) = E\big[(X - \mu_X)^2\big] = \mathrm{Var}(X) \]
\[ \mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y) = E(XY) - \mu_X \mu_Y \]

"The bigger the covariance is, the stronger the relationship is." Is it true?

We introduce a standardized measure of interdependence between two r.v.'s:

Correlation

\[ \mathrm{Corr}(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} \]

Comments:
• The measure of correlation is unit-free.
• Corr(X, Y) is always between −1.0 and 1.0.

Corr(X, Y) = 1.0    perfect positive linear relationship
Corr(X, Y) = 0      no linear relationship between X and Y
Corr(X, Y) = −1.0   perfect negative linear relationship

If higher than average values of X are apt to occur with higher than average values of Y, then Cov(X, Y) > 0 and Corr(X, Y) > 0. X and Y are positively correlated.

If higher than average values of X are apt to occur with lower than average values of Y, then Cov(X, Y) < 0 and Corr(X, Y) < 0. X and Y are negatively correlated.

Given the following two variables:

X   Y
1   2
1   3
1   4
1   5

Without calculation, are they correlated?

Given the following two variables:

X   Y
1   2
1   3
1   4
1   5
2   1
3   1
4   1
5   1

Without calculation, are they correlated?

Correlation vs Causality

Correlation is not the same as causality!

Common fallacy
A occurs in correlation with B.
Therefore: A causes B.

Reverse Causation
The more firemen fighting a fire, the bigger the fire is observed to be.
Therefore, firemen cause fire.
Bidirectional Causation
Higher jobless claims cause the stock market to decline.

Third Factor Causation
• Young children who sleep with the light on are much more likely to develop myopia in later life.
• Therefore, sleeping with the light on causes myopia.

Third Factor Causation
• As ice cream sales increase, the rate of drowning deaths increases sharply.
• Therefore, ice cream causes drowning.

Pure Coincidence
David Leinweber's finding. Guess: what is the indicator with the most statistically significant correlation with the S&P 500 index?
Butter production in Bangladesh.
(Stupid Data Miner Tricks: Overfitting the S&P 500)

Explain this???
• The University of California, Berkeley was sued for bias against women who had applied for admission to graduate school there.

          Applicants   Admitted
Men       8442         44%
Women     4321         35%

             Men                       Women
Department   Applicants   Admitted     Applicants   Admitted
A            825          62%          108          82%
B            560          63%          25           68%
C            325          37%          593          34%
D            417          33%          375          35%
E            191          28%          393          24%
F            272          6%           341          7%

Simpson's Paradox
• Simpson's Paradox is a paradox in which a correlation present in different groups is reversed when the groups are combined.

Explanation
• Women tended to apply to competitive departments with low rates of admission even among qualified applicants, whereas men tended to apply to less competitive departments with high rates of admission among the qualified applicants.

Joint Probability Distribution

Q: How do we describe the relationship between two r.v.'s?

Recall the café example: daily sales at one café obey a probability distribution for the pair (X, Y), where X = # hot coffees and Y = # cold beverages sold per day.

Joint Probability Distributions
Consider two random variables X and Y that assume values given by

Probability                    X      Y
p_1 = P(X = x_1, Y = y_1)      x_1    y_1
p_2 = P(X = x_2, Y = y_2)      x_2    y_2
…                              …      …
p_N = P(X = x_N, Y = y_N)      x_N    y_N

Denote p_i by f(x_i, y_i). f is called the joint probability distribution function of (X, Y).

The concept of independent events leads quite naturally to a similar definition for independent random variables. Two random variables X and Y are said to be independent if

\[ P(X = x, Y = y) = P(X = x) \cdot P(Y = y) \quad \text{for all } x \text{ and } y \]

Roughly: X and Y are independent if knowing the value of one does not change the distribution of the other.

Thus, if X and Y are independent, then E(XY) = E(X) E(Y).
It follows that if X and Y are independent, then Cov(X, Y) = 0 (or Corr(X, Y) = 0).

We know: independent random variables are always uncorrelated.
But: dependent random variables may also be uncorrelated!!

Example: Consider r.v.'s X and Y with the following joint probability distribution

P(X, Y)   X    Y
1/3       1    1
1/3       0    −1
1/3       −1   1

Check: are X and Y independent?
P(X = 1) = 1/3, P(Y = 1) = 2/3, P(X = 1, Y = 1) = 1/3.
Since 1/3 ≠ (1/3)(2/3) = 2/9, X and Y are not independent.

Check: are X and Y uncorrelated?
E(X) = 0, E(Y) = 1/3 and E(XY) = (1 + 0 − 1)/3 = 0, so Cov(X, Y) = E(XY) − E(X)E(Y) = 0. X and Y are uncorrelated even though they are dependent.

Sum of Random Variables

We have seen that:
X is a r.v. ⇒ aX + b is a r.v.
E(aX + b) = a E(X) + b
Var(aX + b) = a² Var(X)

Suppose now we have two r.v.'s: X and Y. Then the sum X + Y is also a r.v.
Q: What are the mean and variance of X + Y? Similarly for the weighted sum aX + bY?
Note: The formulas apply to continuous r.v.'s as well.

Example: For the café chain above, with X = # hot coffees and Y = # cold beverages sold per day at one of their cafés:

                     X (hot coffees)   Y (cold beverages)
Mean                 457.00            210.00
Variance             59,671            21,210
Standard deviation   244.28            145.64

Cov(X, Y) = −27,260

Suppose: cold beverages (Y) are $2.50/glass; hot coffees (X) are $1.50/cup.

Mean
E(aX + bY) = a E(X) + b E(Y)

Variance
\[ \mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y) + 2ab\,\mathrm{Cov}(X, Y) \]
or:
\[ \mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y) + 2ab\,\sigma_X \sigma_Y\,\mathrm{Corr}(X, Y) \]
Note: The formulas apply to continuous r.v.'s as well.

If X and Y are independent ⇒ Cov(X, Y) = 0, so the last term vanishes:
\[ \mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y) \]

Determine the following:

Mean and standard deviation of daily sales of cold beverages
E(2.5Y) = 2.5 × E(Y)
SD(2.5Y) = 2.5 × SD(Y) = 2.5 √Var(Y)

Mean and standard deviation of daily sales of hot coffees
E(1.5X) = 1.5 × E(X)
SD(1.5X) = 1.5 × SD(X) = 1.5 √Var(X)

Mean and standard deviation of total daily sales of all beverages
E(1.5X + 2.5Y) = 1.5 E(X) + 2.5 E(Y)
\[ \mathrm{SD}(1.5X + 2.5Y) = \sqrt{\mathrm{Var}(1.5X + 2.5Y)} = \sqrt{1.5^2\,\mathrm{Var}(X) + 2.5^2\,\mathrm{Var}(Y) + 2 \times 1.5 \times 2.5 \times \mathrm{Cov}(X, Y)} \]
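The three "determine the following" items can be evaluated by plugging the quoted summary values into the formulas for E(aX + bY) and Var(aX + bY). The sketch below does only that arithmetic; the variable names are illustrative, and the inputs are the means, variances and covariance stated in the slides rather than anything recomputed here.

```python
from math import sqrt

# Summary values quoted in the slides for the cafe example.
e_x, var_x = 457.0, 59_671.0   # hot coffees sold per day
e_y, var_y = 210.0, 21_210.0   # cold beverages sold per day
cov_xy = -27_260.0

# Prices: a = $1.50 per hot coffee, b = $2.50 per cold beverage.
a, b = 1.50, 2.50

# Cold beverages: E(bY) = b E(Y), SD(bY) = b SD(Y).
mean_cold = b * e_y
sd_cold = b * sqrt(var_y)

# Hot coffees: E(aX) = a E(X), SD(aX) = a SD(X).
mean_hot = a * e_x
sd_hot = a * sqrt(var_x)

# Total sales: E(aX + bY) = a E(X) + b E(Y),
# Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y).
mean_total = a * e_x + b * e_y
var_total = a**2 * var_x + b**2 * var_y + 2 * a * b * cov_xy
sd_total = sqrt(var_total)

print(f"cold beverages: mean = {mean_cold:.2f}, SD = {sd_cold:.2f}")
print(f"hot coffees:    mean = {mean_hot:.2f}, SD = {sd_hot:.2f}")
print(f"total sales:    mean = {mean_total:.2f}, SD = {sd_total:.2f}")
```

With these inputs the total has a smaller standard deviation than either product alone, because the negative covariance term offsets part of the two individual variances.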