IB Statistics Higher Level Course Companion

OXFORD IB DIPLOMA PROGRAMME
MATHEMATICS HIGHER LEVEL: STATISTICS
COURSE COMPANION

Josip Harcet
Lorraine Heinrichs
Palmira Mariz Seiler
Marlene Torres Skoumal

Great Clarendon Street, Oxford OX2 6DP

Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide in Oxford, New York, Auckland, Cape Town, Dar es Salaam, Hong Kong, Karachi, Kuala Lumpur, Madrid, Melbourne, Mexico City, Nairobi, New Delhi, Shanghai, Taipei, Toronto, with offices in Argentina, Austria, Brazil, Chile, Czech Republic, France, Greece, Guatemala, Hungary, Italy, Japan, Poland, Portugal, Singapore, South Korea, Switzerland, Thailand, Turkey, Ukraine, Vietnam.

Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries.

© Oxford University Press 2014
The moral rights of the author have been asserted.
Database right Oxford University Press (maker)
First published 2014

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above. You must not circulate this book in any other binding or cover and you must impose this same condition on any acquirer.

British Library Cataloguing in Publication Data: Data available

ISBN 978-0-19-830485-2

Printed in Great Britain

Paper used in the production of this book is a natural, recyclable product made from wood grown in sustainable forests. The manufacturing process conforms to the environmental regulations of the country of origin.
Acknowledgements

The publisher would like to thank the following for permission to reproduce photographs: p4: Nathaniel S. Butler/NBA/Getty Images; p12: Kuttig-People/Alamy; p41: Jaggat Rashidi/Shutterstock; p79: Ariwasabi/Shutterstock; p84: Anton Havelaar/Shutterstock; p123: evantravels/Shutterstock; p128: NASA.

Course Companion definition

The IB Diploma Programme Course Companions are resource materials designed to support students throughout their two-year Diploma Programme course of study in a particular subject. They will help students gain an understanding of what is expected from the study of an IB Diploma Programme subject while presenting content in a way that illustrates the purpose and aims of the IB. They reflect the philosophy and approach of the IB and encourage a deep understanding of each subject by making connections to wider issues and providing opportunities for critical thinking.

The books mirror the IB philosophy of viewing the curriculum in terms of a whole-course approach; the use of a wide range of resources, international mindedness, the IB learner profile and the IB Diploma Programme core requirements, theory of knowledge, the extended essay, and creativity, action, service (CAS).

Each book can be used in conjunction with other materials and indeed, students of the IB are required and encouraged to draw conclusions from a variety of resources. Suggestions for additional and further reading are given in each book and suggestions for how to extend research are provided.

In addition, the Course Companions provide advice and guidance on the specific course assessment requirements and on academic honesty protocol. They are distinctive and authoritative without being prescriptive.

IB mission statement

The International Baccalaureate aims to develop inquiring, knowledgable and caring young people who help to create a better and more peaceful world through intercultural understanding and respect.

To this end the IB works with schools, governments and international organizations to develop challenging programmes of international education and rigorous assessment.

These programmes encourage students across the world to become active, compassionate, and lifelong learners who understand that other people, with their differences, can also be right.

The IB learner profile

The aim of all IB programmes is to develop internationally minded people who, recognizing their common humanity and shared guardianship of the planet, help to create a better and more peaceful world. IB learners strive to be:

Inquirers They develop their natural curiosity. They acquire the skills necessary to conduct inquiry and research and show independence in learning. They actively enjoy learning and this love of learning will be sustained throughout their lives.

Knowledgable They explore concepts, ideas, and issues that have local and global significance. In so doing, they acquire in-depth knowledge and develop understanding across a broad and balanced range of disciplines.

Thinkers They exercise initiative in applying thinking skills critically and creatively to recognize and approach complex problems, and make reasoned, ethical decisions.

Communicators They understand and express ideas and information confidently and creatively in more than one language and in a variety of modes of communication. They work effectively and willingly in collaboration with others.

Principled They act with integrity and honesty, with a strong sense of fairness, justice, and respect for the dignity of the individual, groups, and communities. They take responsibility for their own actions and the consequences that accompany them.

Open-minded They understand and appreciate their own cultures and personal histories, and are open to the perspectives, values, and traditions of other individuals and communities. They are accustomed to seeking and evaluating a range of points of view, and are willing to grow from the experience.

Caring They show empathy, compassion, and respect towards the needs and feelings of others. They have a personal commitment to service, and act to make a positive difference to the lives of others and to the environment.

Risk-takers They approach unfamiliar situations and uncertainty with courage and forethought, and have the independence of spirit to explore new roles, ideas, and strategies. They are brave and articulate in defending their beliefs.

Balanced They understand the importance of intellectual, physical, and emotional balance to achieve personal well-being for themselves and others.

Reflective They give thoughtful consideration to their own learning and experience. They are able to assess and understand their strengths and limitations in order to support their learning and personal development.

A note on academic honesty

It is of vital importance to acknowledge and appropriately credit the owners of information when that information is used in your work. After all, owners of ideas (intellectual property) have property rights. To have an authentic piece of work, it must be based on your individual and original ideas with the work of others fully acknowledged. Therefore, all assignments, written or oral, completed for assessment must use your own language and expression. Where sources are used or referred to, whether in the form of direct quotation or paraphrase, such sources must be appropriately acknowledged.

How do I acknowledge the work of others?

The way that you acknowledge that you have used the ideas of other people is through the use of footnotes and bibliographies.

Footnotes (placed at the bottom of a page) or endnotes (placed at the end of a document) are to be provided when you quote or paraphrase from another document, or closely summarize the information provided in another document. You do not need to provide a footnote for information that is part of a 'body of knowledge'. That is, definitions do not need to be footnoted as they are part of the assumed knowledge.

Bibliographies should include a formal list of the resources that you used in your work. 'Formal' means that you should use one of the several accepted forms of presentation. This usually involves separating the resources that you use into different categories (e.g. books, magazines, newspaper articles, Internet-based resources, CDs and works of art) and providing full information as to how a reader or viewer of your work can find the same information. A bibliography is compulsory in the extended essay.

What constitutes malpractice?

Malpractice is behaviour that results in, or may result in, you or any student gaining an unfair advantage in one or more assessment component. Malpractice includes plagiarism and collusion.

Plagiarism is defined as the representation of the ideas or work of another person as your own. The following are some of the ways to avoid plagiarism:
● Words and ideas of another person used to support one's arguments must be acknowledged.
● Passages that are quoted verbatim must be enclosed within quotation marks and acknowledged.
● CD-ROMs, email messages, web sites on the Internet, and any other electronic media must be treated in the same way as books and journals.
● The sources of all photographs, maps, illustrations, computer programs, data, graphs, audio-visual, and similar material must be acknowledged if they are not your own work.
● Works of art, whether music, film, dance, theatre arts, or visual arts, and where the creative use of a part of a work takes place, must be acknowledged.

Collusion is defined as supporting malpractice by another student. This includes:
● allowing your work to be copied or submitted for assessment by another student
● duplicating work for different assessment components and/or diploma requirements.

Other forms of malpractice include any action that gives you an unfair advantage or affects the results of another student. Examples include taking unauthorized material into an examination room, misconduct during an examination, and falsifying a CAS record.

About the book

The new syllabus for Mathematics Higher Level Option: Statistics is thoroughly covered in this book. Each chapter is divided into lesson-size sections with the following features: Did you know?, History, Extension, Advice.

The Course Companion will guide you through the latest curriculum with full coverage of all topics and the new internal assessment. The emphasis is placed on the development and improved understanding of mathematical concepts and their real life application, as well as proficiency in problem solving and critical thinking. The Course Companion denotes questions that would be suitable for examination practice and those where a GDC may be used.

Questions are designed to increase in difficulty, strengthen analytical skills and build confidence through understanding. Where appropriate, the solutions to examples are given in the style of a graphics display calculator.

Mathematics education is a growing, ever changing entity. The contextual, technology integrated approach enables students to become adaptable, lifelong learners.

Note: US spelling has been used, with IB style for mathematical terms.

About the authors

Josip Harcet has been teaching the IB programme for 20 years. After teaching for 11 years at different international schools he returned to teach in Zagreb. He has served as a curriculum review member, deputy chief examiner for Further Mathematics, assistant IA examiner and senior examiner for Mathematics HL, as well as a workshop leader.

Lorraine Heinrichs has been teaching IB mathematics at Bonn International School, where she has been the IB DP coordinator since 2002. During this time she has also been senior moderator for HL Internal Assessment and a workshop leader for the IB. She was also a member of the curriculum review team.

Palmira Mariz Seiler has been teaching mathematics and, since joining the IB community, has worked as Internal Assessment moderator, in curriculum review working groups, and as a workshop leader and deputy chief examiner for HL mathematics.

Marlene Torres-Skoumal has taught IB mathematics for over 30 years. During this time, she has enjoyed various roles with the IB, including deputy chief examiner for HL, senior moderator for Internal Assessment, calculator forum moderator, workshop leader, and a member of several curriculum review teams.

Contents

Chapter 1 Exploring further probability distributions
  Introduction: Probability as a tool to make informed decisions
  1.1 Cumulative distribution function
  1.2 Other probability distributions
    Geometric distribution
    Negative binomial distribution
  1.3 Probability generating functions
  Review exercise

Chapter 2 Expectation algebra and Central Limit Theorem
  Entries recoverable for this chapter: expectation algebra; linear transformation of a single variable; linear transformation of two or more variables; linear combination of independent normal random variables; sampling distribution of the mean; the Central Limit Theorem; review exercise.

Chapter 3 Exploring statistical analysis methods
  Entries recoverable for this chapter: how can we make sense of data?; estimators and estimates; unbiased estimators for the mean and variance of a normal random variable; biased and unbiased estimators; confidence intervals for the mean; confidence interval for μ when σ is known; confidence interval for μ when σ is unknown; confidence interval for matched pairs; hypothesis testing for μ when σ is known; hypothesis testing for μ when σ is unknown; significance testing for matched pairs; Type I and Type II errors; review exercise.
Chapter 4 Statistical modeling
  Introduction
  4.1 Bivariate distributions
    Correlation
    Correlation and causation
    Sampling distributions
  4.2 Covariance
    Properties of covariance
  4.3 Hypothesis testing
    Introduction
    t-Statistic for dependence of X and Y
  4.4 Linear regression
  Review exercise

Answers
Index

1 Exploring further probability distributions

CHAPTER OBJECTIVES:
7.1 Cumulative distribution functions for both discrete and continuous distributions. Geometric distribution. Negative binomial (Pascal's) distribution. Probability generating functions for discrete random variables. Using probability generating functions to find the mean, variance, and the distribution of the sum of n independent random variables.

Before you start

1 Find the mode, median, mean and standard deviation of a discrete random variable, e.g. the table shows the probability distribution of a discrete random variable X.

  $x_i$            1     2     3     4
  $P(X = x_i)$     0.3   0.25  0.35  0.1

  Mode(X) = 3, because $P(X = 3) = 0.35$, which is the highest of the four probabilities.
  Median, m = 2, since $P(X \le 1) = 0.3$ and $P(X \le 2) = 0.55$.
  $\mu = E(X) = \sum_{i=1}^{4} x_i p_i = 1 \times 0.3 + 2 \times 0.25 + 3 \times 0.35 + 4 \times 0.1 = 2.25$
  $\sigma^2 = \mathrm{Var}(X) = \sum_{i=1}^{4} x_i^2 p_i - \mu^2 = 1^2(0.3) + 2^2(0.25) + 3^2(0.35) + 4^2(0.1) - 2.25^2 = 6.05 - 2.25^2 = 0.9875$, so $\sigma = \sqrt{0.9875} = 0.994$.

  Find the mode, median, mean and standard deviation of the following discrete random variables given by:
  a
    $x_i$            −1    0     1     2     3     4
    $P(X = x_i)$     0.3   0.1   0.3   0.1   0.05  0.15
  b $P(X = x) = \begin{cases} \dfrac{5 - x}{10}, & x = 1, 2, 3, 4 \\ 0, & \text{otherwise} \end{cases}$

2 Find the mode, median, mean and standard deviation of a continuous random variable, e.g. the continuous random variable X is defined by the probability density function
  $f(x) = \begin{cases} \dfrac{1}{2}x, & 0 \le x \le 2 \\ 0, & \text{elsewhere} \end{cases}$

  Mode(X) = 2, because f has its maximum at the end point of the interval.
  Median: $\int_0^m f(x)\,dx = \dfrac{1}{2} \Rightarrow \dfrac{m^2}{4} = \dfrac{1}{2} \Rightarrow m = \sqrt{2}$
  $\mu = E(X) = \int_0^2 x f(x)\,dx = \int_0^2 \dfrac{x^2}{2}\,dx = \dfrac{4}{3}$
  $\sigma^2 = \mathrm{Var}(X) = \int_0^2 x^2 f(x)\,dx - \mu^2 = \int_0^2 \dfrac{x^3}{2}\,dx - \dfrac{16}{9} = 2 - \dfrac{16}{9} = \dfrac{2}{9}$

  Find the mode, median, mean and standard deviation of the continuous random variables given by the probability density function:
  a $f(x) = \begin{cases} \dfrac{3}{16}(4 - x^2), & 0 \le x \le 2 \\ 0, & \text{elsewhere} \end{cases}$
  b $f(x) = \begin{cases} \cos(2x), & -\dfrac{\pi}{4} \le x \le \dfrac{\pi}{4} \\ 0, & \text{elsewhere} \end{cases}$
  c $f(x) = \begin{cases} \dfrac{6}{x^2}, & 3 \le x \le 6 \\ 0, & \text{elsewhere} \end{cases}$

3 Find the sum of an infinite geometric series by using the formula
  $u_1 + u_2 + u_3 + \dots = \dfrac{u_1}{1 - r}, \quad 0 < |r| < 1$, e.g. […]

  Find the sum of the following infinite geometric series:
  a $1 - 0.5 + 0.25 - 0.125 + \dots$
  b […]

4 Differentiate and integrate composite functions, e.g.
  $f(x) = \dfrac{2}{(3 - 4x)^3},\ x \ne \dfrac{3}{4} \Rightarrow f'(x) = \dfrac{2 \times (-3) \times (-4)}{(3 - 4x)^4} = \dfrac{24}{(3 - 4x)^4}$
  $\int \dfrac{2}{(3 - 4x)^3}\,dx = \dfrac{1}{4(3 - 4x)^2} + c$

  Differentiate and integrate the following composite functions: a to d […]

Probability as a tool to make informed decisions

A probability distribution is a mathematical model that shows the possible outcomes of a particular event or course of action as well as the statistical likelihood of each event. For example, a company might use statistical techniques to create a scenario analysis. A scenario analysis uses probability distributions to produce several theoretically distinct possibilities for the outcome of a particular course of action or future event. For example, a business might create three scenarios: worst-case, likely, and best-case. The worst-case scenario would contain a value from the lower end of the probability distribution; the likely scenario would contain a value towards the middle of the distribution; and the best-case scenario would contain a value in the upper end of the distribution.

Although it is impossible to predict the precise value of a future sales level, businesses still need to be able to plan for future events. Using a scenario analysis based on a probability distribution can help a company frame its possible future values in terms of a likely sales level and a worst-case and best-case scenario. By doing so, the company can base its business plans on the likely scenario but still be aware of the alternative possibilities.

[Margin note] Probability and Statistics have become very important in the last few decades, due to their wide-ranging applications. Statistics literacy is essential not only for business and economics professionals but also, for example, for people involved in pro sports. Team coaches often use statistics and probability distributions to decide which players are doing well and to predict which players will bring the best results to the game.
[History note] Actuarial science is one of the most popular undergraduate or postgraduate studies that many universities offer. It combines subjects like mathematics, statistics, finance, economics and computer programming, and it uses mathematical and statistical methods to formalize models in fields such as insurance, finance and health care. Its theories date from the late 17th century. Among the pioneers were the well-known mathematicians Edmund Halley (1656–1742), who developed mortality tables, and Abraham de Moivre (1667–1754), who studied annuities. The British mathematician, teacher and accountant James Dodson (1705–1757), a student of de Moivre, modified the earlier methods and formed a plan for a society offering equitable life assurance. The company, Equitable Life, was established in 1762.

1.1 Cumulative distribution function

You studied probability distributions as part of the higher level core course. Before exploring this topic further, it is important that you recall the terminology and revise the core material in advance, particularly if you feel that you cannot immediately recall the earlier knowledge used in this chapter.

The data that we collect can be put into two categories: qualitative and quantitative. Quantitative data can be described by random variables and classified as discrete or continuous. Discrete random variables take exact values that can be listed, from a finite or countable set, whilst the values of continuous random variables cannot be listed, since they come from an uncountable set, usually an interval.

A discrete random variable takes values from a finite set $\{x_1, x_2, x_3, \dots, x_n\}$, $n \in \mathbb{Z}^+$, and is usually given by a table listing the values that the variable can take along with their corresponding probabilities.

  $x$            $x_1$   $x_2$   $x_3$   $x_4$   …   $x_n$
  $P(X = x)$     $p_1$   $p_2$   $p_3$   $p_4$   …   $p_n$

This listing is the probability distribution function: to each value $x_i$ we assign the probability $p_i$. Sometimes, when the variable has an infinite, countable set of values, we have a formula or a rule for calculating the probabilities, $P(X = x_n) = p_n = p(n)$, where p is calculated in terms of n.

In both cases the following properties must hold:
i $0 \le p_i \le 1$;
ii $\sum_i p_i = 1$.

A continuous random variable has values obtained from an uncountable set, usually written in the form of an interval of real numbers $[a, b]$, $a < b$, $a, b \in \mathbb{R}$. Its probability density function is given by the formula
$f(x) = \begin{cases} p(x), & a \le x \le b \\ 0, & \text{elsewhere} \end{cases}$

The probability density function must satisfy the following properties:
i $f(x) \ge 0$ for all the values of x;
ii $\int_{-\infty}^{+\infty} f(x)\,dx = 1$.

A cumulative distribution function (CDF) of a random variable X is defined to sum all the probabilities up to and including a given value of the variable. In the core course we used the cumulative distribution features of a graphical calculator to calculate probabilities for some discrete and continuous distributions.

[Margin note] The word 'cumulative' means 'increasing in quantity by successive addition'.

In this chapter we are going to further explore some discrete and continuous random variables, and study the properties of new probability distributions. We are going to focus on the theoretical probability distributions. Even though there is a difference between how we apply discrete and continuous variables, there are also some common properties of the cumulative distribution function that we will look at, irrespective of the nature of the variable.

Definition
Given the random variable X (discrete or continuous) and the corresponding probability distribution function $P : \mathbb{R} \to [0, 1]$, the cumulative probability distribution function $F : \mathbb{R} \to [0, 1]$ is $F(x) = P(X \le x)$.

This cumulative probability function has the following properties:
i $F(x) \in [0, 1]$, i.e. the range of F is $[0, 1]$;
ii $\lim_{x \to -\infty} F(x) = 0$;
iii $\lim_{x \to +\infty} F(x) = 1$;
iv $F(x)$ is nondecreasing on the whole domain, i.e. $x_1 < x_2 \Rightarrow F(x_1) \le F(x_2)$.

Example 1
Prove that the cumulative distribution function has the following properties:
a $x_1 < x_2 \Rightarrow P(x_1 < X \le x_2) = F(x_2) - F(x_1)$
b $P(X > x) = 1 - F(x)$
and study going Even to focus though discrete and proper ties at new which on the there to of are theoretical is discrete study a continuous dierence continuous the of between variables, cumulative irrespective probability of there distribution the nature distributions. probability how are we function of the are distributions. apply also We some that the CDF common we will look variable. Denition Given the random corresponding cumulative This i ii probability probability cumulative F (x) lim variable ∈[0, ] ; F (x ) = F (x ) = function the (discrete distribution probability i.e. X F :  of F is 1] has [0, continuous) function → [0, function range or is the P : F (x ) = lim ] 0 1 x →+∞ iv F (x) x < 1 6 is x nondecreasing ⇒ F ( x 2 Exploring ) 1 further ≤ on F ( x the whole ) 2 probability distributions and → [0, P( X proper ties: x →−∞ iii  domain; i.e. to ≤ the 1], x ) the cumulative ‘increasing in successive Example Prove x a the < x 1  cumulative ⇒ P( x 2 b P( X a P( x > < < distribution X x ) X = ≤ 1 1 − x ) x ≤ 1 ) = function F (x 2 ) − has F (x 2 the following ), 1 F (x ) = P (X ≤ x 2 ) \ (X x ≤ 2 ( Set ) A is a      B A P( X x ≤ F (x ) of ) P( X x ≤ 2 − F (x 2 P( X > x ) = lim P( x X set B so the probability the di erence ≤ of the of these two sets is the probabilities. ) 1 the denition of the cumulative ) distribution 1 < of ) Use = subset 1      di erence = b proper ties: b) Rewrite the function. given set. b → + ∞ = lim P( X b) ≤ P( X x ) ≤ Use the result Use proper ty from par t a. b → + ∞ = lim F ( b ) − ( F x ) = 1 − ( F x ) iii at the bottom of page 6. b → + ∞ In the can case be tr ying have this the of a found to a nd a simple book. simply Therefore Example density = ⎨ , the b Hence x 0, ⎩ Find going Just a few formulas to function use for in a distribution the table discrete are table discrete variables beyond of function instead the values random of will scope for of nding variables. 
function of a random variable X is given by the formula: = 1 , 2, 3 9 ⎪ a other values + k ⎪ x ) are cumulative the  ⎧ x = most we the up formula. distribution probability P( X variable, adding generating formula; cumulative The discrete by value otherwise of determine k the cumulative distribution function. 3 a ∑ P( X = x ) = 1 ⇒ The sum of all probabilities must be equal to 1. x =1 1 + k 2 + k + 9 k 3 + k + 9 = Solve = 1 9 the equation and nd the value of k. 9  x b 6 + 3k = 1 ⇒ P( X = x )  1 = , x = 1 , 2, Use the result from part a 3 and calculate the values 9 of X = x  2 3 2 3 4 9 9 9 2 5 9 9 9 9 the probability density function f or x = 1, 2, 3. Then use the distribution P(X = denition function of and the add cumulative up all the x) previous probabilities. = 1 F (x) Chapter 1 7 Example The the X  cumulative table = distribution function of a random variable X is given by below: x  2 3 1 3 3 10 10 5 4 F (x)  Determine the formula of the probability density function. 1 P( X = 1) = ( F ) 1 Use = the result from Example 1 to nd all the 10 probabilities. 3 P( X = 2) = ( F 2 ) − ( F ) 1 1 = = 10 10 3 P( X = 3) = ( F 3 ) − F ( 2 ) 2 − 3 = 3 − 5 = 10 10 3 P( X = 4) = F ( 4 ) − F ( 3 ) = 1 − 2 = 5 ⎧ ( x ) = 5 10 , x = 1 , 2, 3, 4 Look at the repeating patter n and ⎩ rule continuous and a f or the values 1, 2, 3, 4. 0, otherwise random variables the probability density function Even f deduce ⎨ 10 ⎪ For 4 = x ⎪ f 10 the cumulative distribution function F are related as though the follows: lower boundar y of the x integral F ′ (x) = f (x) ⇔ F (x) = f ∫ is −∞, in most (t)dt of the integrals for −∞ The the following proper ties problems examples of the involving show how cumulative continuous to use this distribution random relationship function variables. to and solve the left are going left boundar y inter val x for Exploring further probability distributions to of which equal 8 boundar y to use of the f(x) zero. 
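The table work in Examples 2 and 3 (running totals to go from probabilities to F, successive differences to go back) can be checked with a short Python sketch. The helper names `cdf_from_pmf` and `pmf_from_cdf` are illustrative choices and are not part of the text; exact fractions are used so the results match the tables.

```python
from fractions import Fraction
from itertools import accumulate


def cdf_from_pmf(pmf):
    """Running totals of the probabilities, taken in increasing order of x."""
    xs = sorted(pmf)
    return dict(zip(xs, accumulate(pmf[x] for x in xs)))


def pmf_from_cdf(cdf):
    """Successive differences F(x_i) - F(x_{i-1}), with F = 0 before the first value."""
    previous = Fraction(0)
    pmf = {}
    for x in sorted(cdf):
        pmf[x] = cdf[x] - previous
        previous = cdf[x]
    return pmf


# Example 2: P(X = x) = (x + k)/9 for x = 1, 2, 3, and the probabilities sum to 1,
# so 6 + 3k = 9.
k = Fraction(9 - (1 + 2 + 3), 3)
pmf2 = {x: (x + k) / 9 for x in (1, 2, 3)}
cdf2 = cdf_from_pmf(pmf2)

# Example 3: recover the probabilities from the tabulated F(x).
cdf3 = {1: Fraction(1, 10), 2: Fraction(3, 10), 3: Fraction(3, 5), 4: Fraction(1)}
pmf3 = pmf_from_cdf(cdf3)

print("k =", k)
print("Example 2 CDF:", cdf2)
print("Example 3 probabilities:", pmf3)
```

Working with `Fraction` rather than floats keeps 3/5 − 3/10 exactly equal to 3/10, which makes the pattern x/10 easy to see.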
Example 4
A continuous random variable X has a probability density function given by
$f(x) = \begin{cases} ax(2 - x), & x \in [0, 2] \\ 0, & \text{otherwise} \end{cases}$
a Find the value of a.
b Hence determine the cumulative distribution function.
c Find the modal value of the random variable X.

a $\int_0^2 ax(2 - x)\,dx = 1 \Rightarrow a\int_0^2 (2x - x^2)\,dx = 1$   [The definite integral on the interval [0, 2] must be equal to 1.]
  $a\left[x^2 - \dfrac{x^3}{3}\right]_0^2 = 1 \Rightarrow a\left(4 - \dfrac{8}{3}\right) = 1 \Rightarrow \dfrac{4}{3}a = 1 \Rightarrow a = \dfrac{3}{4}$   [Solve the equation and find the value of a.]

b $F(x) = \int_0^x f(t)\,dt = \int_0^x \dfrac{3}{4}\left(2t - t^2\right)dt = \dfrac{3}{4}x^2 - \dfrac{x^3}{4}$   [Use the relationship between the probability density and cumulative distribution functions.]
  $F(x) = \begin{cases} 0, & x < 0 \\ \dfrac{3x^2 - x^3}{4}, & 0 \le x \le 2 \\ 1, & x > 2 \end{cases}$   [Use the result from part a and the properties of the cumulative distribution function to find the formula.]

c $f(x) = \dfrac{3}{2}x - \dfrac{3}{4}x^2 \Rightarrow f'(x) = \dfrac{3}{2} - \dfrac{3}{2}x = 0 \Rightarrow x = 1$   [The modal value is the value for which the probability density function reaches an absolute maximum value.]
  The modal value of the variable X is 1.

[GDC screen: graph of $f1(x) = \frac{3}{2}x - \frac{3}{4}x^2$ with maximum at (1, 0.75)]

Example 5
A continuous random variable X has a cumulative distribution function, F(x), given by:
$F(x) = \begin{cases} 0, & x < 0 \\ x^4 - 4x^3 + 4x^2, & 0 \le x \le b \\ 1, & x > b \end{cases}$
a Find the value of b.
b Hence determine the probability density function.
c Find the median value of the random variable X.

a $b^4 - 4b^3 + 4b^2 = 1$   [The cumulative function is monotone and there are no points of discontinuity, therefore $F(b) = 1$.]
  $\Rightarrow (b^2 - 2b)^2 - 1 = 0$
  $\Rightarrow (b^2 - 2b - 1)(b^2 - 2b + 1) = 0$
  $b_1 = 1 - \sqrt{2}, \quad b_2 = 1 + \sqrt{2}, \quad b_3 = 1$
  [Solve the equation and eliminate impossible solutions: $b_1$ is eliminated because it is negative, and $b_2$ is eliminated because the CDF is not monotone on the interval $[0, 1 + \sqrt{2}]$, as seen on the screen.]
  Therefore b = 1.

  [GDC screen: graph of $f1(x) = x^4 - 4x^3 + 4x^2$ with marked points (1, 1) and (2.41, 0.971)]

  [Margin note] A monotone function is entirely nondecreasing or nonincreasing on the whole domain.

b $f(x) = F'(x)$   [Use the relationship between the probability density and the cumulative distribution functions. Differentiate the cumulative distribution function from part a.]
  $f(x) = \begin{cases} 4x^3 - 12x^2 + 8x, & x \in [0, 1] \\ 0, & \text{otherwise} \end{cases}$

c The graph of the cumulative distribution function of a random variable X is equivalent to the cumulative frequency diagram (ogive) of a set of data. Notice that both have the same shape. To calculate the median value we have to solve the equation $F(m) = \dfrac{1}{2}$, which can be solved on the GDC.

  [GDC screen: graph of the piecewise function F together with the line $f2(x) = \frac{1}{2}$, intersecting at (0.459, 0.5)]

  Median, m = 0.459.
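As a rough numerical check of Example 5, the Python sketch below solves F(m) = 1/2 by bisection and confirms that the density obtained by differentiation integrates to 1 on [0, 1]. The bisection helper and the midpoint rule are assumptions made for this illustration; the text itself uses the GDC for this step.

```python
def F(x):
    """Cumulative distribution function from Example 5 with b = 1."""
    if x < 0:
        return 0.0
    if x > 1:
        return 1.0
    return x**4 - 4 * x**3 + 4 * x**2


def f(x):
    """Probability density function F'(x) on [0, 1], zero elsewhere."""
    return 4 * x**3 - 12 * x**2 + 8 * x if 0 <= x <= 1 else 0.0


def bisect(g, lo, hi, tol=1e-10):
    """Root of g on [lo, hi]; assumes g(lo) and g(hi) have opposite signs."""
    g_lo_positive = g(lo) > 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (g(mid) > 0) == g_lo_positive:
            lo = mid  # the root lies in the upper half
        else:
            hi = mid  # the root lies in the lower half
    return (lo + hi) / 2


# Median: solve F(m) = 1/2 on [0, 1], where F is increasing.
median = bisect(lambda x: F(x) - 0.5, 0.0, 1.0)

# Midpoint-rule estimate of the area under f on [0, 1].
n = 10_000
area = sum(f((i + 0.5) / n) for i in range(n)) / n

print("median ~", round(median, 3))
print("area under f ~", round(area, 6))
```

Bisection is used here only because F is continuous and increasing on [0, 1], so a single sign change is guaranteed; any other root finder would serve equally well.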
Exercise 1A

1 The probability density function of a random variable X is given by the formula:
  $P(X = x) = \begin{cases} \dfrac{x + k}{12}, & x = […] \\ 0, & \text{otherwise} \end{cases}$
  a Find the value of k.
  b Hence determine the cumulative distribution function.

2 The probability density function of a random variable X is given by the formula: […]
  a Find the value of a.
  b Hence determine the cumulative distribution function.
  c Calculate $P(X \le 2)$.

3 The cumulative distribution function of a random variable X is given by the table below: […]
  a Determine the formula of the probability density function.
  b What is the modal value of the variable X?

4 The cumulative distribution function of a random variable X is given by the table below:

  $X = x$     1       2       3       4        5
  $F(x)$      1/25    4/25    9/25    16/25    1

  a Determine the formula of the probability density function.
  b What is the median value of the variable X?

5 A continuous random variable X has a probability density function given by
  $f(x) = \begin{cases} 2\sin bx, & x \in \left[0, \dfrac{\pi}{3}\right] \\ 0, & \text{otherwise} \end{cases}$
  a Find the value of b.
  b Hence determine the cumulative distribution function.
  c Calculate $P\left(X \ge \dfrac{\pi}{6}\right)$.

6 A continuous random variable X has a probability density function given by the formula
  $f(x) = \begin{cases} \dfrac{1}{\pi\sqrt{4 - x^2}}, & -2 < x < 2 \\ 0, & \text{otherwise} \end{cases}$
  a Show that […]
  b Find […]

[Margin note] For more practice, see exercise 10C on pages 502–503, and exercise 10K on pages 530–531.

1.2 Other probability distributions

Recall that a Bernoulli trial is an experiment with two possible outcomes: success (S) or failure (F). In the higher level core course you learned about the binomial distribution, which arises from a finite sequence of n independent Bernoulli experiments, each with a constant probability of success. In this section you will also study the geometric and negative binomial distributions. As you will see, these distributions also consist of sequences of independent Bernoulli experiments, but such sequences are not of equal length and may be infinite.

Geometric distribution

There is a very popular board game in Europe called 'Ludo' or 'People, don't get angry'. This game is played in many different languages (Ludo, Mensch ärgere dich nicht, Člověče nezlob se, Čovječe ne ljuti se, Лудо, Не се сърди човече...). Each player chooses one of the four available colours and, to start moving his figures on the fields, the player must obtain a '6' on a die. Players ask the everlasting question: 'How many attempts must a player have to start the game?' This is a very difficult question to answer, so instead we ask: 'What is the probability of being able to start the game within the first three attempts?'

Let's look at the possible sequence of Bernoulli trials. We will denote the outcomes of each single trial by S or F (S denotes 'rolling a 6', and F denotes 'not rolling a 6'). A geometric distribution describes the infinite sequence of such experiments until we reach the first success, e.g. S, FS, FFS, FFFS, FFFFS, FFFFFS, ...

[Margin note] Jacob Bernoulli (1654–1705) was one of the eight famous mathematicians of the Swiss Bernoulli family in the 17th and 18th centuries. He was the first one to do an extensive piece of work on problems like this.

Again we need to emphasise that these experiments (i.e. each roll of the die) are mutually independent and the probabilities of each possible outcome are always the same. The probability of success is denoted by p and the probability of failure by $q = 1 - p$, $0 \le p \le 1$.

Definition
A discrete random variable X is said to have a geometric distribution, and we write $X \sim \mathrm{Geo}(p)$, if
$P(X = k) = pq^{k-1}, \quad k = 1, 2, 3, 4, \dots$, where $0 \le p \le 1$ and $q = 1 - p$.

This definition satisfies the properties of a probability distribution:
i $p, q \in [0, 1] \Rightarrow P(X = k) = pq^{k-1} \in [0, 1]$ for all $k \in \mathbb{Z}^+$. Every single probability is within the interval [0, 1].
ii $\sum_{k \in \mathbb{Z}^+} P(X = k) = \sum_{k \in \mathbb{Z}^+} pq^{k-1} = p \cdot \dfrac{1}{1 - q} = p \cdot \dfrac{1}{p} = 1$. This means that the sum of all probabilities is 1.
Together, i and ii show that the above definition for a geometric distribution is a good one.

Example 6
Given that $X \sim \mathrm{Geo}(p)$, find $P(X = k)$ if:
a p = 0.4, k = 2
b p = 0.9, k = 6
c p = 0.13, k = 13

a $P(X = 2) = 0.6 \times 0.4 = 0.24$   [Calculate $q = 1 - p$ and then use the definition of a geometric distribution.]
b $P(X = 6) = 0.1^5 \times 0.9 = 0.000009$
c $P(X = 13) = 0.87^{12} \times 0.13 = 0.0244$
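The geometric probabilities in Example 6 follow directly from the definition, and a few lines of Python reproduce them. This is a minimal sketch: `geom_pmf` is a name chosen here for illustration, and the cut-off of 500 terms in the partial sum is an arbitrary assumption used to illustrate property ii numerically.

```python
def geom_pmf(p, k):
    """P(X = k) = p * q**(k - 1) for k = 1, 2, 3, ..., with q = 1 - p."""
    q = 1 - p
    return p * q ** (k - 1)


# The three cases from Example 6.
for p, k in [(0.4, 2), (0.9, 6), (0.13, 13)]:
    print(f"p = {p}, k = {k}: P(X = k) ~ {geom_pmf(p, k):.6f}")

# Property ii: the probabilities form a geometric series whose partial sums approach 1.
partial_sum = sum(geom_pmf(0.13, k) for k in range(1, 501))
print("partial sum of the first 500 probabilities ~", partial_sum)
```

The partial sum equals 1 − q^500, so for q = 0.87 the remaining tail is far below floating-point precision.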
We can calculate those probabilities by using the GDC.

GDC: geomPdf(0.4, 2) = 0.24
GDC: geomPdf(0.9, 6) = 0.000009
GDC: geomPdf(0.13, 13) = 0.024444

Example 7
Find the probability that a player will successfully start the game 'Ludo', i.e. that the player will obtain a '6' on an unbiased die, within the first three attempts.

The probability of scoring a '6' is $p = \frac{1}{6}$, so $X \sim \mathrm{Geo}\left(\frac{1}{6}\right)$.

Method I
Calculate $q = 1 - p$ and then use the definition of a geometric distribution for the values $k = 1, 2, 3$ and add up all the probabilities.
\[ P(X = 1) + P(X = 2) + P(X = 3) = \frac{1}{6} + \frac{5}{6} \times \frac{1}{6} + \left(\frac{5}{6}\right)^2 \times \frac{1}{6} = \frac{1}{6} + \frac{5}{36} + \frac{25}{216} = \frac{91}{216} = 0.421 \]

Method II
Use a cumulative distribution function on the GDC.
GDC: geomCdf(1/6, 1, 3) = 0.421296
$P(X \le 3) = 0.421$

Example 8
The probability of starting the game 'Ludo' on the third attempt using a biased die is 0.128. Calculate the probability of scoring a '6' on the die.

Use the definition of a geometric distribution for the value $k = 3$ and write the equation in one variable $p$:
\[ X \sim \mathrm{Geo}(p), \quad P(X = 3) = q^2 p \Rightarrow (1 - p)^2 p = 0.128 \]
Use the GDC to solve the equation, and eliminate the impossible solution, since $0 \le p \le 1$.

[Graph: $f_1(x) = (1 - x)^2 x - 0.128$, with zeros marked at (0.2, 0) and (0.488, 0).]

\[ p = 0.2, \quad p = 0.488 \quad \text{or} \quad p = 1.313 \]
Notice that $p$ must satisfy $0 \le p \le 1$, so for the given data we have two possible solutions.

On the GDC we can use two features for solving equations: the function (graph) feature or the numerical solver. When solving with the numerical solver we can input estimate values so that the iteration process obtains the desirable solution. Here we input the values 0.4 and 1 to obtain the second and the third solution respectively.
GDC: nSolve((1 − p)²·p = 0.128, p) = 0.2
GDC: nSolve((1 − p)²·p = 0.128, p, 0.4) = 0.487689
GDC: nSolve((1 − p)²·p = 0.128, p, 1) = 1.31231

It is much simpler to spot multiple solutions when using the function feature.

Example 9
Find the number of attempts needed such that the probability of successfully starting the game 'Ludo' is greater than the chance of not starting the game.

Method I
\[ X \sim \mathrm{Geo}\left(\frac{1}{6}\right) \Rightarrow P(X \le n) > \frac{1}{2} \]
The probability of scoring a '6' within the first $n$ attempts should exceed $\frac{1}{2}$. Find the probabilities for the values $k = 1, 2, \ldots, n$ and add them up:
\[ \frac{1}{6} + \frac{5}{6} \times \frac{1}{6} + \ldots + \left(\frac{5}{6}\right)^{n-1} \times \frac{1}{6} > \frac{1}{2} \Rightarrow \frac{1}{6}\left(1 + \frac{5}{6} + \ldots + \left(\frac{5}{6}\right)^{n-1}\right) > \frac{1}{2} \]
Notice the geometric sequence and apply the formula for the sum:
\[ \frac{1}{6} \times \frac{1 - \left(\frac{5}{6}\right)^n}{1 - \frac{5}{6}} > \frac{1}{2} \]
Simplify the inequality and use logarithms to solve it:
\[ 1 - \left(\frac{5}{6}\right)^n > \frac{1}{2} \Rightarrow \left(\frac{5}{6}\right)^n < \frac{1}{2} \Rightarrow n \log\left(\frac{5}{6}\right) < \log\left(\frac{1}{2}\right) \Rightarrow n > \frac{\log\left(\frac{1}{2}\right)}{\log\left(\frac{5}{6}\right)} = 3.80 \Rightarrow n = 4 \]
The logarithms here are negative, so don't forget to reverse the inequality symbol.

Method II
Use the cumulative distribution function on the GDC. Since the values are discrete, we obtain a step function and can find the solution by tracing the graph.

[Graph: $f_1(x) = \mathrm{geomCdf}\left(\frac{1}{6}, 1, x\right) - \frac{1}{2}$, traced point (4, 0.018).]

\[ P(X \le n) > 0.5 \Rightarrow n = 4 \]

Exercise 1B
1 2 3 Given that X a p = 0.6, b p = 0.14, c p = 0.5, d p = 0.88, Given that a P (X ≤ b P (X > c P (5 d P (1 Mario k k = k 4) 6) p = 0.3 < X ≤ 7) if p = 0.991 cer tain emergency random. the at a balloon independent each that fth shot Mario school is and is the selected of with the 0.73. will 92% procedure. What following: 0.7 if with a = shooting the 0.25 7) balloon a p = nd ≤ is In p Geo( p ) X attempt if: 5 ≤ is k) 4 = if = 3 ∼ if P(X 2 = X probability 4 = k nd ∼ Geo( p ) Mario Students and arrow . that three Each Mario hits arrows. the Find the balloon.
are from probability has the students is bow probability destroy student a familiar the with school are the re selected at that: the rst one who doesn't know the procedure; b the rst not 5 Fred that a spoon What the b will ever y defective, he is rst What 4th be who selected wooden adjusts the the manufactured the the procedure spoons. with spoon no and The defect when probability is one 0.85. is Fred found that the four th that within manufactured to be spoon be no need for the manufacture of is six adjustment? will one? probability will know machine. probability there doesn't student? souvenir manufactured defective is spoons selected before manufactures inspects a student occur

6 Given that $X \sim \mathrm{Geo}(p)$, show that $P(X > k) = q^k$, $k \in \mathbb{Z}^+$.

Investigation
Let $X \sim \mathrm{Geo}(p)$. Calculate the following probabilities:
a $p = 0.4$: $P(X > 5 \mid X > 3)$, $P(X > 2)$
b $p = 0.7$: $P(X > 6 \mid X > 2)$, $P(X > 4)$
c $p = 0.12$: $P(X > 12 \mid X > 5)$, $P(X > 7)$
Make a conjecture about the general form of the connection between these probabilities, and try to prove your conjecture.

Expected value and variance of a geometric random variable
In this section we will learn how to find the expected value and variance of a geometric random variable. In order to do so you will need to manipulate infinite geometric sequences. Recall the formula for the sum of consecutive terms of a simple geometric sequence:
\[ 1 + q + q^2 + q^3 + \ldots + q^n = \frac{1 - q^{n+1}}{1 - q}, \quad q \ne 1. \]
When $n$ becomes an extremely large number and $q$ is a number between $-1$ and $1$, i.e. $|q| < 1$, the higher powers of $q$ will be very close to zero. Therefore we can write
\[ 1 + q + q^2 + q^3 + \ldots = \lim_{n \to \infty}\left(1 + q + q^2 + q^3 + \ldots + q^n\right) = \lim_{n \to \infty} \frac{1 - q^{n+1}}{1 - q} = \frac{1 - 0}{1 - q} = \frac{1}{1 - q}. \]
So in other words you have obtained the result for the sum of an infinite geometric sequence:
\[ 1 + q + q^2 + q^3 + \ldots = \frac{1}{1 - q}, \quad q \in \left]-1, 1\right[ \]
Now if we differentiate this series successively, term by term, we will obtain the following:
\[ 1 + q + q^2 + q^3 + \ldots = \frac{1}{1 - q}, \quad q \in \left]-1, 1\right[ \]
\[ \Rightarrow 1 + 2q + 3q^2 + 4q^3 + \ldots = \frac{1}{(1 - q)^2}, \quad q \in \left]-1, 1\right[ \]
If we differentiate again we see that:
\[ \Rightarrow 2 + 3 \times 2q + 4 \times 3q^2 + 5 \times 4q^3 + \ldots = \frac{2}{(1 - q)^3}, \quad q \in \left]-1, 1\right[ \]
We will use these results to find the expected value and variance of a geometric random variable, as shown in the following example.

Example 10
Find the expected value and the variance of a geometric random variable $X$.

Use the definition of expected value:
\[ E(X) = \sum_{k \in \mathbb{Z}^+} k \times P(X = k) = 1 \times p + 2 \times qp + 3 \times q^2 p + 4 \times q^3 p + \ldots + k \times q^{k-1}p + \ldots \]
Use the distributive property:
\[ = p\left(1 + 2q + 3q^2 + 4q^3 + \ldots + kq^{k-1} + \ldots\right) \]
Use the result obtained in the first derivative of the geometric series and simplify the expression:
\[ = p \times \frac{1}{(1 - q)^2} = p \times \frac{1}{p^2} = \frac{1}{p} \]
Use the definition of variance:
\[ \mathrm{Var}(X) = E(X^2) - (E(X))^2 = \sum_{k \in \mathbb{Z}^+} k^2 \times P(X = k) - \left(\frac{1}{p}\right)^2 \]
To simplify the process we find $E(X^2)$ first.
\[ E(X^2) = 1^2 \times p + 2^2 \times qp + 3^2 \times q^2 p + 4^2 \times q^3 p + \ldots + k^2 \times q^{k-1}p + \ldots = p\left(1 + 2^2 q + 3^2 q^2 + 4^2 q^3 + \ldots + k^2 q^{k-1} + \ldots\right) \]
Use the formula $n^2 = ((n + 1) - 1) \times n$ for all terms:
\[ = p\left((2 - 1) \times 1 + (3 - 1) \times 2q + (4 - 1) \times 3q^2 + (5 - 1) \times 4q^3 + \ldots + ((k + 1) - 1) \times kq^{k-1} + \ldots\right) \]
Rewrite by using two infinite sums:
\[ = p\left(\left(2 + 3 \times 2q + 4 \times 3q^2 + 5 \times 4q^3 + \ldots + (k + 1)kq^{k-1} + \ldots\right) - \left(1 + 2q + 3q^2 + 4q^3 + \ldots + kq^{k-1} + \ldots\right)\right) \]
Use the results of the first and the second derivative obtained above to simplify:
\[ = p\left(\frac{2}{(1 - q)^3} - \frac{1}{(1 - q)^2}\right) = p\left(\frac{2}{p^3} - \frac{1}{p^2}\right) = \frac{2}{p^2} - \frac{1}{p} = \frac{2 - p}{p^2} \]
Use the result to find the variance and simplify:
\[ \mathrm{Var}(X) = E(X^2) - (E(X))^2 = \frac{2 - p}{p^2} - \frac{1}{p^2} = \frac{1 - p}{p^2} = \frac{q}{p^2} \]

Given a geometric random variable, if $X \sim \mathrm{Geo}(p)$, then $E(X) = \dfrac{1}{p}$ and $\mathrm{Var}(X) = \dfrac{q}{p^2}$. This result will be used in Example 11.
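The two formulas $E(X) = 1/p$ and $\mathrm{Var}(X) = q/p^2$ can be checked numerically by truncating the infinite sums. The sketch below is an illustration only; the function name `geom_moments` and the cut-off of 10 000 terms are choices made here, not part of the text.

```python
def geom_moments(p: float, terms: int = 10_000):
    """Approximate E(X) and Var(X) for X ~ Geo(p) by truncating the series."""
    q = 1 - p
    mean = sum(k * q ** (k - 1) * p for k in range(1, terms + 1))
    second = sum(k * k * q ** (k - 1) * p for k in range(1, terms + 1))
    return mean, second - mean ** 2  # Var(X) = E(X^2) - (E(X))^2

p = 1 / 6  # probability of rolling a '6' on a fair die
mean, var = geom_moments(p)
print(mean, 1 / p)            # truncated sum next to the formula 1/p
print(var, (1 - p) / p ** 2)  # truncated sum next to the formula q/p^2
```

Because the terms decay geometrically, the truncated sums agree with the closed formulas to many decimal places for this value of $p$.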
Example 11
Find the expected number of attempts to start the game 'Ludo'. Use the empirical rule to determine the maximum number of attempts it will take to start the game.

Apply the formula for the expected value given above:
\[ X \sim \mathrm{Geo}\left(\frac{1}{6}\right) \Rightarrow E(X) = \frac{1}{\frac{1}{6}} = 6 \]
Apply the formula for the variance and calculate the standard deviation:
\[ \mathrm{Var}(X) = \frac{\frac{5}{6}}{\left(\frac{1}{6}\right)^2} = \frac{\frac{5}{6}}{\frac{1}{36}} = 30 \Rightarrow \sigma = \sqrt{30} = 5.48 \]
\[ [6 - 3 \times 5.48,\ 6 + 3 \times 5.48] = [-10.4,\ 22.4] \]
The empirical rule states that almost the whole population (99.73%) falls within 3 standard deviations from the mean value, therefore the maximum number of attempts to start the game is 23.

Exercise 1C
1 Find the expected value and variance for the geometric distributions given in Exercise 1B, questions 1 and 2.
2 Mario is shooting at a balloon with a bow and arrow. Each shot is independent and the probability that Mario hits the balloon at each attempt is 0.73.
a What is the expected number of shots Mario must make to destroy the balloon?
b Use the empirical rule to find the maximum number of shots Mario will make to destroy the balloon.
3 In a certain school 54% of students are familiar with the election procedure. Students are selected at random. Using the empirical rule, determine how many students must be selected to ensure that one of the selected students will be familiar with the election procedure.

Negative binomial distribution
The negative binomial distribution is also known as Pascal's distribution, named after Blaise Pascal (1623–1662). Pascal was using the negative binomial distribution with an integer parameter. The distribution is sometimes called Pólya's distribution, named after the mathematician George Pólya (1887–1985), who extended the parameter to the set of real numbers.

Let's again consider a sequence of Bernoulli trials, but this time we will not stop after one success. We stop when we obtain $r$ successes, and we record the number of trials we need to achieve that number of successes. To make it simpler, we'll now consider the case where the number of successes required is two, e.g. SS, FSS, SFS, FFSS, FSFS, SFFS, FFFSS, FFSFS, FSFFS, SFFFS, ...

In the table below you can compare the geometric distribution with the sequences that require two and three successes.

Geometric (one success): S; FS; FFS; FFFS; ...
Two successes: SS; FSS, SFS; FFSS, FSFS, SFFS; FFFSS, FFSFS, FSFFS, SFFFS; ...
Three successes: SSS; FSSS, SFSS, SSFS; FFSSS, FSFSS, FSSFS, SFFSS, SFSFS, SSFFS; ...

Notice that as we increase the number of successes we not only increase the minimum number of trials required, but we also increase the number of different permutations available to achieve that number of successes.

In general, if we need $r$ successes in $k$ trials, then we must have $r - 1$ successes within the first $k - 1$ trials and a success at the last, $k$-th trial. The first part is described by a binomial distribution with parameters $k - 1$ and $p$, and the events are independent, so we can find the probability density function:
\[ P(X = k) = \underbrace{\binom{k-1}{r-1} p^{r-1} q^{k-r}}_{r-1 \text{ successes in } k-1 \text{ trials}} \times \underbrace{p}_{r\text{-th success at } k\text{-th trial}} = \binom{k-1}{r-1} q^{k-r} p^r \]
In order to have $r$ successes we must have at least $r$ experiments, therefore $k = r, r + 1, r + 2, r + 3, \ldots$ This discrete distribution is called the negative binomial distribution.

Definition
A discrete random variable $X$ is said to have a negative binomial distribution, and we write $X \sim \mathrm{NB}(r, p)$, if
\[ P(X = k) = \binom{k-1}{r-1} q^{k-r} p^r, \quad \text{where } 0 \le p \le 1,\ q = 1 - p,\ k = r, r + 1, r + 2, r + 3, \ldots \]

It is left as an exercise to show that this is a good definition of a probability distribution. Therefore we need to prove that:
i $P(X = k) \in [0, 1] \Rightarrow 0 \le \binom{k-1}{r-1} q^{k-r} p^r \le 1$ for all $k = r, r + 1, r + 2, r + 3, \ldots$
ii $\sum_{k=r}^{\infty} P(X = k) = 1 \Rightarrow \sum_{k=r}^{\infty} \binom{k-1}{r-1} q^{k-r} p^r = 1$

The geometric distribution is just a special case of a negative binomial distribution where we require just one success, $r = 1$: $\mathrm{Geo}(p) = \mathrm{NB}(1, p)$.

Example 12
Given that $X \sim \mathrm{NB}(r, p)$, find $P(X = k)$ if:
a $r = 2$, $p = 0.1$, $k = 2$
b $r = 6$, $p = 0.5$, $k = 9$
c $r = 12$, $p = 0.25$, $k = 50$

Use the definition of a negative binomial distribution and apply the formula.
a $P(X = 2) = \binom{1}{1} \times 0.9^0 \times 0.1^2 = 0.01$
b $P(X = 9) = \binom{8}{5} \times 0.5^3 \times 0.5^6 = 0.109$
c $P(X = 50) = \binom{49}{11} \times 0.75^{38} \times 0.25^{12} = 0.0310$

Unfortunately, graphic display calculators have no negative binomial distribution features, but the programming capacity of a calculator helps us to write a useful program in just a few lines.

GDC program:
    Define nbpdf(r, p, k) =
    Prgm
    Disp "Prob.=", nCr(k−1, r−1)·(1−p)^(k−r)·p^r
    EndPrgm
Running nbpdf(6, 0.5, 9) and nbpdf(12, 0.25, 50) reproduces the probabilities found in parts b and c.

A similar programme can be created for the cumulative distribution function of a negative binomially distributed variable.
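For readers working on a computer rather than a GDC, the same idea can be sketched in Python. This is an illustrative translation, not the calculator program itself; the names `nb_pmf` and `nb_cdf` are chosen here, and the cumulative function is simply a finite sum of the probability formula from the definition.

```python
from math import comb

def nb_pmf(r: int, p: float, k: int) -> float:
    """P(X = k) for X ~ NB(r, p): the r-th success occurs on trial k."""
    if k < r:
        return 0.0  # fewer than r trials cannot contain r successes
    return comb(k - 1, r - 1) * (1 - p) ** (k - r) * p ** r

def nb_cdf(r: int, p: float, k: int) -> float:
    """P(X <= k), summing the pmf over r, r + 1, ..., k."""
    return sum(nb_pmf(r, p, j) for j in range(r, k + 1))

# The three parts of the example above.
print(round(nb_pmf(2, 0.1, 2), 4))
print(round(nb_pmf(6, 0.5, 9), 4))
print(round(nb_pmf(12, 0.25, 50), 4))

# With r = 1 the formula reduces to the geometric distribution.
print(nb_pmf(1, 0.4, 2))
```

Summing the pmf term by term is adequate for the small values of $k$ used in this chapter; for very large $k$ a dedicated statistics library would be a more robust choice.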
Example 13
A new drug is to be tested on people. If only 12% of people agree to take part in the study, what is the probability that:
a 8 people will be asked before 3 are found who agree to participate;
b 12 people will be asked before 5 are found who agree to participate;
c not more than 5 people will be asked before 2 are found who agree to participate?

$X \sim \mathrm{NB}(r, p)$. Identify the parameters of the negative binomial distribution and use the formula.
a $r = 3$, $p = 0.12 \Rightarrow P(X = 8) = \binom{7}{2} \times 0.88^5 \times 0.12^3 = 0.0192$
b $r = 5$, $p = 0.12 \Rightarrow P(X = 12) = \binom{11}{4} \times 0.88^7 \times 0.12^5 = 0.00336$
c Identify the values of the variable and add up all the probabilities for the identified values.
\[ r = 2,\ p = 0.12 \Rightarrow P(2 \le X \le 5) = P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5) \]
\[ = \binom{1}{1} 0.88^0 \times 0.12^2 + \binom{2}{1} 0.88^1 \times 0.12^2 + \binom{3}{1} 0.88^2 \times 0.12^2 + \binom{4}{1} 0.88^3 \times 0.12^2 = 0.112 \]

The binomial and negative binomial distributions are similar: the binomial distribution takes a fixed number of Bernoulli trials, $n$, and generates probabilities of the number of successes, $r$ (where $r$ varies from 0 to $n$). Conversely, the negative binomial distribution takes a fixed number of successes, $r$, and generates probabilities of the number $n$ of Bernoulli trials that we perform in order to achieve these $r$ successes (where $n$ varies from $r$ to infinity).

Exercise 1D
X that a r = 1, b r = 3, p c r = 7, p d r = 23, Given P(X ≤ b P(X > c P(5 ≤ d P(8 < NB( r , p ) 0.2, k = 0.5, k = 4 = 0.8, k = 9 p = X 4) X X ∼ if 6) if ≤ ≤ 0.77, = p 7) k = = if 11) if 0.25 0.5 p = p P (X = k) if: 32 NB( r , p ) p nd 2 = that a ∼ nd: and and 0.82 = r r = = 2 and 0.43 3 r and = r 4 = 5 3 The random variable has the following 1 Given a p that Hence 4 Before when 1, 2, and P (3 Alemka rolling 3 times nd and a 4. fair Let Alemka P( X X ≤ p). = 3) = , nd the value of p 5).
star t a game, tetrahedral random has NB (2, 125 ≤ can : 24 < 2 b distribution X to she die with variable throw the has X die to the obtain standard denote until two the she “ones” faces total obtains number the of second “one”. a Write the distribution of X, including the value(s) of each parameter. 3 b Find the value of x such that P( X = x ) = . 32 c 5 Calculate Nicholas he is needs least to eat probability 6 he ve b at seven cer tain emergency will the b he An he need order eats to to an before pass phone. to apple the is advance he before 92% next 0.85. in In the this game level. He The needs game. to What eat is the of advances; he are advances? students Students who smar t eat inspector to are from conducts familiar familiar that an with school are inter view with the the re selected and needs procedure. at at What is inter view exactly six students in order to satisfy will not need in nds factor y shape three to that is inter view produces an 0.85. more a special ice-cream Tony ice-creams to will inspects have than a a dozen type be a diet produced ever y shape of students? with ice-cream defect he ice-cream. no and adjusts when the machine. a What is the ice-creams b What there 24 the need; probability defect will his that: ice-cream The re on order procedure. students he in school The a in apples least ve game apples random. probability 7 that exactly a a Nicholas apples a In 5 ). apples that four least ≤ playing probability at P ( X Exploring is before the will probability adjusts probability be further he no need probability that that for he will the inspect ve machine? within half adjustment? distributions exactly a dozen ice-creams 1.3 Probability When tossing two generating coins the functions number of heads obtained is When recorded. Let X be the discrete random variable. 
The table below shows the probability distribution function of $X$:

$x_i$:          0     1     2
$P(X = x_i)$:   1/4   1/2   1/4

When studying probability it is important not to rely too much on intuition. Even the sharpest minds, like Jean D'Alembert (1717–1783), have at some stage made mistakes. In his famous article Croix ou Pile in the French Encyclopédie he considered the following problem: in two tosses of a fair coin, what is the probability that Heads will appear at least once? D'Alembert's answer was $\frac{2}{3}$. He reasoned that no one in real life would continue the experiment after Heads showed up on the first toss. In other words, D'Alembert mistakenly assumed that the sample space was {H, TH, TT}.

You may notice that, in this table, the probabilities are arranged sequentially as the value of $X$ changes. From this table we can write the probabilities in the form of a polynomial expression involving a variable, $t$:
\[ G(t) = \frac{1}{4} \times t^0 + \frac{1}{2} \times t^1 + \frac{1}{4} \times t^2 = \frac{1}{4} + \frac{1}{2}t + \frac{1}{4}t^2 \]
This polynomial expression is called a probability generating function for this discrete random variable. We can immediately notice that
\[ G(1) = \frac{1}{4} + \frac{1}{2} + \frac{1}{4} = 1 \]
since the coefficients of the polynomial are all the corresponding probabilities, and the sum of the probabilities must be equal to 1.

Example 14
A pair of dice are rolled. Let $X$ denote the sum of the outcomes on the upper faces. Write the probability distribution of the random variable $X$ and find the probability generating function.

Look at all the possible pairs of outcomes and their corresponding sums. Find the probabilities.

$x_i$:   2     3     4     5     6     7     8     9     10    11    12
$p_i$:   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

Take $p_i$ to be the coefficients of the polynomial and $x_i$ to be the powers.
\[ G(t) = \frac{1}{36}t^2 + \frac{1}{18}t^3 + \frac{1}{12}t^4 + \frac{1}{9}t^5 + \frac{5}{36}t^6 + \frac{1}{6}t^7 + \frac{5}{36}t^8 + \frac{1}{9}t^9 + \frac{1}{12}t^{10} + \frac{1}{18}t^{11} + \frac{1}{36}t^{12} \]
Again notice that $G(1) = 1$.
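The coefficients in the dice example can be generated by enumerating all 36 pairs of outcomes. The following Python sketch stores the probability generating function as a dictionary mapping each power of $t$ to its coefficient; this representation and the names `pgf` and `G` are choices made for this illustration only.

```python
from fractions import Fraction
from itertools import product

# Coefficient of t^s is P(X = s), where X is the sum of two fair dice.
pgf = {}
for a, b in product(range(1, 7), repeat=2):
    pgf[a + b] = pgf.get(a + b, 0) + Fraction(1, 36)

def G(t):
    """Evaluate the probability generating function at t."""
    return sum(prob * t ** power for power, prob in pgf.items())

for power in sorted(pgf):
    print(power, pgf[power])  # power of t and its coefficient
print(G(1))                   # the coefficients are probabilities, so they sum to 1
```

Using `Fraction` keeps the coefficients exact, so they can be compared directly with the fractions in the table.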
A random variable can also assume values in an infinite set of numbers, as in the case of a geometric random variable, as shown in Example 15.

Example 15
A coin is flipped until a head is obtained. The random variable $X$ denotes the number of flips. Write the probability distribution of the random variable describing this experiment and find the probability generating function.

The possible outcomes are H, TH, TTH, TTTH, ... and so on, so $X \sim \mathrm{Geo}\left(\frac{1}{2}\right)$.

$x_i$:   1     2     3     4     ...   $k$       ...
$p_i$:   1/2   1/4   1/8   1/16  ...   $1/2^k$   ...

\[ G(t) = \frac{1}{2}t + \frac{1}{4}t^2 + \frac{1}{8}t^3 + \frac{1}{16}t^4 + \ldots + \frac{1}{2^k}t^k + \ldots \]
$G(t)$ is an infinite geometric series with common ratio $\frac{t}{2}$. Use the formula for the sum of an infinite geometric series and simplify the expression:
\[ G(t) = \frac{\frac{t}{2}}{1 - \frac{t}{2}} = \frac{t}{2 - t}, \quad t \ne 2 \]
Notice that we were using the formula $S_\infty = \dfrac{u_1}{1 - r}$ for the sum of an infinite geometric series, under the condition that this series converges. That is, we must have $-1 < \frac{t}{2} < 1 \Rightarrow -2 < t < 2$. For the other values of $t$, this sum is not defined. Again we also notice that $G(1) = \dfrac{1}{2 - 1} = 1$.

Definition
Let $X$ be a discrete random variable assuming nonnegative integer values, and let $P(X = k) = p_k$, $k = 0, 1, 2, 3, \ldots$ be the corresponding probabilities. Then a function $G : \mathbb{R} \to \mathbb{R}$ of the form
\[ G(t) = \sum_{k=0}^{\infty} p_k t^k = p_0 + p_1 t + p_2 t^2 + p_3 t^3 + \ldots + p_n t^n + \ldots \]
is called a probability generating function of the variable $X$.

As seen in the previous examples, if a discrete random variable takes values only within a finite set, we could consider it as taking values in an infinite set where the probability of any value outside the finite set is zero.

Let's try to find probability generating functions for some discrete distributions that we are already familiar with.

Bernoulli distribution, $X \sim \mathrm{B}(1, p)$:
\[ P(X = 0) = q \text{ and } P(X = 1) = p \Rightarrow G(t) = q t^0 + p t^1 = q + pt \]

Binomial distribution, $X \sim \mathrm{B}(n, p)$:
\[ P(X = k) = \binom{n}{k} q^{n-k} p^k,\ k = 0, 1, 2, 3, \ldots, n \Rightarrow G(t) = \sum_{k=0}^{n} \binom{n}{k} q^{n-k} p^k t^k = \sum_{k=0}^{n} \binom{n}{k} q^{n-k} (pt)^k = (q + pt)^n \]
Poisson distribution, $X \sim \mathrm{Po}(m)$:
\[ P(X = k) = \frac{m^k e^{-m}}{k!},\ k = 0, 1, 2, 3, \ldots \Rightarrow G(t) = \sum_{k=0}^{\infty} \frac{m^k e^{-m}}{k!} t^k = e^{-m} \sum_{k=0}^{\infty} \frac{(mt)^k}{k!} \overset{(*)}{=} e^{-m} \times e^{mt} = e^{m(t-1)} \]
(*) We have used Maclaurin's formula for the exponential function, $e^x = \sum_{k=0}^{\infty} \dfrac{x^k}{k!}$. This formula is a part of the Calculus option.

Geometric distribution, $X \sim \mathrm{Geo}(p)$:
\[ P(X = k) = q^{k-1}p,\ k = 1, 2, 3, \ldots \Rightarrow G(t) = \sum_{k=1}^{\infty} q^{k-1} p\, t^k = pt \sum_{k=1}^{\infty} q^{k-1} t^{k-1} = pt \sum_{i=0}^{\infty} q^i t^i = \frac{pt}{1 - qt} \]
Here we substitute $k - 1 = i$. When $-1 < qt < 1$, this is an infinite geometric series with the common ratio $qt$.

Negative binomial distribution, $X \sim \mathrm{NB}(r, p)$:
\[ P(X = k) = \binom{k-1}{r-1} q^{k-r} p^r,\ k = r, r + 1, r + 2, \ldots \Rightarrow G(t) = \sum_{k=r}^{\infty} \binom{k-1}{r-1} q^{k-r} p^r t^k = p^r t^r \sum_{k=r}^{\infty} \binom{k-1}{r-1} q^{k-r} t^{k-r} \]
Using the substitution $k - r = i$ and the formula for the negative binomial series, $(1 - x)^{-n} = \sum_{i=0}^{\infty} \binom{n+i-1}{i} x^i$, where $-1 < qt < 1$:
\[ G(t) = p^r t^r \sum_{i=0}^{\infty} \binom{r+i-1}{i} q^i t^i = \frac{p^r t^r}{(1 - qt)^r} = \left(\frac{pt}{1 - qt}\right)^r \]

Example 16
A random variable $X$ has a probability generating function $G(t) = \dfrac{a}{4 - t}$, $t \ne 4$.
a Find the value of $a$.
b Hence calculate $P(1 \le X \le 3)$.

a Use the fact that $G(1) = 1$:
\[ \frac{a}{4 - 1} = 1 \Rightarrow a = 3 \]
b Rewrite $G(t)$ in a polynomial form:
\[ G(t) = \frac{3}{4 - t} = \frac{3}{4} \times \frac{1}{1 - \frac{t}{4}} = \frac{3}{4} \times \left(1 + \frac{1}{4}t + \frac{1}{16}t^2 + \frac{1}{64}t^3 + \ldots\right) \]
Add the coefficients of the powers 1, 2, and 3:
\[ P(1 \le X \le 3) = \frac{3}{4} \times \left(\frac{1}{4} + \frac{1}{16} + \frac{1}{64}\right) = \frac{63}{256} \]

Now let's take a look at the probability generating function and its successive derivatives.
\[ G(t) = \sum_{k=0}^{\infty} p_k t^k = p_0 + p_1 t + p_2 t^2 + p_3 t^3 + \ldots + p_n t^n + \ldots \]
\[ G'(t) = 0 + p_1 + 2p_2 t + 3p_3 t^2 + 4p_4 t^3 + \ldots + n p_n t^{n-1} + \ldots = \sum_{k=1}^{\infty} k \times p_k t^{k-1} \]
\[ G''(t) = 0 + 0 + 2p_2 + 3 \times 2 p_3 t + 4 \times 3 p_4 t^2 + \ldots + n(n - 1) p_n t^{n-2} + \ldots = \sum_{k=2}^{\infty} k(k - 1) \times p_k t^{k-2} \]
By considering the derivatives of the probability generating function when $t = 1$, we can derive some very useful results:
i $G(1) = \sum_{k=0}^{\infty} p_k 1^k = \sum_{k=0}^{\infty} p_k = 1$
ii $G'(1) = \sum_{k=1}^{\infty} k\, p_k 1^{k-1} = \sum_{k=1}^{\infty} k\, p_k = E(X)$
iii $G''(1) = \sum_{k=2}^{\infty} k(k - 1) p_k 1^{k-2} = \sum_{k=2}^{\infty} k(k - 1) p_k = E(X(X - 1))$

As you can see, we can calculate the expected value $E(X)$ directly from the first derivative of a probability generating function. We can also calculate the variance $\mathrm{Var}(X)$ of a random variable using derivatives of the probability generating function.
\[ G''(1) = \sum_{k=2}^{\infty} k(k - 1) p_k = \sum_{k=2}^{\infty} (k^2 - k) p_k = \sum_{k=2}^{\infty} k^2 p_k - \sum_{k=2}^{\infty} k p_k + p_1 - p_1 = \sum_{k=1}^{\infty} k^2 p_k - \sum_{k=1}^{\infty} k p_k = E(X^2) - E(X) \]
\[ G''(1) = E(X^2) - E(X) \Rightarrow E(X^2) = G''(1) + G'(1) \]
\[ \mathrm{Var}(X) = E(X^2) - (E(X))^2 = G''(1) + G'(1) - (G'(1))^2 = G''(1) + G'(1)(1 - G'(1)) \]
Let's summarize the results we just obtained:
\[ G(1) = 1, \quad E(X) = G'(1), \quad \mathrm{Var}(X) = G''(1) + G'(1)(1 - G'(1)) \]
Knowing these formulas, we can use generating functions to calculate much more easily some of the expected values and variances that were previously difficult to find. We are going to start with the binomial distribution.

Example 17
Use the results of the probability generating function to find the expected value and the variance of a binomial random variable with parameters $n$ and $p$.

Use the probability generating function formula and differentiate it twice to obtain the first and second derivatives:
\[ X \sim \mathrm{B}(n, p) \Rightarrow G(t) = (q + pt)^n \]
\[ \Rightarrow G'(t) = n(q + pt)^{n-1} \times p = np(q + pt)^{n-1} \]
\[ \Rightarrow G''(t) = np(n - 1)(q + pt)^{n-2} \times p = n(n - 1)p^2(q + pt)^{n-2} \]
Use the formula for expected value, remembering that $q + p = 1$:
\[ E(X) = G'(1) = np(q + p \times 1)^{n-1} = np \]
Use the formula for variance and the previous results, then simplify:
\[ \mathrm{Var}(X) = G''(1) + G'(1)(1 - G'(1)) = n(n - 1)p^2(q + p \times 1)^{n-2} + np(1 - np) = n^2p^2 - np^2 + np - n^2p^2 = np(1 - p) = npq \]
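The formulas $E(X) = G'(1)$ and $\mathrm{Var}(X) = G''(1) + G'(1)(1 - G'(1))$ can be tried out on any distribution with finitely many values by storing the coefficients $p_k$ in a list. The sketch below is an illustration under that assumption; the function name `moments_from_pgf` and the values $n = 10$, $p = 3/10$ are hypothetical choices, not taken from the text.

```python
from fractions import Fraction
from math import comb

def moments_from_pgf(coeffs):
    """coeffs[k] = P(X = k). Returns (E(X), Var(X)) via G'(1) and G''(1)."""
    g1 = sum(k * pk for k, pk in enumerate(coeffs))            # G'(1)
    g2 = sum(k * (k - 1) * pk for k, pk in enumerate(coeffs))  # G''(1)
    return g1, g2 + g1 * (1 - g1)

n, p = 10, Fraction(3, 10)  # hypothetical binomial parameters
q = 1 - p
binomial = [comb(n, k) * q ** (n - k) * p ** k for k in range(n + 1)]

mean, var = moments_from_pgf(binomial)
print(mean, n * p)     # G'(1) next to np
print(var, n * p * q)  # G''(1) + G'(1)(1 - G'(1)) next to npq
```

With exact fractions the two numbers on each printed line coincide, which matches the results $E(X) = np$ and $\mathrm{Var}(X) = npq$ derived above.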
It is much easier to find the expected value and the variance of the Poisson distribution, as Example 18 shows.

Example 18
Use the probability generating function to find the expected value and the variance of a Poisson random variable with parameter $m$.

Use the probability generating function formula and differentiate it twice to obtain the first and second derivatives:
\[ X \sim \mathrm{Po}(m) \Rightarrow G(t) = e^{m(t-1)} \]
\[ \Rightarrow G'(t) = e^{m(t-1)} \times m = m e^{m(t-1)} \]
\[ \Rightarrow G''(t) = m e^{m(t-1)} \times m = m^2 e^{m(t-1)} \]
Use the formula for expected value:
\[ E(X) = G'(1) = m e^{m(1-1)} = m \]
Use the formula for variance and the previous results, then simplify:
\[ \mathrm{Var}(X) = G''(1) + G'(1)(1 - G'(1)) = m^2 e^{m(1-1)} + m e^{m(1-1)}\left(1 - m e^{m(1-1)}\right) = m^2 + m - m^2 = m \]

There are still many other distributions that do not have special names, and for many of those distributions probability generating functions make their calculations much easier. Example 19 will illustrate this.

Example 19
Max and Marco alternately throw darts. The first one who scores a bull's-eye wins the game. The probabilities that Max and Marco score a bull's-eye in every shot are $\frac{3}{4}$ and $\frac{5}{8}$ respectively. Their successive throws are independent and Max throws first. The random variable $X$ denotes the number of throws before the game is over.
a Find the probability generating function of the random variable $X$ and verify your result.
b What is the expected number of throws before the game is over?
a The game ends when Max scores, or Max misses and Marco scores, or Max misses and Marco misses and Max scores, or Max misses and Marco misses and Max misses and Marco scores, or ...
\[ G(t) = \frac{3}{4}t + \frac{1}{4} \times \frac{5}{8}t^2 + \frac{1}{4} \times \frac{3}{8} \times \frac{3}{4}t^3 + \frac{1}{4} \times \frac{3}{8} \times \frac{1}{4} \times \frac{5}{8}t^4 + \ldots \]
Notice that there are two infinite geometric series that alternate term by term, with different first terms and equal common ratio:
\[ G(t) = \frac{3}{4}t\left(1 + \frac{3}{32}t^2 + \left(\frac{3}{32}t^2\right)^2 + \ldots\right) + \frac{5}{32}t^2\left(1 + \frac{3}{32}t^2 + \left(\frac{3}{32}t^2\right)^2 + \ldots\right) \]
Use the formula for the sum of an infinite geometric sequence and simplify your answer:
\[ G(t) = \left(\frac{3}{4}t + \frac{5}{32}t^2\right) \times \frac{1}{1 - \frac{3}{32}t^2} = \frac{24t + 5t^2}{32} \times \frac{32}{32 - 3t^2} = \frac{24t + 5t^2}{32 - 3t^2} \]
The sum of all probabilities must be equal to 1, so $G(1) = 1$:
\[ G(1) = \frac{24 + 5}{32 - 3} = \frac{29}{29} = 1 \]

b Method I
Differentiate the probability generating function, using the quotient rule:
\[ G(t) = \frac{24t + 5t^2}{32 - 3t^2} \Rightarrow G'(t) = \frac{(24 + 10t)(32 - 3t^2) - (24t + 5t^2)(-6t)}{(32 - 3t^2)^2} \]
Simplify the result:
\[ G'(t) = \frac{768 + 320t - 72t^2 - 30t^3 + 144t^2 + 30t^3}{(32 - 3t^2)^2} = \frac{768 + 320t + 72t^2}{(32 - 3t^2)^2} \]
Calculate $G'(1)$:
\[ E(X) = G'(1) = \frac{768 + 320 + 72}{(32 - 3)^2} = \frac{1160}{841} = 1.38 \]
Rounding up to a whole number of throws, we expect two throws to be made before the game is over.

Method II
Use a GDC to find $G'(1)$.
GDC: $\dfrac{d}{dx}\left(\dfrac{24x + 5x^2}{32 - 3x^2}\right)\Big|_{x=1} = \dfrac{40}{29} = 1.37931$
Again, rounding up, we expect two throws to be made before the game is over.
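The closed form $G(t) = \dfrac{24t + 5t^2}{32 - 3t^2}$ and the value $G'(1) = \dfrac{40}{29}$ can be cross-checked by building the probabilities throw by throw. The Python sketch below assumes, as in the worked solution, that Max scores with probability $\frac{3}{4}$ and Marco with probability $\frac{5}{8}$; the function names and the truncation points (60 and 200 terms) are choices made for this illustration.

```python
from fractions import Fraction

p_max, p_marco = Fraction(3, 4), Fraction(5, 8)

def prob_game_ends_on(k: int) -> Fraction:
    """P(X = k): every earlier throw misses and throw k scores."""
    prob = Fraction(1)
    for throw in range(1, k):
        # Max takes the odd-numbered throws, Marco the even-numbered ones.
        prob *= 1 - (p_max if throw % 2 == 1 else p_marco)
    return prob * (p_max if k % 2 == 1 else p_marco)

def G_closed(t):
    """Closed form of the probability generating function found in part a."""
    return (24 * t + 5 * t ** 2) / (32 - 3 * t ** 2)

t = Fraction(1, 2)
series = sum(prob_game_ends_on(k) * t ** k for k in range(1, 60))
print(float(series), float(G_closed(t)))  # truncated series next to the closed form

expected = sum(k * prob_game_ends_on(k) for k in range(1, 200))
print(float(expected), 40 / 29)           # truncated E(X) next to G'(1)
```

The sums are truncated, so the comparison is numerical rather than exact, but the neglected terms shrink by a factor of $\frac{3}{32}$ every two throws and are far below floating-point precision.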
Exercise 1E
1 Four coins are tossed. Let the random variable $X$ denote the number of heads that appear face up. Write the probability distribution of $X$ and find the probability generating function.
2 A die is rolled until we get a '1'. The random variable $X$ denotes the number of rolls. Write the probability distribution of the random variable $X$ describing this experiment and show that the probability generating function is $G(t) = \dfrac{t}{6 - 5t}$, $t \ne \dfrac{6}{5}$.
3 A random variable $X$ has a probability generating function $G(t) = \dfrac{2}{3 - t}$, $t \ne 3$. Find:
a $P(X = 0)$;
b $P(X \le 1)$;
c $P(X \ge 3)$;
d $P(X \ge k)$.
4 Use the probability generating function to find the expected value and the variance of the following variables:
a $X$ is a Bernoulli variable with the probability $p$;
b $X$ is a negative binomial variable with the parameters $r$ and $p$.
5 Find the expected value and the variance of the random variables in questions 1, 2 and 3.
6 Noddy and Eve play a basketball game in their courtyard. Taking turns, they shoot a basketball at a hoop from a free throw line. The first one who makes a basket wins the game. The probabilities that Noddy and Eve score in any shot are $\frac{2}{3}$ and $\frac{4}{7}$ respectively. Successive shots are independent. Eve shoots first. The random variable $X$ denotes the number of shots before the game is over.
a Find the value of $P(X = 1)$ and show that $P(X = 4) = \dfrac{2}{49}$.
b Find the probability generating function for the random variable $X$ and verify your result.
c What is the expected number of shots before the game is over?
d Use the empirical rule to find the maximum number of shots made before the game is over.

Now we are going to look at the sum of two independent random variables. If we look at the experiment of tossing two coins, we can say that we actually have two independent events of tossing one coin at a time. The number of heads on the first coin is represented by the random variable $X$, and the number of heads on the second coin is represented by the random variable $Y$. Both $X$ and $Y$ have the same distribution:
say that is the same distribution: 0  1 1 2 2 p i They also have the same 1 generating 1 1 0 G (t ) = G X (t ) = We notice functions that we probability when we G × + t = if 2 we obtain + t two G (t ) = Y 1 ⎝ 2 the course parameter mean We 2 ⎠ a able Poisson of to cars say for 1 + ⎜ ⎝ 2 companion, of number were ⎛ 1 × ⎟ two the earlier, number we ⎟ 2 of at within a generating when we found heads shown 1 2 + t + t = G (t ) X 4 easily For petrol half 1 = ⎠ were arriving 1 ⎞ t distribution. that probability calculated together. ⎞ t we function coins + ⎜ 2 these result generating tossed 2 multiply the ⎛ 1 (t ) X In 1 1 t Y 2 a function: an 2 manipulating example, station hour + Y 4 the the suppose in one mean that hour the is m. number m , was and in three hours the mean number was 3m. The reason why 2 we were the able expected Poisson do value Exploring We further is or distributions transformations. 32 this that the the parameter mean. were were So probability a Poisson parameters calculated then of by nding distributions using the of the given the distribution is corresponding same elementar y probabilities. Example The 0 times at which distributions into the room in room. room. = X + y and a parameters Let random respectively . the relationship Z with a Find Let the 0.3 arrive and 0. variables X random probability between wasp into room respectively . and Y variable Z generating a denote the of = Y ( ( P X = ) r of t = = ∑ P ( Y = ) s ( ies = X P ( Z s ) = k of and = wasps ies nd and the in the wasps possible insects Z is the number wasps Y in the room. down probability f or each generating insect. To f ollow the e ( 0.4 t of and calculation ∑ and 1 more (t ) Z, functions k Z ies ) t s G of number Write 0.1 (t ) independently number and Poisson e r Y Y, by ) 1 t r G total The ∑ arrive number X, r (t ) X modelled them. 0.3 G be Insects the denote functions can = f or easily, Y use (the a number di erent of index, wasps) s. 
When we count the number of insects in the room we make no distinction between flies and wasps, therefore $Z \sim \mathrm{Po}(0.4)$. Notice that

$$e^{0.4(t-1)} = e^{(0.3+0.1)(t-1)} = e^{0.3(t-1)} \times e^{0.1(t-1)} = G_X(t) \times G_Y(t)$$

The above example illustrates a very important property of probability generating functions. The property doesn't depend on the given probability distributions of the random variables, but is valid across all distributions.

Theorem

Given that $X_1$ and $X_2$ are two independent random variables with the corresponding probability generating functions $G_{X_1}(t)$ and $G_{X_2}(t)$. If a new random variable $X$ is such that $X = X_1 + X_2$ then

$$G_X(t) = G_{X_1+X_2}(t) = G_{X_1}(t) \times G_{X_2}(t)$$

Proof:

$$\begin{aligned}
G_X(t) &= \sum_k P(X = k)\,t^k \\
&= \sum P(X = r + s)\,t^{r+s} && \text{Use the substitution } k = r + s. \\
&= \sum_r \sum_s P\big((X_1 = r) \cap (X_2 = s)\big)\,t^{r+s} && \text{Rewrite the sum by using variables } X_1 \text{ and } X_2 \text{ and both indices } r \text{ and } s. \\
&= \sum_r \sum_s P(X_1 = r)\,P(X_2 = s)\,t^r\,t^s && \text{Since the variables are independent, use the multiplicative probability law.} \\
&= \sum_r P(X_1 = r)\,t^r \times \sum_s P(X_2 = s)\,t^s && \text{Use the distributive property.} \\
&= G_{X_1}(t) \times G_{X_2}(t) && \text{Q.E.D.}
\end{aligned}$$

This theorem can be extended to find the probability generating function of the sum of more than two independent random variables.

Corollary

Given that $X_1, X_2, ..., X_n$, $n \in \mathbb{Z}^+$, are $n$ independent random variables with the corresponding probability generating functions $G_{X_1}(t), G_{X_2}(t), ..., G_{X_n}(t)$. If a new random variable $X$ is such that $X = X_1 + X_2 + ... + X_n = \sum_{k=1}^{n} X_k$ then

$$G_X(t) = G_{X_1}(t) \times G_{X_2}(t) \times ... \times G_{X_n}(t) = \prod_{k=1}^{n} G_{X_k}(t)$$

The proof of this corollary is left to the reader as an exercise.

This corollary helps us to easily calculate the probability generating function of a random variable which can be modelled as a sequence of simpler experiments described by the same random variable.
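The product rule for probability generating functions can be checked by hand for small cases. The following is a minimal sketch, not part of the course text: it stores a PGF as a list of coefficients (entry $k$ holds $P(X = k)$), so multiplying two PGFs becomes polynomial multiplication. The helper names `pgf_product` and `pgf_of_sum` and the success probability $p = \frac{1}{3}$ are assumptions made for illustration only.

```python
from fractions import Fraction
from math import comb


def pgf_product(g1, g2):
    """Multiply two PGFs stored as coefficient lists, where g[k] = P(X = k)."""
    product = [Fraction(0)] * (len(g1) + len(g2) - 1)
    for r, p_r in enumerate(g1):
        for s, p_s in enumerate(g2):
            # The term t^r * t^s contributes to the coefficient of t^(r+s).
            product[r + s] += p_r * p_s
    return product


def pgf_of_sum(pgfs):
    """PGF of a sum of independent variables: the product of their PGFs."""
    result = [Fraction(1)]  # PGF of the constant 0
    for g in pgfs:
        result = pgf_product(result, g)
    return result


half = Fraction(1, 2)
coin = [half, half]                       # G(t) = 1/2 + 1/2 t
two_coins = pgf_product(coin, coin)       # coefficients of 1/4 + 1/2 t + 1/4 t^2

p = Fraction(1, 3)                        # hypothetical success probability
bernoulli = [1 - p, p]                    # G(t) = q + pt
binomial_4 = pgf_of_sum([bernoulli] * 4)  # coefficients of (q + pt)^4

print(two_coins)
print(binomial_4)
```

Working with `Fraction` keeps every coefficient exact, so the result for four Bernoulli trials can be compared directly with the binomial probabilities $\binom{4}{k}p^k q^{4-k}$.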
Example

The random variable $X$ denotes a Bernoulli trial and its corresponding probability generating function is given by $G_X(t) = q + pt$, where $p$ and $q$ are the respective probabilities of a success and a failure in the trial. Use the corollary above to find the probability generating function of the binomial random variable $Y$ that describes a sequence of $n$ such independent trials.

A binomial distribution describes a sequence of $n$ independent repetitions of a Bernoulli trial. Therefore $Y$ follows a binomial distribution.

$$Y = \underbrace{X + X + ... + X}_{n \text{ independent trials}} = \sum_{k=1}^{n} X$$

Use the corollary to find the probability generating function.

$$G_Y(t) = \underbrace{G_X(t) \times G_X(t) \times ... \times G_X(t)}_{n \text{ factors}} = \underbrace{(q + pt) \times (q + pt) \times ... \times (q + pt)}_{n \text{ factors}} = (q + pt)^n$$

Notice that we have obtained the same result in a much simpler way.

Exercise 1F

1 The probability generating functions of independent random variables $X$ and $Y$ are given by the following formulas: $G_X(t) = \left(\frac{1 + 3t}{4}\right)^2$ and $G_Y(t) = \left(\frac{2 + t}{3}\right)^2$.
a Determine $G_{X+Y}(t)$;
b Find $P(X + Y \le 1)$;
c Calculate $E(X + Y)$.

2 The probability generating functions of independent random variables $X$ and $Y$ are given by the following formulas: $G_X(t) = …$ and $G_Y(t) = …$
a Determine $G_{X+Y}(t)$;
b Calculate $E(X + Y)$;
c Calculate $\mathrm{Var}(X + Y)$.

3 The times at which bikes, cars, and buses arrive at a crossing can be modelled by Poisson distributions with parameters 0.25, 0.15, and 0.05 respectively. All of them arrive independently at the crossing. Let the random variables $X$, $Y$, and $Z$ denote the numbers of each of the kinds of vehicle at the crossing.
a Find the probability generating functions of $X$, $Y$, and $Z$.
b Find the probability that at least two of them will be at the crossing at a given moment.
Traffic analysis is a vital component in understanding the requirements and capabilities of a network. There are many traffic models proposed for analyzing the traffic characteristics of networks; however, none of them can efficiently capture the traffic characteristics of all types of networks under every possible circumstance. One of the most widely used and oldest traffic models is the Poisson Model.

4 A geometric random variable $X$ has the corresponding probability generating function given by $G_X(t) = \frac{pt}{1 - qt}$, where $p$ and $q$ are the probabilities of a success and a failure respectively. Use the corollary to find the probability generating function of the negative binomial random variable $Y$ that describes a sequence of independent trials until we achieve $r$ successes.

5 a $X_1$ and $X_2$ are independent binomial variables with the same probability $p$; however they have different numbers of repetitions, $n_1$ and $n_2$ respectively. Use the probability generating function to show that $X_1 + X_2$ is a binomial variable and hence find the parameters.
b Given that $X_1, X_2, ..., X_k$, $k \in \mathbb{Z}^+$, are independent binomial random variables having the same probability $p$ but different numbers of repetitions, $n_1, n_2, …, n_k$ respectively, use mathematical induction to show that $Y = \sum_{i=1}^{k} X_i$ also follows a binomial distribution and find the parameters.

Review exercise

EXAM STYLE QUESTIONS

1 A random variable $X$ has a probability generating function $G(t) = \frac{4t}{5 - t}$, $t \ne 5$. Find:
a $P(1 \le X \le 4)$;
b $P(X \ge 2)$;
c $E(X)$;
d $\mathrm{Var}(X)$.
The ≤ X ≥ times 0.28 each ≤ a 4); rabbit, by fox, of All random them at the probability b Find the expected c Find the probability In a cer tain at a Helping i the ii the the four th the rst to What will is A if continuous distribution 0, ⎧ ⎪ of at 75% 0.6, Z denote the animals least two at of X, the Y, = animals will be of volunteers Students student from are the involved , 0 school are is the rst one is the second who is the not selected will exactly not six need need random who occur than number six X involved 3rd involved 10 of involved variable is before students more expected we student one 36 Find b Hence c What Exploring the < with selected is the given the x ≤ > ≤ further selected to value modal of by the cumulative value probability a probability of the density function. variable X ? distributions be programme? 2 the is students? needed 2 positive the the programme 0 determine is with student; ⎩ a the the selected students in with 2 x at involved who a 1 , Z. programme; ⎪ ⎪ the that: 2 ⎨ at meadow . ⎪ x F( x ) be numbers and function x can 0.12, independently and functions probability selected with the selected the student select we b that selected programme iv meadow programme; involved iii is Y, a parameters arrive X, at moment. programme. What them number given four th with of arrive with generating community , random. deer meadow . the a a variables Find at and distributions a Peer 4 a Poisson Let meadow 3 Find: respectively . meadow . of ≠ 5. 2); modelled and t t 5 A random variable t G (t ) = , a t ≠ , bt a Show b Hence X has a probability generating function a where a and b are nonzero integers. b that b nd = a – E(X ) 1. in terms of a 2 c Show a X 6 that and X 1 Var (X ) are = a – a independent Poisson variables with the same 2 parameter that + X m. X 1 Use is a probability Poisson generating variable and functions nd the to show parameter. 
b Given that $X_1, X_2, ..., X_n$, $n \in \mathbb{Z}^+$, are independent Poisson random variables having the same parameter $m$, use mathematical induction to show that $Y = \sum_{k=1}^{n} X_k$ is also a Poisson variable with the parameter $nm$.

The French mathematician De Moivre (1667–1754) pioneered the development of analytic geometry and the theory of probability. In 1711 he published ‘The Doctrine of Chances: A Method of Calculating the Probability of Events in Play’, which contained his most significant contribution to this area: the approximation to the binomial distribution by the normal distribution in the case of a large number of trials. De Moivre is also famed for predicting the day of his own death. He found that he was sleeping 15 minutes longer each night and, summing the arithmetic progression, he calculated that he would die on the day that he slept for 24 hours. He was right!

Chapter 1 summary

Cumulative distribution function

Given the random variable $X$ (discrete or continuous) with the probability function $P: \mathbb{R} \to [0, 1]$, the function $F: \mathbb{R} \to [0, 1]$, $F(x) = P(X \le x)$, $x \in \mathbb{R}$, is the corresponding cumulative distribution function.

For continuous random variables the probability density function $f$ and the cumulative distribution function $F$ are related as follows:

$$F'(x) = f(x) \Leftrightarrow F(x) = \int_{-\infty}^{x} f(t)\,dt$$

Geometric distribution

A discrete random variable $X$ is said to have a geometric distribution, and we write $X \sim \mathrm{Geo}(p)$, if $P(X = k) = q^{k-1}p$, where $p \in \mathbb{R}$, $0 \le p \le 1$, $q = 1 - p$, $k = 1, 2, 3, 4, ...$

Expected value and variance

Given a geometric random variable, if $X \sim \mathrm{Geo}(p)$, then $E(X) = \frac{1}{p}$ and $\mathrm{Var}(X) = \frac{q}{p^2}$.

Negative binomial distribution

A discrete random variable $X$ is said to have a negative binomial distribution, and we write $X \sim \mathrm{NB}(r, p)$, if $P(X = k) = \binom{k-1}{r-1} p^r q^{k-r}$, where $0 \le p \le 1$, $q = 1 - p$, $k = r, r + 1, r + 2, r + 3, ...$
Probability generating function

Let $X$ be a discrete random variable assuming nonnegative integer values and let $P(X = k) = p_k$, $k = 0, 1, 2, 3, …$ be the corresponding probabilities. Then the function of the form

$$G(t) = \sum_{k=0}^{\infty} p_k t^k = p_0 + p_1 t + p_2 t^2 + p_3 t^3 + ... + p_n t^n + ...$$

is called a probability generating function.

Distribution                               PGF
Bernoulli, $X \sim B(1, p)$                $G(t) = q + pt$
Binomial, $X \sim B(n, p)$                 $G(t) = (q + pt)^n$
Poisson, $X \sim \mathrm{Po}(m)$           $G(t) = e^{m(t-1)}$
Geometric, $X \sim \mathrm{Geo}(p)$        $G(t) = \frac{pt}{1 - qt}$
Negative binomial, $X \sim \mathrm{NB}(r, p)$   $G(t) = \left(\frac{pt}{1 - qt}\right)^r$

Properties of PGF

i) $G(1) = 1$
ii) Expected value: $E(X) = G'(1)$
iii) Variance: $\mathrm{Var}(X) = G''(1) + G'(1) - (G'(1))^2$

Independent random variables

$X_1$ and $X_2$ are two independent random variables with the corresponding probability generating functions $G_{X_1}(t)$ and $G_{X_2}(t)$. If a new random variable $X$ is such that $X = X_1 + X_2$ then $G_X(t) = G_{X_1+X_2}(t) = G_{X_1}(t) \times G_{X_2}(t)$.

2 Expectation algebra and Central Limit Theorem

CHAPTER OBJECTIVES:

7.2 Linear transformation of a single random variable.
$E(aX + b) = aE(X) + b$, $\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$
Mean of linear combinations of $n$ random variables.
Variance of linear combinations of $n$ independent random variables.
Expectation of the product of independent random variables.
$E(XY) = E(X)E(Y)$

7.4 A linear combination of independent normal random variables is normally distributed. In particular, $X \sim N(\mu, \sigma^2) \Rightarrow \bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.
Central Limit Theorem.

Before you start

You should know how to:

1 Find the expected value and the standard deviation of a binomial random variable, e.g. given that $X \sim B(5, 0.3)$:
$E(X) = np \Rightarrow E(X) = 5 \times 0.3 = 1.5$
$\mathrm{Var}(X) = npq \Rightarrow \sigma = \sqrt{5 \times 0.3 \times 0.7} = 1.02$

2 Use the relationship between the parameters of a distribution, e.g. the parameters $E(X)$ and $\mathrm{Var}(X)$ of a random variable $X$ are given in terms of $a \in \mathbb{R}$. Find the value of $a$ such that the random variable follows a Poisson distribution.
The relationship between the expected value and variance of a Poisson variable is $E(X) = \mathrm{Var}(X)$, therefore we write
$2a^2 - a - 1 = 0 \Rightarrow (2a + 1)(a - 1) = 0 \Rightarrow a = -\frac{1}{2}$ or $a = 1$

Skills check:

1 Given that $X \sim B(n, p)$ with $E(X) = 2$ and $\mathrm{Var}(X) = \frac{3}{2}$, find the following probabilities:
a $P(X = 2)$;
b $P(1 \le X \le 3)$

2 A random variable $X$ has the following parameters: $E(X) = …$ and $\mathrm{Var}(X) = …$, $a \in …$ Find the value of $a$ such that the random variable follows a Poisson distribution.

In probability theory, the expected value refers to the value of a random variable one would expect to obtain if the experiment involving this variable could be repeated an endless number of times. This intuitive explanation of the expected value is a direct consequence of the Law of Large Numbers, i.e. it is the limit of the sample mean as the sample size approaches infinity. This value may not be ‘expected’ in the English sense of this word; rather, it could even be unlikely, or counter-intuitive.

Perhaps some of the most important concepts in the study of a random variable are the expected value and the variance. We will investigate their algebraic properties in the so-called expectation algebra. Furthermore, we will be looking at linear combinations of independent random variables and how such combinations affect the parameters of the combined variables. At the end of this chapter you will study one of the most important results in mathematics, the Central Limit Theorem, and discover the important role that normal distributions have in devising probability models. The expected value and variance are important parameters in probability distributions, and in statistical topics such as regression analysis, which we will study further in Chapter 4.

Normal distributions and probability models

Sometimes, by chance, a few plane crashes occur in just a short period of time. Dramatic headlines and impressive pictures of debris shown by the media may make nervous flyers feel fearful. However, aviation safety data reveals a different, perhaps counter-intuitive, truth: air travel is the safest it has been in the history of aviation. The average number of flights that take off each day is very large, and data analysts get a very different picture of reality given by the numbers.

2.1 Expectation algebra

In statistics, measures of central tendency (mean, median, etc.) and measures of dispersion (variance, standard deviation, etc.) are two ways of summarizing a set of data. In the core course companion we conducted investigations on how these measures change when the values of the data are adjusted, and we noted that:

● If a constant $k$ is added to each member of the data set, the mean of the new data set is also increased by $k$;
● If each member of the data set is multiplied by a constant $k$, the new mean is $k$ times the original mean;
● If a constant $k$ is added to each member of the data set, the variance of the data set doesn’t change;
● If each member of the data set is multiplied by a constant $k$, the new variance of the data is $k^2$ times the original variance.

Since random variables allow us to model random phenomena given by sets of data, we expect that the mean value and variance of the set of data will behave similarly to the expected value and variance of a random variable.
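The four observations about adjusted data sets can be confirmed on a small example. The sketch below is illustrative only: the data values and the constant $k = 3$ are made up, and the variance used is the population variance (`statistics.pvariance`), which is an assumption about the convention intended.

```python
from fractions import Fraction
from statistics import mean, pvariance

# Hypothetical data set and constant, chosen only for illustration.
data = [Fraction(v) for v in (2, 4, 4, 5, 10)]
k = 3

shifted = [x + k for x in data]  # add k to each member
scaled = [k * x for x in data]   # multiply each member by k

base_mean, base_var = mean(data), pvariance(data)
shifted_mean, shifted_var = mean(shifted), pvariance(shifted)
scaled_mean, scaled_var = mean(scaled), pvariance(scaled)

print("original:", base_mean, base_var)
print("shifted: ", shifted_mean, shifted_var)  # mean + k, same variance
print("scaled:  ", scaled_mean, scaled_var)    # k * mean, k^2 * variance
```

Exact fractions are used so that the comparisons do not depend on floating-point rounding.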
Linear transformation of a single variable

A linear transformation of a random variable is a transformation obtained by multiplication and addition by constants.

Example 1

Luca and his father play a game. Luca flips a coin. If he gets a “head” his father will give him €1.
a Find the expected value and the variance of the random variable $X$ that describes the amount of money Luca will get in this game.
b Luca’s father later decides to award him with €5 if he gets a “head”. Find the expected value and the variance of the random variable $Y$ that describes the amount of money Luca will get in the new version of the game.
c Suggest a relationship between the expected values and the variances of the variables $X$ and $Y$.

a Luca gets nothing if he obtains a “tail” and gets €1 if he obtains a “head”. Draw the probability distribution table for the variable $X$ and fill in all the values.

$X = x_i$       0      1
$P(X = x_i)$    1/2    1/2

Calculate the expected value.

$$E(X) = \sum_{i=1}^{2} x_i p_i = 0 \times \frac{1}{2} + 1 \times \frac{1}{2} = \frac{1}{2}$$

Calculate the variance.

$$\mathrm{Var}(X) = \sum_{i=1}^{2} x_i^2 p_i - (E(X))^2 = \left(0^2 \times \frac{1}{2} + 1^2 \times \frac{1}{2}\right) - \left(\frac{1}{2}\right)^2 = \frac{1}{2} - \frac{1}{4} = \frac{1}{4}$$

b Luca gets nothing if he obtains a “tail” and gets €5 if he obtains a “head”. Draw the probability distribution table for the variable $Y$ and fill in all the values.

$Y = y_i$       0      5
$P(Y = y_i)$    1/2    1/2

Calculate the expected value.

$$E(Y) = 0 \times \frac{1}{2} + 5 \times \frac{1}{2} = \frac{5}{2}$$

Calculate the variance.

$$\mathrm{Var}(Y) = \left(0^2 \times \frac{1}{2} + 5^2 \times \frac{1}{2}\right) - \left(\frac{5}{2}\right)^2 = \frac{25}{2} - \frac{25}{4} = \frac{25}{4}$$

c $E(Y) = 5E(X)$ and $\mathrm{Var}(Y) = 25\,\mathrm{Var}(X)$

By looking at the possible values of the variables we notice that $y_i = 5x_i$, $i = 1, 2$, so we write $Y = 5X$. Notice that the expected value was increased by the same factor and the variance was increased by the square of that factor.

Now let’s consider another variation of the game in Example 1.

Example 2

Luca and his father again play a similar game. Luca’s father will give him €1 if he obtains a “head”, but regardless of the outcome he will give him an extra €2.
a Find the expected value and the variance of the random variable $Z$ that describes the amount of money Luca will get in this game.
b Suggest a relationship between the expected values and the variances of the variables $X$ (in Example 1) and $Z$.

a Luca gets €2 if he obtains a “tail” and gets €3 if he obtains a “head”. Draw the probability distribution table for the variable $Z$ and fill in all the values.

$Z = z_i$       2      3
$P(Z = z_i)$    1/2    1/2

Calculate the expected value.

$$E(Z) = 2 \times \frac{1}{2} + 3 \times \frac{1}{2} = \frac{5}{2}$$

Calculate the variance.

$$\mathrm{Var}(Z) = \left(2^2 \times \frac{1}{2} + 3^2 \times \frac{1}{2}\right) - \left(\frac{5}{2}\right)^2 = \frac{13}{2} - \frac{25}{4} = \frac{1}{4}$$

b $E(Z) = \frac{5}{2} = \frac{1}{2} + 2 = E(X) + 2$ and $\mathrm{Var}(Z) = \frac{1}{4} = \mathrm{Var}(X)$

We can see that $z_i = x_i + 2$, $i = 1, 2$, so we write $Z = X + 2$. Notice that the expected value was increased by the same added value while the variance remained the same.

Example 3 looks at one last variation of the game.
Example 3

Luca’s father said that he will give Luca €10 whenever he obtains a “head” when flipping a coin.
a Find the expected value and the variance of the random variable $R$ that describes the amount of money Luca will get in this game.
b Luca’s father then decides to make it more realistic and introduces a €3 tax on flipping the coin. Find the expected value and the variance of the random variable $S$ that describes the amount of money Luca will get in this variation of the game.
c Suggest a relationship between the expected values and the variances of the variables $R$ and $S$.

a Luca gets nothing if he obtains a “tail” and gets €10 if he obtains a “head”. Draw the probability distribution table for the variable $R$ and fill in all the values.

$R = r_i$       0      10
$P(R = r_i)$    1/2    1/2

Calculate the expected value.

$$E(R) = 0 \times \frac{1}{2} + 10 \times \frac{1}{2} = 5$$

Calculate the variance.

$$\mathrm{Var}(R) = \left(0^2 \times \frac{1}{2} + 10^2 \times \frac{1}{2}\right) - 5^2 = 50 - 25 = 25$$

b Luca loses €3 if he obtains a “tail” and gets €7 (€10 − €3) if he obtains a “head”. Draw the probability distribution table for the variable $S$ and fill in all the values.

$S = s_i$       −3     7
$P(S = s_i)$    1/2    1/2

Calculate the expected value.

$$E(S) = -3 \times \frac{1}{2} + 7 \times \frac{1}{2} = 2$$

Calculate the variance.

$$\mathrm{Var}(S) = \left((-3)^2 \times \frac{1}{2} + 7^2 \times \frac{1}{2}\right) - 2^2 = \frac{58}{2} - 4 = 25$$

c $E(S) = 2 = 5 - 3 = E(R) - 3$ and $\mathrm{Var}(S) = 25 = \mathrm{Var}(R)$

We can see that $s_i = r_i - 3$, $i = 1, 2$, so we can write that $S = R - 3$. Notice that the expected value was reduced by the same subtracted value, whilst the variance remains the same.

Notice that the probability of a random variable taking a particular outcome remains the same when we perform a linear transformation.

These examples suggest that all the changes to the parameters that are valid for data are valid for random variables too. The following theorem formalizes these results about the expected value and the variance of a random variable when we perform a linear transformation on the random variable.
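The figures in Luca's games can be reproduced mechanically. The following sketch is an illustration rather than the book's method: it represents each distribution as a dictionary from value to probability, and the helper names `expectation`, `variance` and `transform` are hypothetical.

```python
from fractions import Fraction


def expectation(pmf):
    """E(X) for a distribution given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Var(X) = sum of x^2 p minus (E(X))^2."""
    mu = expectation(pmf)
    return sum(x * x * p for x, p in pmf.items()) - mu * mu


def transform(pmf, a, b):
    """Distribution of aX + b: the values change, the probabilities do not."""
    out = {}
    for x, p in pmf.items():
        out[a * x + b] = out.get(a * x + b, 0) + p
    return out


half = Fraction(1, 2)
X = {0: half, 1: half}    # a coin flip paying 1 euro for a head
Y = transform(X, 5, 0)    # Y = 5X
Z = transform(X, 1, 2)    # Z = X + 2
R = transform(X, 10, 0)   # R = 10X
S = transform(R, 1, -3)   # S = R - 3

for name, pmf in [("X", X), ("Y", Y), ("Z", Z), ("R", R), ("S", S)]:
    print(name, expectation(pmf), variance(pmf))
```

In every case the mean follows the transformation itself, while the variance is only affected by the multiplying constant, through its square.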
Theorem 1:

Given that $X$ is a random variable with finite parameters and $a, b \in \mathbb{R}$, then

i $E(aX + b) = aE(X) + b$;

ii $\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$

Proof:

As random variables can be either discrete or continuous, we must prove the theorem for both cases:

Case I

Let $X$ be a discrete random variable. Then:

i $\mu = E(X) = \sum x\,P(X = x) \Rightarrow$

$$\begin{aligned}
E(aX + b) &= \sum (ax + b)\,P(X = x) \\
&= \sum \big(ax\,P(X = x) + b\,P(X = x)\big) \\
&= a\underbrace{\sum x\,P(X = x)}_{E(X)} + b\underbrace{\sum P(X = x)}_{1} \\
&= aE(X) + b
\end{aligned}$$

ii $\mathrm{Var}(X) = \sum x^2\,P(X = x) - \mu^2 \Rightarrow$

$$\begin{aligned}
\mathrm{Var}(aX + b) &= \sum (ax + b)^2\,P(X = x) - (a\mu + b)^2 \\
&= \sum (a^2x^2 + 2abx + b^2)\,P(X = x) - (a^2\mu^2 + 2ab\mu + b^2) \\
&= a^2\sum x^2\,P(X = x) + 2ab\underbrace{\sum x\,P(X = x)}_{\mu} + b^2\underbrace{\sum P(X = x)}_{1} - (a^2\mu^2 + 2ab\mu + b^2) \\
&= a^2\sum x^2\,P(X = x) + 2ab\mu + b^2 - a^2\mu^2 - 2ab\mu - b^2 \\
&= a^2\left(\sum x^2\,P(X = x) - \mu^2\right) \\
&= a^2\,\mathrm{Var}(X)
\end{aligned}$$
Case II

Let $X$ be a continuous random variable. Then:

i $\mu = E(X) = \int x f(x)\,dx \Rightarrow$

$$\begin{aligned}
E(aX + b) &= \int (ax + b) f(x)\,dx \\
&= \int \big(ax f(x) + b f(x)\big)\,dx \\
&= a\underbrace{\int x f(x)\,dx}_{E(X)} + b\underbrace{\int f(x)\,dx}_{1} \\
&= aE(X) + b
\end{aligned}$$

ii $\mathrm{Var}(X) = \int x^2 f(x)\,dx - \mu^2 \Rightarrow$

$$\begin{aligned}
\mathrm{Var}(aX + b) &= \int (ax + b)^2 f(x)\,dx - (a\mu + b)^2 \\
&= \int (a^2x^2 + 2abx + b^2) f(x)\,dx - (a^2\mu^2 + 2ab\mu + b^2) \\
&= a^2\int x^2 f(x)\,dx + 2ab\underbrace{\int x f(x)\,dx}_{\mu} + b^2\underbrace{\int f(x)\,dx}_{1} - a^2\mu^2 - 2ab\mu - b^2 \\
&= a^2\left(\int x^2 f(x)\,dx - \mu^2\right) \\
&= a^2\,\mathrm{Var}(X)
\end{aligned}$$

Exercise 2A

1 Given that $X$ is a random variable with the expected value 5.3 and variance 1.2, find the expected value and variance of the following:
a $3X$
b $X + 3$
c $4X + 1$
d $2X - 5$
e $kX + p$; $k, p \in \mathbb{R}$

2 Given that a random variable $X \sim B\left(10, \frac{2}{5}\right)$ find:
a $E(3X + 2)$;
b $\mathrm{Var}(3X … 2)$

3 A random variable $Y$ follows a geometric distribution with the parameter $p = …$ Find the expected value and variance of $…$

4 Given that a random variable $Y \sim \mathrm{Po}(2)$ find:
a $E(3 - 2Y)$;
b $\mathrm{Var}(3 … 2Y)$

5 Given that a random variable $X \sim \mathrm{NB}(…, …)$ find:
a $E(2X … 3)$;
b $\mathrm{Var}(2X … 11)$

6 Given that a random variable $X \sim B(15, p)$ and $E(X) = 6$, find $\mathrm{Var}(5X + 3)$.

7 A continuous random variable $X$ has a probability density function given by the formula

$$f(x) = \begin{cases} \frac{1}{3}x, & 0 \le x \le \sqrt{6} \\ 0, & \text{otherwise} \end{cases}$$

Find the exact values of the expected value and variance of $3X + 2$.

Linear transformation of two or more variables

So far we were performing a linear transformation of just one random variable. Let’s see what happens if the linear transformation involves two or more variables. We are going to assume that the two or more variables are independent.

Example 4

Hannah and Luca play a game of flipping a coin with their father. Hannah will get €2 if she obtains a “head”, whilst Luca will get €5 if he obtains a “head”. The random variables $X$ and $Y$ represent the money that Hannah and Luca get from their father, respectively.
a Draw the probability distribution tables of the random variables $X$ and $Y$ and calculate their expected values and variances. The random variable $Z$ represents the amount of money they will get if they put together the money they received from their father. Draw the probability distribution table of $Z$ and calculate the expected value and variance of $Z$.
b Their grandmother decides to increase the amount of money they earn. This time Hannah will get 8 times more than the amount she received from her father and Luca will get 3 times more than the amount he received from his father. The random variable $W$ describes the amount of money they will get together. Find the expected value and variance of $W$.
c Suggest relationships between the expected values and the variances of the variables $X$, $Y$, $Z$, and $W$.

a Hannah gets nothing if she obtains a “tail” and gets €2 if she obtains a “head”. Draw the probability distribution table for the variable $X$ and fill in all the values.

$X = x_i$       0      2
$P(X = x_i)$    1/2    1/2

Calculate the expected value and variance.

$$E(X) = 1, \quad \mathrm{Var}(X) = 1$$

Luca gets nothing if he obtains a “tail” and gets €5 if he obtains a “head”. Draw the probability distribution table for the variable $Y$ and fill in all the values.

$Y = y_i$       0      5
$P(Y = y_i)$    1/2    1/2

Calculate the expected value and variance.

$$E(Y) = \frac{5}{2}, \quad \mathrm{Var}(Y) = \frac{25}{4}$$

$Z = X + Y$

If they put the money together then we are adding the values of the money they earn. The values of $Z$ correspond to the pairs of outcomes: (T, T) → 0 + 0, (T, H) → 0 + 5, (H, T) → 2 + 0, (H, H) → 2 + 5.

$Z = z_i$       0      2      5      7
$P(Z = z_i)$    1/4    1/4    1/4    1/4

Calculate the expected value.

$$E(Z) = 0 \times \frac{1}{4} + 2 \times \frac{1}{4} + 5 \times \frac{1}{4} + 7 \times \frac{1}{4} = \frac{7}{2}$$

Calculate the variance.

$$\mathrm{Var}(Z) = \left(\frac{0^2}{4} + \frac{2^2}{4} + \frac{5^2}{4} + \frac{7^2}{4}\right) - \left(\frac{7}{2}\right)^2 = \frac{78}{4} - \frac{49}{4} = \frac{29}{4}$$
b $W = 8X + 3Y$

Again we are adding the money they earn. The values of $W$ correspond to the pairs of outcomes: (T, T) → 0 + 0, (T, H) → 0 + 3 × 5, (H, T) → 8 × 2 + 0, (H, H) → 8 × 2 + 3 × 5.

$W = w_i$       0      15     16     31
$P(W = w_i)$    1/4    1/4    1/4    1/4

Calculate the expected value.

$$E(W) = 0 \times \frac{1}{4} + 15 \times \frac{1}{4} + 16 \times \frac{1}{4} + 31 \times \frac{1}{4} = \frac{31}{2}$$

Calculate the variance.

$$\mathrm{Var}(W) = \left(\frac{0^2}{4} + \frac{15^2}{4} + \frac{16^2}{4} + \frac{31^2}{4}\right) - \left(\frac{31}{2}\right)^2 = \frac{721}{2} - \frac{961}{4} = \frac{481}{4}$$

c $E(X) + E(Y) = E(X + Y) = E(Z)$. Notice that $1 + \frac{5}{2} = \frac{7}{2}$.

$\mathrm{Var}(X) + \mathrm{Var}(Y) = \mathrm{Var}(X + Y) = \mathrm{Var}(Z)$. Notice that $1 + \frac{25}{4} = \frac{29}{4}$.

$8E(X) + 3E(Y) = E(8X + 3Y) = E(W)$. Notice that $8 \times 1 + 3 \times \frac{5}{2} = \frac{31}{2}$.

$64\,\mathrm{Var}(X) + 9\,\mathrm{Var}(Y) = \mathrm{Var}(8X + 3Y) = \mathrm{Var}(W)$. Notice that $64 \times 1 + 9 \times \frac{25}{4} = \frac{481}{4}$.

Now let’s take another example, this time with probabilities that are not uniformly distributed.

Example 5

Mia and Theo draw marbles from two boxes. Mia draws a marble from the first box; it contains 2 white and 3 black marbles. Theo draws two marbles from the second box; it contains 4 white and 3 black marbles. The random variables $X$ and $Y$ represent the number of white marbles Mia and Theo draw, respectively. The random variable $Z$ represents the number of white marbles they draw together.
a Draw the probability distribution tables of the variables $X$, $Y$, and $Z$ and calculate the expected value and the variance of all the variables.
b Their grandfather decides to give Mia €3 and Theo €2 for each white marble they draw. Find the expected value and variance of the random variable $W$ that describes the amount of money they can earn together.
c Suggest relationships between the expected values and the variances of the variables $X$, $Y$, $Z$, and $W$.
a Mia can draw one black or one white marble from the first box. Draw the probability distribution table for the variable $X$ and fill in all the values.

$X = x_i$       0      1
$P(X = x_i)$    3/5    2/5

Calculate the expected value and variance.

$$E(X) = \frac{2}{5}, \quad \mathrm{Var}(X) = \frac{2}{5} - \left(\frac{2}{5}\right)^2 = \frac{6}{25}$$

Theo can draw two black marbles, one black and one white marble, or two white marbles from the second box.

$$P(Y = 0) = \frac{\binom{3}{2}}{\binom{7}{2}} = \frac{1}{7}, \quad P(Y = 1) = \frac{\binom{3}{1}\binom{4}{1}}{\binom{7}{2}} = \frac{4}{7}, \quad P(Y = 2) = \frac{\binom{4}{2}}{\binom{7}{2}} = \frac{2}{7}$$

Draw the probability distribution table for the variable $Y$ and fill in all the values.

$Y = y_i$       0      1      2
$P(Y = y_i)$    1/7    4/7    2/7

Calculate the expected value and variance.

$$E(Y) = \frac{8}{7}, \quad \mathrm{Var}(Y) = \left(\frac{4}{7} + \frac{8}{7}\right) - \left(\frac{8}{7}\right)^2 = \frac{20}{49}$$

$Z = X + Y$

If they put the marbles together then we are actually adding the number of white marbles they drew: $(Z = 0) \leftrightarrow (0 + 0)$; $(Z = 1) \leftrightarrow (1 + 0)$ or $(0 + 1)$; $(Z = 2) \leftrightarrow (1 + 1)$ or $(0 + 2)$; $(Z = 3) \leftrightarrow (1 + 2)$.

$$P(Z = 0) = \frac{3}{5} \times \frac{1}{7} = \frac{3}{35}$$

$$P(Z = 1) = \frac{2}{5} \times \frac{1}{7} + \frac{3}{5} \times \frac{4}{7} = \frac{14}{35} = \frac{2}{5}$$

$$P(Z = 2) = \frac{2}{5} \times \frac{4}{7} + \frac{3}{5} \times \frac{2}{7} = \frac{14}{35} = \frac{2}{5}$$

$$P(Z = 3) = \frac{2}{5} \times \frac{2}{7} = \frac{4}{35}$$

Draw the probability distribution table for the variable $Z$ and fill in all the values.

$Z = z_i$       0       1      2      3
$P(Z = z_i)$    3/35    2/5    2/5    4/35

Calculate the expected value.

$$E(Z) = 0 \times \frac{3}{35} + 1 \times \frac{2}{5} + 2 \times \frac{2}{5} + 3 \times \frac{4}{35} = \frac{54}{35}$$

Calculate the variance.

$$\mathrm{Var}(Z) = \left(0 + \frac{2}{5} + \frac{8}{5} + \frac{36}{35}\right) - \left(\frac{54}{35}\right)^2 = \frac{106}{35} - \frac{2916}{1225} = \frac{794}{1225}$$

b $W = 3X + 2Y$

This time we are not adding the number of white marbles they draw, but the money they earn. The values of $W$ correspond to the pairs of outcomes: 0 + 0, 0 + 2, 3 + 0, 0 + 4, 3 + 2, 3 + 4.

$W = w_i$       0       2        3       4       5       7
$P(W = w_i)$    3/35    12/35    2/35    6/35    8/35    4/35

Calculate the expected value.

$$E(W) = \frac{0 + 24 + 6 + 24 + 40 + 28}{35} = \frac{122}{35}$$

Calculate the variance.

$$\mathrm{Var}(W) = \frac{0 + 48 + 18 + 96 + 200 + 196}{35} - \left(\frac{122}{35}\right)^2 = \frac{558}{35} - \frac{14\,884}{1225} = \frac{4646}{1225}$$

c $E(X) + E(Y) = E(X + Y) = E(Z)$. Notice that $\frac{2}{5} + \frac{8}{7} = \frac{54}{35}$.

$\mathrm{Var}(X) + \mathrm{Var}(Y) = \mathrm{Var}(X + Y) = \mathrm{Var}(Z)$. Notice that $\frac{6}{25} + \frac{20}{49} = \frac{794}{1225}$.

$3E(X) + 2E(Y) = E(3X + 2Y) = E(W)$. Notice that $3 \times \frac{2}{5} + 2 \times \frac{8}{7} = \frac{122}{35}$.

$9\,\mathrm{Var}(X) + 4\,\mathrm{Var}(Y) = \mathrm{Var}(3X + 2Y) = \mathrm{Var}(W)$. Notice that $9 \times \frac{6}{25} + 4 \times \frac{20}{49} = \frac{4646}{1225}$.

Examples 4 and 5 demonstrated a very important property of the probability of the sum of two independent random variables. To find the probabilities of the variable $Z = X + Y$ we multiplied the corresponding elementary probabilities of $X$ and $Y$ for each pair of outcomes,

$$P\big((X = x) \cap (Y = y)\big) = P(X = x) \times P(Y = y),$$

and added the products for the pairs that give the same value $z = x + y$.

In the second part of Examples 4 and 5, we showed that the probability of each value of the random variable $W$ (which is a linear combination of $X$ and $Y$ involving real coefficients, e.g. $W = 3X + 2Y$) was simply the product of the probabilities of $X$ and $Y$, and is totally independent of the coefficients: for $w = 3x + 2y$ the contribution of the pair $(x, y)$ is again $P(X = x) \times P(Y = y)$.

The expected value $E(X)$ is also called the first moment. The variance is also called the second moment, $E(X - \bar{X})^2$, which measures the spread of $X$ and relates to variability, volatility, and uncertainty.

Linear combinations of multiple random variables

Examples 4 and 5 also demonstrated that the expected value and variance of linear combinations of multiple random variables behave in the same way as the expected value and variance of a linear transformation of a single random variable. We can now state the following theorem.
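The arithmetic for two independent discrete variables can be checked by enumerating every pair of outcomes. This is a minimal sketch under the assumption of independence, using the distributions of the marble example ($X$ for one draw from the first box, $Y$ for two draws from the second); the helper name `combine` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def expectation(pmf):
    """E(X) for a distribution given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Var(X) = sum of x^2 p minus (E(X))^2."""
    mu = expectation(pmf)
    return sum(x * x * p for x, p in pmf.items()) - mu * mu


def combine(pmf_x, pmf_y, a, b):
    """Distribution of aX + bY, assuming X and Y are independent."""
    out = {}
    for (x, px), (y, py) in product(pmf_x.items(), pmf_y.items()):
        value = a * x + b * y
        # Independence: P(X = x and Y = y) = P(X = x) * P(Y = y).
        out[value] = out.get(value, 0) + px * py
    return out


X = {0: Fraction(3, 5), 1: Fraction(2, 5)}
Y = {0: Fraction(1, 7), 1: Fraction(4, 7), 2: Fraction(2, 7)}

Z = combine(X, Y, 1, 1)  # Z = X + Y
W = combine(X, Y, 3, 2)  # W = 3X + 2Y

print(expectation(Z), variance(Z))
print(expectation(W), variance(W))
```

Comparing `variance(W)` with `9 * variance(X) + 4 * variance(Y)` shows the coefficients entering the variance through their squares, while the mean is combined with the coefficients themselves.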
Theorem 2:

Given that $X$ and $Y$ are two independent random variables with finite parameters and $a, b \in \mathbb{R}$, then:

i $E(aX + bY) = aE(X) + bE(Y)$;

ii $\mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y)$

We can also define the $n$th moment: $E(X - \bar{X})^n$. For example, the third moment, $E(X - \bar{X})^3$, is a measure of skewness or asymmetry in the distribution of $X$.

Proof:

We are going to prove this theorem for a discrete random variable only:

i $\mu_1 = E(X) = \sum_x x\,P(X = x)$, $\mu_2 = E(Y) = \sum_y y\,P(Y = y) \Rightarrow$

$$\begin{aligned}
E(aX + bY) &= \sum_x \sum_y (ax + by)\,P\big((X = x) \cap (Y = y)\big) \\
&= \sum_x \sum_y (ax + by)\,P(X = x)\,P(Y = y) \\
&= \sum_x \sum_y \big(ax\,P(X = x)\,P(Y = y) + by\,P(X = x)\,P(Y = y)\big) \\
&= \underbrace{\sum_y P(Y = y)}_{1} \sum_x ax\,P(X = x) + \underbrace{\sum_x P(X = x)}_{1} \sum_y by\,P(Y = y) \\
&= a\underbrace{\sum_x x\,P(X = x)}_{E(X)} + b\underbrace{\sum_y y\,P(Y = y)}_{E(Y)} \\
&= aE(X) + bE(Y)
\end{aligned}$$

Second line: since $X$ and $Y$ are independent variables, the probability of the intersection is the product of probabilities. Third line: use the distributive property. Fourth line: the sum of all the probabilities of a random variable is equal to 1. Last line: use the definition of the expected value.

ii $\mathrm{Var}(X) = \sum_x x^2\,P(X = x) - \mu_1^2$, $\mathrm{Var}(Y) = \sum_y y^2\,P(Y = y) - \mu_2^2 \Rightarrow$

$$\begin{aligned}
\mathrm{Var}(aX + bY) &= \sum_{x,y} (ax + by)^2\,P\big((X = x) \cap (Y = y)\big) - (a\mu_1 + b\mu_2)^2 \\
&= \sum_{x,y} (a^2x^2 + 2axby + b^2y^2)\,P(X = x)\,P(Y = y) - (a^2\mu_1^2 + 2a\mu_1 b\mu_2 + b^2\mu_2^2) \\
&= \underbrace{\sum_y P(Y = y)}_{1} \sum_x a^2x^2\,P(X = x) + 2ab\underbrace{\sum_x x\,P(X = x)}_{\mu_1}\underbrace{\sum_y y\,P(Y = y)}_{\mu_2} + \underbrace{\sum_x P(X = x)}_{1} \sum_y b^2y^2\,P(Y = y) \\
&\quad - (a^2\mu_1^2 + 2a\mu_1 b\mu_2 + b^2\mu_2^2) \\
&= a^2\sum_x x^2\,P(X = x) + 2ab\mu_1\mu_2 + b^2\sum_y y^2\,P(Y = y) - a^2\mu_1^2 - 2a\mu_1 b\mu_2 - b^2\mu_2^2 \\
&= a^2\left(\sum_x x^2\,P(X = x) - \mu_1^2\right) + b^2\left(\sum_y y^2\,P(Y = y) - \mu_2^2\right) \\
&= a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y)
\end{aligned}$$

Second line: since $X$ and $Y$ are independent variables, the probability of the intersection is the product of probabilities. Third line: use the distributive property; the sum of all the probabilities of a random variable is equal to 1, and we recognize the expression for the expected value. Fourth line: reduce opposite terms and collect the like terms. Last lines: use the distributive property and the definition of the variance.

The proof of the theorem for a continuous random variable is beyond the scope of this syllabus.

Theorem 2 can now be generalised to give a result for a finite number of random variables.

Theorem 3:

Given that $X_1, X_2, X_3, ..., X_n$, $n \in \mathbb{Z}^+$, are independent random variables with finite parameters and $a_1, a_2, a_3, ..., a_n \in \mathbb{R}$, then:

i $E(a_1X_1 + a_2X_2 + ... + a_nX_n) = a_1E(X_1) + a_2E(X_2) + ... + a_nE(X_n)$;

ii $\mathrm{Var}(a_1X_1 + a_2X_2 + ... + a_nX_n) = a_1^2\,\mathrm{Var}(X_1) + a_2^2\,\mathrm{Var}(X_2) + ... + a_n^2\,\mathrm{Var}(X_n)$.

Using sigma notation we can write:

i $E\left(\sum_{i=1}^{n} a_iX_i\right) = \sum_{i=1}^{n} a_iE(X_i)$;

ii $\mathrm{Var}\left(\sum_{i=1}^{n} a_iX_i\right) = \sum_{i=1}^{n} a_i^2\,\mathrm{Var}(X_i)$

The proof of this theorem for the discrete case can be done by mathematical induction and is left as an exercise for students.

Unlike the calculation of the expected value, when calculating the variance of random variables it doesn’t matter whether the coefficients are positive or negative; their squares are always positive. In particular we have to be careful to notice that $E(X \pm Y) = E(X) \pm E(Y)$ and $\mathrm{Var}(X \pm Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$.

Exercise 2B

1 The random variables $X$, $Y$, and $Z$ are given with their parameters in the table below.

Variable    $\mu$    $\sigma^2$
$X$         3        0.5
$Y$         −5       1.4
$Z$         2        2.8

Find the parameters of the following linear combinations:
a $X + Y$
b $2Y … Z$
c $2Z … 7X$
d $X … Y + Z$
e $X + Y … Z$
f $3Z … 2X + 4Y$
the Use mathematical yet variance distributed random variable Given the Y are such that Var (X ) = 2 and 3Y − Y in linear way is in are terms have to such same terms of of prove have distributed E(X ) = of 9 and success, p, p Y are same such that probability E(X ) of Theorem we have of this explored variables in 3. so far, which distribution) and a way . parameter. ( give variables: 1 E all 3 Y 1 b haven’t Y − Var we and a 8 success, p, 1 value = p combination all that probability and the variables (e.g. also Y the they random same and variables X induction a ) 7X have that X of that − X random whether in 2X of manipulation discussed they Given find the and ( 11 Y variables that variance E(Y X Var b random Negative 4 variables Calculate: + 5 Y Binomial 3 In random = Var ( Y

Example 6
Given that $X \sim \mathrm{Po}(\mu_1)$ and $Y \sim \mathrm{Po}(\mu_2)$ are two independent Poisson random variables, find the expected value and variance of the random variables:
a $X + Y$
b $X - Y$
c $4X - 3Y$

a Apply Theorem 2:
$E(X + Y) = E(X) + E(Y) = \mu_1 + \mu_2$
$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) = \mu_1 + \mu_2$
Both the expected value and the variance of the random variable $X + Y$ are equal. Hence, we might speculate that $X + Y$ is Poisson.

b Again apply Theorem 2:
$E(X - Y) = E(X) - E(Y) = \mu_1 - \mu_2$
$\mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) = \mu_1 + \mu_2$
Notice that the expected value of the random variable $X - Y$ is not equal to its variance. Thus, $X - Y$ is not Poisson.

c Again apply Theorem 2:
$E(4X - 3Y) = 4E(X) - 3E(Y) = 4\mu_1 - 3\mu_2$
$\mathrm{Var}(4X - 3Y) = 4^2\,\mathrm{Var}(X) + (-3)^2\,\mathrm{Var}(Y) = 16\mu_1 + 9\mu_2$
Notice that the expected value of the random variable $4X - 3Y$ is not equal to its variance. Thus, $4X - 3Y$ is not Poisson.

We can conclude that, in general, a linear combination of two Poisson random variables is not a Poisson random variable.
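Theorem 2 reduces each part of the Poisson example to arithmetic on means and variances. The following is a minimal sketch of that arithmetic; the function name `linear_combination_params` and the sample parameters $\mu_1 = 2$, $\mu_2 = 3$ are illustrative assumptions, not values from the text.

```python
def linear_combination_params(a, mean_x, var_x, b, mean_y, var_y):
    """Return (mean, variance) of aX + bY for independent X and Y (Theorem 2)."""
    mean = a * mean_x + b * mean_y
    variance = a ** 2 * var_x + b ** 2 * var_y
    return mean, variance


# A Poisson variable has mean and variance both equal to its parameter.
mu1, mu2 = 2.0, 3.0  # hypothetical parameters chosen only for illustration

for label, a, b in [("X + Y", 1, 1), ("X - Y", 1, -1), ("4X - 3Y", 4, -3)]:
    mean, variance = linear_combination_params(a, mu1, mu1, b, mu2, mu2)
    verdict = "equal" if mean == variance else "not equal"
    print(f"{label}: mean = {mean}, variance = {variance} ({verdict})")
```

Only for $X + Y$ do the mean and the variance agree, which matches the observation that the other two combinations cannot be Poisson.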
In a similar situation we ask ourselves what is going to happen if two random variables have equal expected values, $\mu_1 = \mu_2$, and equal variances, $\sigma_1^2 = \sigma_2^2$. In this case we conduct a slightly different calculation. In the following example, we use the expected value and variance of one random variable to investigate whether or not we can conclude that $2X = X + X$.

Example 7
A box contains 3 black and 2 white marbles. Mia draws one marble from the box, notes the colour, and then returns it back to the box. Random variable X denotes the number of white marbles that Mia draws.
a Find the expected value and variance of X.
b Mia's uncle says that he will give Mia €2 for each white marble she draws. Random variable Y represents the amount of money Mia receives from her uncle. Find the expected value and variance of Y.
c Mia's brother, Theo, draws a marble from the same box, notes the colour, and then replaces it. Afterwards, Mia draws a marble, notes the colour, and replaces it. Their aunt says she will give them €1 for each white marble they draw. Random variable Z denotes the amount of money they will get from their aunt. Find the expected value and variance of Z.
d Given that we can write $Y = 2X$ and $Z = X + X$, what do you notice about the parameters of Y and Z?

a Draw the probability distribution table for the variable X.

$x_i$          0     1
$P(X = x_i)$   3/5   2/5

$E(X) = \frac{2}{5}$, $\mathrm{Var}(X) = \frac{2}{5} - \left(\frac{2}{5}\right)^2 = \frac{6}{25}$

b Draw the probability distribution table for the variable Y and fill in the probabilities.

$y_i$          0     2
$P(Y = y_i)$   3/5   2/5

$E(Y) = \frac{4}{5}$, $\mathrm{Var}(Y) = \frac{8}{5} - \left(\frac{4}{5}\right)^2 = \frac{24}{25}$

c Draw the probability distribution table for the variable Z and fill in all the values. We are actually adding the number of white marbles they draw: $(Z = 0) \leftrightarrow (0 + 0)$, $(Z = 1) \leftrightarrow (1 + 0)$ or $(0 + 1)$, $(Z = 2) \leftrightarrow (1 + 1)$.

$z_i$          0      1       2
$P(Z = z_i)$   9/25   12/25   4/25

$P(Z = 0) = \frac{3}{5} \times \frac{3}{5} = \frac{9}{25}$
$P(Z = 1) = \frac{3}{5} \times \frac{2}{5} + \frac{2}{5} \times \frac{3}{5} = \frac{12}{25}$
$P(Z = 2) = \frac{2}{5} \times \frac{2}{5} = \frac{4}{25}$

$E(Z) = 0 \times \frac{9}{25} + 1 \times \frac{12}{25} + 2 \times \frac{4}{25} = \frac{20}{25} = \frac{4}{5}$
$\mathrm{Var}(Z) = 0 + \frac{12}{25} + \frac{16}{25} - \left(\frac{4}{5}\right)^2 = \frac{12}{25}$

d $E(Y) = E(2X) = E(Z) = E(X + X)$, but $\mathrm{Var}(2X) = \frac{24}{25} \ne \frac{12}{25} = \mathrm{Var}(X + X)$.
Notice that although Y and Z have equal expected values, they don't have equal variances. Thus, we conclude that $\mathrm{Var}(2X) \ne \mathrm{Var}(X + X)$. Notice also that $\mathrm{Var}(Z) = \mathrm{Var}(X) + \mathrm{Var}(X)$.

Theorem 4 will generalize the conclusion we reached in Example 7: for random variables in general, $2X \ne X + X$.

Theorem 4: Given that independent random variables $X_1, X_2, X_3, \dots, X_n$, $n \in \mathbb{Z}^+$, all have equal expected value $E(X_i) = \mu$ and equal variance $\mathrm{Var}(X_i) = \sigma^2$, then:
$E(X_1 + X_2 + \dots + X_n) = E(X_1) + E(X_2) + \dots + E(X_n) = \underbrace{\mu + \mu + \dots + \mu}_{n \text{ terms}} = n\mu$
and
$\mathrm{Var}(X_1 + X_2 + \dots + X_n) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \dots + \mathrm{Var}(X_n) = \underbrace{\sigma^2 + \sigma^2 + \dots + \sigma^2}_{n \text{ terms}} = n\sigma^2$

When adding n random variables which have equal expected value, the expectation of the sum is equal to the expectation of the variable multiplied by n, i.e.
$E(X_1 + X_2 + \dots + X_n) = n\mu = E(nX)$.
The variance of the sum, however, is not equal to the variance of $nX$, which is a linear transformation of a single random variable (for $n \ge 2$):
$\mathrm{Var}(X_1 + X_2 + \dots + X_n) = n\sigma^2 \ne n^2\sigma^2 = \mathrm{Var}(nX)$.
Therefore, by comparing parameters, we conclude that for a random variable X:
$\underbrace{X + X + \dots + X}_{n \text{ terms}} \ne nX$.

Example 8
Anna and Sarah independently draw counters from two different boxes. Anna's box contains the counters 1, 2, 3, 4, and 5. Sarah's box contains the counters 1, 1, 1, 2, 2, and 3. Let the random variables X and Y represent the value of the counters drawn by Anna and Sarah respectively.
a Draw the probability distribution tables and find the expected values of the variables X and Y.
b Sarah's father is going to give Anna as much money as the value of her counter multiplied by the value of the counter drawn by Sarah. Let the random variable Z represent the amount of money Anna will earn in this game. Draw the probability distribution table and find the expected value of the variable Z.
c Suggest a relationship between the variables X, Y, and Z.

a Notice that X follows a uniform distribution since we have one counter of each value.

$x_i$          1     2     3     4     5
$P(X = x_i)$   1/5   1/5   1/5   1/5   1/5

$E(X) = \frac{1}{5} \times (1 + 2 + 3 + 4 + 5) = \frac{1}{5} \times 15 = 3$

There are only three different values on the counters in Sarah's box: 3 out of 6 counters have the value "1", 2 out of 6 counters have the value "2", and 1 out of 6 counters has the value "3".

$y_i$          1     2     3
$P(Y = y_i)$   1/2   1/3   1/6

$E(Y) = 1 \times \frac{1}{2} + 2 \times \frac{1}{3} + 3 \times \frac{1}{6} = \frac{5}{3}$

b We need to look at all the possible pairs and their products. There are altogether 11 different outcomes, and since the drawings are independent we multiply their probabilities. Some of the pairs give the same product, so we have to add their probabilities. For example:
1: (1, 1) → $\frac{1}{5} \times \frac{1}{2} = \frac{1}{10}$
2: (1, 2) or (2, 1) → $\frac{1}{5} \times \frac{1}{3} + \frac{1}{5} \times \frac{1}{2} = \frac{1}{6}$
3: (1, 3) or (3, 1) → $\frac{1}{5} \times \frac{1}{6} + \frac{1}{5} \times \frac{1}{2} = \frac{2}{15}$
4: (2, 2) or (4, 1) → $\frac{1}{5} \times \frac{1}{3} + \frac{1}{5} \times \frac{1}{2} = \frac{1}{6}$

$z_i$          1      2     3      4     5      6      8      9      10     12     15
$P(Z = z_i)$   1/10   1/6   2/15   1/6   1/10   1/10   1/15   1/30   1/15   1/30   1/30

To simplify the calculation of the expectation we can use a GDC.
[GDC spreadsheet: the values of Z in column A, their probabilities in column B, and the expected value calculated in cell C1 with =mean(a1:a11, b1:b11).]
$E(Z) = 5$

c $Z = XY$. Notice that $E(Z) = 5 = 3 \times \frac{5}{3} = E(X) \times E(Y)$, that is, $E(XY) = E(X) \times E(Y)$.

Exercise 2C

1 Six unbiased coins are flipped and the number of heads obtained is recorded.
a Let X denote the variable representing the outcome of one coin flip. Write the probability distribution table of X and find the expected value and variance.
b Explain why the whole experiment can be written as $Y = X + X + X + X + X + X$, where Y is a random variable representing the number of heads obtained when flipping six coins.
c Find the expected value and variance of the random variable Y.
d Use the empirical rule to determine all possible outcomes of Y, and comment on these outcomes.

2 An unbiased die is rolled four times, and the number of multiples of 3 obtained is recorded.
a Let X denote the variable representing one roll of the die. Write the probability distribution table for X, and find the expected value and variance.
b Explain why the whole experiment can be written as $Y = X + X + X + X$, where Y is a random variable representing the outcomes of rolling the die four times.
c Find the expected value and variance of the random variable Y.

3 Random variable X has the parameters $\mu = 3$ and $\sigma^2 = 4$.
a Find the expected value and variance of $X + X + X$.
b Find the expected value and variance of $3X$.
c Show that $\mathrm{Var}(X + X + X + 3X) \ne \mathrm{Var}(6X)$.

Random 4 variables X and Y have 2 = 2, = 1 X a X b Y c X the + + Y + 4 and 1, 1, 2, Shikma + X + X the a + Y variance + regular Y + of 3 respectively . values of the Y tetrahedral independently of the is die rolling Let a with outcomes faces biased random faces of distribution variables going the this Draw probability of number in is and the die 1, with variables X obtained by 2, 3, and faces and Shariza Y and to X and give concert tickets and find the expected Y them obtained. tables as Let many concert random Shariza and tickets variable Z Shikma as the represent are going to game. the value of probability the Independent 6 X; X friend product b + and respectively . values earn X rolling 2, Draw the value Y; + is Y + Shikma 1, Their X + represent a + X parameters = 3. = 5, Y expected X Shariza 5 and X Find the 2 variable random distribution Z and table comment variables X and Y and on find your have the expected result. the expected values 3 and + non-zero values E ( XY ) = find all possible that respectively .

A linear combination of independent normal random variables

In this section we are going to look at the particular case of a linear combination of normal random variables. We start with a corollary that is a consequence of Theorem 2.

Corollary to Theorem 2: If we take two independent normal random variables $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$, then the linear combination $aX + bY$, $a, b \in \mathbb{R}$, is a normally distributed variable. The expected value and variance of $aX + bY$ are $\mu = a\mu_1 + b\mu_2$ and $\sigma^2 = a^2\sigma_1^2 + b^2\sigma_2^2$, so we can write
$aX + bY \sim N(a\mu_1 + b\mu_2,\; a^2\sigma_1^2 + b^2\sigma_2^2)$.

Note: The coefficients a and b cannot both be equal to 0 (we write this condition as $a^2 + b^2 \ne 0$).

Example 9
Given the independent normal random variables $X \sim N(3, 0.25)$ and $Y \sim N(4, 0.64)$, find the probabilities of the following:
a $X + Y > 8$;
b $Y - X \le 0.5$;
c $3X - 2Y < 0$.

Apply the corollary above and calculate the probabilities by using a GDC.

a $X + Y \sim N(3 + 4,\; 0.25 + 0.64) = N(7, 0.89)$
$P(X + Y > 8) = 0.145$

b $Y - X \sim N(4 - 3,\; 0.64 + 0.25) = N(1, 0.89)$
$P(Y - X \le 0.5) = 0.298$

c $3X - 2Y \sim N(3 \times 3 - 2 \times 4,\; 3^2 \times 0.25 + 2^2 \times 0.64) = N(1, 4.81)$
$P(3X - 2Y < 0) = 0.324$

GDC Scratchpad:
normCdf(8, 100, 7, √0.89)      0.144573
normCdf(-100, 0.5, 1, √0.89)   0.298056
normCdf(-100, 0, 1, √4.81)     0.3242

Notice that in part a the upper boundary is 100 and in parts b and c the lower boundaries are −100, which are sufficient values to represent +∞ and −∞ respectively.
Example 10
There are two classes at a school. One class studies the Mathematics Standard Level (SL) course, whilst the other class studies the Mathematics Higher Level (HL) course. The final grades of both classes are normally distributed, with the HL class having a mean of 5.2 and a standard deviation of 1.37, whilst the SL class has a mean of 4.8 and a standard deviation of 1.85. The HL class claims that they have obtained a better result than the SL class. Find the probability that this statement is true.

Denote the two classes HL and SL by the random variables X and Y:
$X \sim N(5.2, 1.37^2)$, $Y \sim N(4.8, 1.85^2)$
$X > Y \Rightarrow X - Y > 0$
Find the parameters of the difference of the random variables:
$X - Y \sim N(5.2 - 4.8,\; 1.37^2 + 1.85^2) = N(0.4, 5.2994)$
$P(X - Y > 0) = 0.569$

GDC Scratchpad:
normCdf(0, 100, 0.4, √5.2994)   0.5690

Since the probability is greater than 0.5 we can conclude that the HL class have obtained a better result than the SL class.

The corollary to Theorem 2 can now be generalized for n independent normal random variables.

Corollary to Theorem 3: If we take n independent normal random variables $X_1 \sim N(\mu_1, \sigma_1^2)$, $X_2 \sim N(\mu_2, \sigma_2^2)$, ..., $X_n \sim N(\mu_n, \sigma_n^2)$, $n \in \mathbb{Z}^+$, then the linear combination
$a_1X_1 + a_2X_2 + \dots + a_nX_n = \sum_{k=1}^{n} a_kX_k$, $a_k \in \mathbb{R}$, $\sum_{k=1}^{n} a_k^2 \ne 0$,
is also a normal variable. The parameters are $\mu = \sum_{k=1}^{n} a_k\mu_k$ and $\sigma^2 = \sum_{k=1}^{n} a_k^2\sigma_k^2$ respectively, so we can write
$\sum_{k=1}^{n} a_kX_k \sim N\Big(\sum_{k=1}^{n} a_k\mu_k,\; \sum_{k=1}^{n} a_k^2\sigma_k^2\Big)$.

The proof of this theorem can be done by mathematical induction. It is left as an exercise for the student.
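The corollary can be checked away from the GDC. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `normal_linear_combination` is a hypothetical choice for illustration, and the data are those of the HL and SL classes above (means 5.2 and 4.8, standard deviations 1.37 and 1.85).

```python
from math import sqrt
from statistics import NormalDist


def normal_linear_combination(coeffs, dists):
    """Distribution of sum(a_k * X_k) for independent normal variables X_k."""
    mean = sum(a * d.mean for a, d in zip(coeffs, dists))
    variance = sum(a ** 2 * d.variance for a, d in zip(coeffs, dists))
    return NormalDist(mean, sqrt(variance))


hl = NormalDist(5.2, 1.37)  # NormalDist takes the standard deviation
sl = NormalDist(4.8, 1.85)

diff = normal_linear_combination([1, -1], [hl, sl])  # X - Y
p = 1 - diff.cdf(0)                                  # P(X - Y > 0)
print(round(p, 3))  # 0.569
```

Passing coefficients and distributions as lists keeps the same helper usable for any number of independent normal variables, as in the corollary to Theorem 3.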
Example 11
Given the following normal random variables $X \sim N(2, 2.25)$, $Y \sim N(4, 1.44)$, $Z \sim N(-3, 0.4)$, and $W \sim N(0.5, 0.64)$, calculate the following:
a $P(X - Y - Z > 0)$;
b $P(2X - Z < 4W)$;
c $P(2X + Y > -3Z - W)$;
d $P(X + 3Y < 2W - Y)$.

a Find the mean and variance of the new variable and use a GDC to find the probability.
$X - Y - Z \sim N(2 - 4 + 3,\; 2.25 + 1.44 + 0.4) = N(1, 4.09)$
$P(X - Y - Z > 0) = 0.690$

b Rewrite the inequality so that the random variables are on one side, then find the mean and variance of the new variable and use a GDC to find the probability.
$P(2X - Z < 4W) = P(2X - Z - 4W < 0)$
$2X - Z - 4W \sim N(2 \times 2 - (-3) - 4 \times 0.5,\; 4 \times 2.25 + 0.4 + 16 \times 0.64) = N(5, 19.64)$
$P(2X - Z - 4W < 0) = 0.130$

c $P(2X + Y > -3Z - W) = P(2X + Y + 3Z + W > 0)$
$2X + Y + 3Z + W \sim N(4 + 4 - 9 + 0.5,\; 4 \times 2.25 + 1.44 + 9 \times 0.4 + 0.64) = N(-0.5, 14.68)$
$P(2X + Y + 3Z + W > 0) = 0.448$

d $P(X + 3Y < 2W - Y) = P(X + 3Y - 2W + Y < 0)$
$X + 3Y - 2W + Y \sim N(2 + 12 - 1 + 4,\; 2.25 + 9 \times 1.44 + 4 \times 0.64 + 1.44) = N(17, 19.21)$
$P(X + 3Y - 2W + Y < 0) = 0.0000525$
Notice that we haven't simplified the expression $3Y + Y$ because, for random variables, $3Y + Y \ne 4Y$: the mean values are the same, but the variances of $3Y + Y$ and $4Y$ are different.

GDC Scratchpad:
normCdf(0, 100, 1, √4.09)       0.689512
normCdf(-100, 0, 5, √19.64)     0.129611
normCdf(0, 100, -0.5, √14.68)   0.448086
normCdf(-100, 0, 17, √19.21)    0.000053

Example 12
Bill, Jill, and John are going to a steakhouse. There are three types of steak they can order: X-Large, Large, and Small. The weights of all the steaks are normally distributed and the parameters are given in the following table.

Steak     Average weight   Standard deviation
X-Large   430 g            30 g
Large     315 g            22 g
Small     150 g            8 g

Bill orders one X-Large steak, Jill orders one Large steak, and John orders one Small steak. Find the probability that Bill will get a steak heavier than Jill's and John's steaks put together.

Denote the weights of X-Large, Large, and Small steaks by the random variables X, L, and S respectively:
$X \sim N(430, 30^2)$, $L \sim N(315, 22^2)$, $S \sim N(150, 8^2)$
Rewrite the inequality in a simpler form and construct a new variable:
$P(X > L + S) = P(L + S - X < 0)$
Find the parameters of the new variable:
$L + S - X \sim N(315 + 150 - 430,\; 22^2 + 8^2 + 30^2) = N(35, 1448)$
$P(L + S - X < 0) = 0.179$

GDC Scratchpad:
normCdf(-1000, 0, 35, √1448)   0.178844

Example 13
Every day two wrestlers, Kenji and Kazuyoshi, go for a snack at a pizza place. Kenji orders one Large pizza, and Kazuyoshi orders one Jumbo pizza. Since Kazuyoshi always needs more time to finish his pizza, Kenji gets hungry again and orders one Small pizza. The weights of all sizes of pizza are normally distributed. Large pizzas have a mean of 900 g and a variance of 25 g², and Jumbo pizzas are 1.5 times the weight of Large pizzas. Small pizzas have a mean of 440 g and a variance of 10 g².
a Find the mean and the variance of the weights of Jumbo pizzas.
b Find the probability that on a given day Kenji will eat more pizza by weight than Kazuyoshi.
c Find the probability that Kenji will eat more pizza by weight than Kazuyoshi during a three day period.

Let L, J, and S represent the random variables of the weights of Large, Jumbo, and Small pizzas.

a Use the formulas $E(aX) = aE(X)$ and $\mathrm{Var}(aX) = a^2\,\mathrm{Var}(X)$:
$J = 1.5 \times L \Rightarrow J \sim N(1.5 \times 900,\; 1.5^2 \times 25) = N(1350, 56.25)$

b Let D represent the difference between the daily weights of pizza that Kenji and Kazuyoshi eat. Use a GDC to find the probability that $D > 0$.
$D = L + S - J \Rightarrow D \sim N(900 + 440 - 1.5 \times 900,\; 25 + 10 + 1.5^2 \times 25) = N(-10, 91.25)$
$P(D > 0) = 0.148$

c Let X represent the difference between the weights during a three day period, the sum of three independent daily differences:
$X = D_1 + D_2 + D_3 \Rightarrow X \sim N(3 \times (-10),\; 3 \times 91.25) = N(-30, 273.75)$
$P(X > 0) = 0.0349$

GDC Scratchpad:
normCdf(0, 1000, -10, √91.25)    0.147585
normCdf(0, 1000, -30, √273.75)   0.034901

Exercise 2D

1 Given the following independent normal random variables $X \sim N(0, 1)$, $Y \sim N(1, 0.16)$, $Z \sim N(2, 0.25)$, and $W \sim N(3, 1.21)$, calculate the following:
(Y a P c P(3 X e P( X − Z − 4Z  < 0 Z   3X W ) − 2Z ) ) ( b P d P( X f P(W and X  Y  Y   W 2Y  2Y  W ) )  3 W  0 ) Z   3Z − W  Y

2 A market stall sells beetroot and sweet potato. The weights, in grams, are assumed to be normally distributed and the mean values and standard deviations are given in the table below.

               Mean    Standard deviation
Beetroot       240 g   20 g
Sweet potato   730 g   50 g

a Find the probability that the weight of a randomly chosen sweet potato is more than three times the weight of a randomly chosen beetroot.
b Find the probability that the weight of two randomly chosen sweet potatoes and four randomly chosen beetroots will exceed 2.5 kg.
deviation given to works within to on and taken below these Standard Find tram times table Mean a a The nor mally the deviations building day Lillian will take hospital. days she a week, will nd spend including her the more than jour ney 4 home again). Find c the the time than probability taken three by times that, Lillian the on to time a given get to taken day , the by four times hospital Veronica to is more get to the Statistical hospital. ver y impor tant researchers 4 Dominic grows apples on his farm. It may be assumed of the apples are normally distributed with a in 255 grams and a standard deviation of 12 in the probability that a randomly chosen apple than 250 have of these apples are selected at random. Find in to that than 1 the total weight of the four apples also kilogram. grows the the plums are plums. normally and a standard It may be assumed that Find the distributed with a mean of deviation of 2 probability that the is more than four is the if it is the of a randomly weight of a not probability that the weight of a randomly then is more than the weight of four randomly chosen plums. since be if distribution of the mean core course we studied population and a a population sample 64 is In Expectation sample includes all the members of a just a representative selection dened or order algebra to be and able Central to predict subset Limit something Theorem were We to it the be technique, type must other about to of take factors in such said the duration of the group of tested. comes the researchers samples. from entire when concerns and the resources population. results the sampling ethical and be selecting as that the population However , Sampling the the from different account In meaningful cannot chosen results 2.2 the plum. the apple of randomly will Find d the chosen obtained chosen of grams. weight times these 60 drawn, apple they small weights conclusions c of representativeness population grams a population. 
concern representative of study is sample: Dominic a select from main researchers greater draw the The probability to entire grams. sample Four b the weighs may more order for grams. population Find a for many mean conclusions of is that disciplines; weights sampling the available. study, population from is where to a sample be We are The in selected the For there The is a in mean Let’s a the ideas it of to the sample to be is in of of sample. has the taken of we is not nails an A equal of a of random sample oppor tunity a normal have of check simple is are mean so in far role for one, we a in nails. within sample of the a the standard can consider variables. the it and generally so since acceptable. and random nails learned to Taking independently , weights impor tant manufactured nails the population. applications. range sample. change a individually , weight take an decides weights population can plays nails the from practical nails what independent what sampling lot all doesn’t deviation ourselves a weigh decide population sample of checking entire a random population distribution has variation to have the sample. manufacturer The to of manufacturer needs standard remind as process whether large the and the some impractical population. nails of sampling, The need probability considerable range, from par t suppose ver y consider size of we element develop manufacturer weight the a be to nails. be sample ever y normal example, irregular To of theor y would to going study a correct certain deviation the However, we have independent random X variables , X 1 what , ..., the study of expectation X N (μ , σ  algebra. + ), n ∈  , then n n n ⎛ E is of sample? the 2 of weights 2 If xed ⎞ ⎜ ∑ ⎝ i =1 ⎛ = X i nμ and ⎞ Var ⎟ ⎜ ⎠ ⎝ ∑ i 2 nσ = X i ⎟ ⎠ =1 − Now , however, random We’ll we’ll variable look at consider within the a sample a variable X random mean sample and its which of is an size k, independent where k ≤ normal n. parameters. 
i $E(\bar{X}) = E\left(\frac{\sum_{i=1}^{n} X_i}{n}\right) = \frac{1}{n}E\Big(\sum_{i=1}^{n} X_i\Big) = \frac{1}{n} \times n\mu = \mu$

ii $\mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{\sum_{i=1}^{n} X_i}{n}\right) = \frac{1}{n^2}\mathrm{Var}\Big(\sum_{i=1}^{n} X_i\Big) = \frac{1}{n^2} \times n\sigma^2 = \frac{\sigma^2}{n}$

If the population variance is $\sigma^2$ and samples of size n are taken, then the distribution of the sample mean of these samples will have the same mean value as the population and the variance $\frac{\sigma^2}{n}$:
$X \sim N(\mu, \sigma^2) \Rightarrow \bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$

The term standard error is used to describe the quantity $\frac{\sigma}{\sqrt{n}}$, which is the standard deviation of the sample mean, that is, the typical distance between the mean of a sample of size n and the population mean. By looking at the formula we notice that the larger the sample we take, the smaller the standard error we obtain.

Example 14
The weight of trout in a fish farm may be assumed to be normally distributed with a mean of 340 g and a standard deviation of 30 g.
a Find the probability that a catch of 10 trout will have a mean weight per fish of more than 370 g.
b A hand net can hold up to 7 kg. Find the probability that we will be able to catch 20 trout in the hand net without breaking it.

a Use the formula for finding the mean and standard error of the sample distribution.
$X \sim N(340, 30^2) \Rightarrow \bar{X} \sim N\left(340, \frac{30^2}{10}\right) = N\left(340, \left(\frac{30}{\sqrt{10}}\right)^2\right)$
$P(\bar{X} > 370) = 0.000783$

GDC Scratchpad:
normCdf(370, 1000, 340, 30/√10)   0.000783

b Method I
$\bar{Y} \sim N\left(340, \left(\frac{30}{\sqrt{20}}\right)^2\right)$
$P\left(\bar{Y} < \frac{7000}{20}\right) = P(\bar{Y} < 350) = 0.932$

Method II
$\bar{Y} \sim N\left(340, \left(\frac{30}{\sqrt{20}}\right)^2\right) \Rightarrow T = 20\bar{Y} \Rightarrow T \sim N\left(6800, \left(30\sqrt{20}\right)^2\right)$
$P(T < 7000) = 0.932$

Use the GDC to find the solution.
GDC Scratchpad:
normCdf(-100, 350, 340, 30/√20)      0.931981
normCdf(-100, 7000, 6800, 30·√20)    0.931981

Example 15
In a nail factory, the weights of all manufactured nails follow a normal distribution with a mean value of 8.2 g and a standard deviation of 0.02 g. Suppose that a control scale weighs samples of 4 nails and the mean weight is recorded.
a Find the parameters of the sample mean distribution.
b By using the empirical rule, find the interval of weights that would be acceptable for all nails.

a Use the formula for finding the mean and standard error of the sample distribution.
$X \sim N(8.2, 0.02^2) \Rightarrow \bar{X} \sim N\left(8.2, \frac{0.02^2}{4}\right) = N\left(8.2, \left(\frac{0.02}{2}\right)^2\right)$

b Find the values which are 3 standard deviations above and below the mean.
$8.2 \pm 3 \times 0.01 = 8.2 \pm 0.03$
We accept all the samples of 4 nails that have a mean weight between 8.17 g and 8.23 g.

The empirical rule in statistics states that almost all (99.7%) of the observations fall within the range of three standard deviations from the mean value.

Example 16
Manufactured screws follow a normal distribution with a mean length of 3 cm and a variance of 0.04 cm². What size of sample should be taken to be 99% certain that the mean of the sample will lie within 0.1 cm of the population mean length?

Use the formula for finding the mean and standard error of the sample distribution.
$X \sim N(3, 0.04) \Rightarrow \bar{X} \sim N\left(3, \frac{0.04}{n}\right) = N\left(3, \left(\frac{0.2}{\sqrt{n}}\right)^2\right)$
$P(2.9 \le \bar{X} \le 3.1) = 0.99$

Method I
Transform the variable into the standard normal variable $Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$:
$P\left(\frac{2.9 - 3}{0.2/\sqrt{n}} \le Z \le \frac{3.1 - 3}{0.2/\sqrt{n}}\right) = 0.99 \Rightarrow P\left(-\frac{\sqrt{n}}{2} \le Z \le \frac{\sqrt{n}}{2}\right) = 0.99$
Notice that the boundaries are symmetrical about the mean value, so we can apply the property of the standard normal curve:
$P(-a < Z < a) = 1 - \alpha \Rightarrow P(Z < a) = 1 - \frac{\alpha}{2}$
$P\left(Z \le \frac{\sqrt{n}}{2}\right) = 0.995$
Apply the inverse function:
$\frac{\sqrt{n}}{2} = \Phi^{-1}(0.995) \Rightarrow \sqrt{n} = 2 \times 2.57583 = 5.15166$
$n = 5.15166^2 = 26.5396 \approx 27$
Notice that the sample size must be a positive integer, therefore we round it up to 27.

Method II
Find the solution by using the graphing function features on your GDC.
1 0.2 f1(x)=normCdf 0.5 2.9, 3.1, 3, ( √x ) –0.99 (26.5, 0) x 0 5 10 15 20 25 30 –1 n = 27 Chapter 2 67 Exercise 2E 2 1 Given that ∼ N( μ , σ X population, and ) the sample size n is taken from the nd: 2 ( a P 1 ≤ b P(|X c P(|X | ≥ ≤ 3 X ) = 1.5, if = 4, n = 10; = 9, n = 7; 2 − 5 ≤ 3) if = 5, 2 2 Sleeping 0.8) habits distributed standard were 3 A 4 less the mean ( Find P b Find the of the The Limit Moivre. extended Bessel the 35 mean hours. Find 5 is and 15 are value of students the found 5.5 to hours from probability taken the that in mean a the from the standard total large value The Almost of of the that be and the university on normal deviation sample shopping of 62.5 kg led 12 states that an 12 a value centre and the maximum average people will that people Limit a not the will average they distribution 6. is are more than 180. distributed standard deviation weight exceed total not load of 800 kg. randomly 70 kg. weight exceed a of of the the randomly maximum load to Expectation one centur y work. due Theorem mathematicians of later the the rst Later of including Central of the Limit worked about Laplace on, von Mises, by CLT in in was Theorem in published Central all of Pólya it and Poisson, named theor y. Lindeberg, and general Markov, setting. and the probability Chebyshev, a the Abraham Dirichlet, George CLT on theorems it Cauchy, the contributions publication have impor tant write period impor tance Finally, and to this contributions. to CLT . algebra most Pierre-Simon During mathematicians the the mathematician impor tant on many (CLT), rst Moivre’ s other working that probability histor y “central” Lyapunov label group Theorem de Pólya, a Central made theorem 68 university elevator. mathematics. were a = 11 ) the group Throughout with the size probability Calculate selected de of buyers has the selected 2.3 ≥ 37 with elevator Find b 1.2 at n hours. value of = 8, 18.5 kg. An a of probability weights normally of X with random. 
5 sample a The at than −0.2, students deviation random with of normally selected sleep = if Along Lévy and

So far we have been taking samples only from normal populations. Now let's take an example that doesn't follow a normal distribution. Using a spreadsheet, we have simulated throwing a tetrahedral die. We have performed repetitions of a set of four throws (shown in column A below), and calculated the mean value of each set of 4 throws (shown in column B below). We have then repeated this process five more times (shown in columns D to Q).

[Spreadsheet: simulated throws and sample means for n = 4, six runs in column pairs A-B, D-E, G-H, J-K, M-N, and P-Q, with the mean and standard deviation of the sample means in the last two rows.]

We know that a random variable X that describes the outcomes when throwing a tetrahedral die has the following probability distribution.

$x_i$          1     2     3     4
$P(X = x_i)$   1/4   1/4   1/4   1/4

We can see that
$E(X) = \frac{1}{4}(1 + 2 + 3 + 4) = \frac{5}{2}$ and
$\mathrm{Var}(X) = \frac{1}{4}(1 + 4 + 9 + 16) - \left(\frac{5}{2}\right)^2 = \frac{5}{4} \Rightarrow \sigma = \frac{\sqrt{5}}{2} \approx 1.118$.

In each repetition of the group of four throws, we notice that the sample mean values are in the range [2.175, 2.675] and the standard deviations are in the range [0.317324, 0.643342]. The expected value of X, which is 2.5, is in the range of the sample mean values, while the standard deviation of X, which is around 1.12, is not even in the range obtained for the standard deviations themselves. We also notice that the standard error is
$n = 4 \Rightarrow \frac{\sigma}{\sqrt{n}} = \frac{1.118033989}{2} = 0.559017$,
and this value is in the range [0.317324, 0.643342].

We are going to repeat the simulation for the sample sizes n = 10 and n = 50. We can see the results in the two spreadsheets below.

n = 10:
[Spreadsheet: simulated throws and sample means for n = 10, six runs. Summary rows:]
mean =   2.53       2.56       2.46       2.54       2.62       2.44
s.d. =   0.275076   0.359629   0.392145   0.440202   0.285968   0.295146

In each repetition of ten throws, we notice that the sample mean lies within a range of [2.44, 2.62]. This is narrower than the range of the sample mean for repetitions of four throws. The standard error is
$n = 10 \Rightarrow \frac{\sigma}{\sqrt{n}} = \frac{1.118033989}{\sqrt{10}} = 0.35355$,
which is in the range of standard deviations [0.275076, 0.440202].

n = 50:
[Spreadsheet: simulated throws and sample means for n = 50, six runs. Summary rows:]
mean =   2.514      2.552      2.442      2.508      2.444      2.516
s.d. =   0.224707   0.100311   0.114095   0.154402   0.101893   0.116924

In each repetition of 50 throws we notice that the sample mean lies within an even narrower range, [2.442, 2.552], than the sample mean for repetitions of four or ten throws. The standard error for n = 50 is given by
$n = 50 \Rightarrow \frac{\sigma}{\sqrt{n}} = \frac{1.118033989}{\sqrt{50}} = 0.15811$,
which is in the range of standard deviations [0.100311, 0.224707].

Our conclusion is that the larger the samples we take, the closer the parameters of the sample mean are getting to the population mean and the standard error.
characteristics ● most deviation divided size n, large the by the the of samples factor standard number better of that is is propor tional close to the to value the of standard the deviation square root of of the the error. samples approximated by a of size n, normal the distribution of the means of each sample distribution. The simulation of the y 3 f(x) sampling n = process 50 Central 2 can be distribution and the Limit Theorem found at http://onlinestatbook. 1 n = 10 com/stat_sim/ n = 4 sampling_dist/ 0 x 1 2 3 4 This 5 website both predened uniform, Theorem 5: The Central Limit sampling from any normal, and Theorem skewed When has population X with the nite parameters in distributions, addition to the 2 and , the distribution of the sample mean X is approximately capacity to Normal if the sample size n is large enough. The mean of for create students their distributions, distribution of the sample mean is equal to the population the variance of the distribution of the sample mean is a probability the distribution variance of the parent population divided by the sample by mean graphing and own the function size, and then applying the 2 ⎛ σ process X ∼ N ⎜ ⎝ of sampling μ, n distribution on it. Chapter 2 71 When we restrict say ourselves population such Example From girls a age Find 5 of the as taking a samples normal mean from distributed data collected who live 6 cm. in There probability and Find 65 c to we that a we cer tain are not type going to of population. a in a school large are town two over have spor ts a number of years, a mean height teams made up of of 5 we can 66 cm girls and say and 8 a that standard girls within that group. 65 b population”  medical of deviation age “any the probability and What that the mean height of the team with 5 girls is between that the mean height of the team with 8 girls is between 72 cm. 72 cm. can you conclude about those two teams? 2 ⎛ X a ∼ N ⎜ 6 the parameters of the sample mean. 
5 ⎝ P Find 166, ( 165 ≤ X ≤ 172 ) = 0 Use 633 a GDC to calculate the probability. 2 ⎛ X b ∼ N ⎜ 6 Find 166, P(165 ≤ the parameters of the sample mean. 8 ⎝ X ≤ 172 ) = 0 .679 Scratchpad 6 normCdf ( 0.632632 ) 165, 172, 166, √5 6 normCdf ( 0.678985 ) 165, 172, 166, √8 c With more higher probability height is overall T o closer help weight char ts assess in 72 to the children. a growth other consist illustrate to of a the series This a child whether there Expectation valuable is might be algebra team’s to is at of same percentile can an child’ s doctors Limit height that measurements appropriate Central use Growth cur ves body help a doctors age. determine rate or Theorem Notice team the problems. and a mean height compare the selected tool growing the there development, of of of team, girls. char ts children a mean of child’ s distribution whether in that population percentile and members the that then same if we we get inter val have a more larger of members probability heights. of a over Example 8 2 A population from the with the Find the probability b Find the minimum and = 1 is given. We take a sample of a size n ∼ N ⎜ P(1 .5 ≤ sample X ≤ 2 .5 ) size n given such that n = P(1 .5 ≤ that 20. X ≤ 2 .5 ) is at least 0.9. 1 ⎛ X 2 population. a a = parameters Find the parameters of the sample mean. 2, 20 ⎝ Scratchpad P(1 .5 ≤ X ≤ 2 .5 ) = 0 .975 1 normCdf ( 0.974653 ) 1.5, 2.5, 2, √20 1 ⎛ X b ∼ N ⎜ 2, 1.1 Scratchpad n ⎝ y P(1 .5 ≤ X ≤ 2 .5 ) = 0 .9 ⇒ n = 11 3 1 f1(x)=normCdf 1.5 1.5, 2.5, 2, ( x ) –0.9 (10.8, 0) x 0 5 10 15 20 –1.5 Round Exercise the value of n to the rst larger integer. 2F If the population 2 1 Given size n the population and parameters and the nd: sample distribution symmetrical, smaller is then sample even we are with a going 2 a P(1 .5 ≤ X ≤ 2 .5 ) = if 2, = 9, n = 30; to P(1 .25 ≤ X ≤ 1 .35 ) = 1.3, if a better approximation 2 b achieve = 0.04, n to a normal = 50; distribution. 
When the 2 c P( X ≥ −0 .48 ) = if −0.5, = 1 , n = 100; population distribution symmetrical then to is not achieve 2 d P( X < 397 ) if = 400, = 234, n = 85; a good normal 1 ⎞ ⎛ e P ⎜ X ⎝ < ⎟ ⎜ ⎝ we to a need = 1 , = 6, n = a 40; larger sample. Usually we say 2 ⎠ 1 ⎛ P distribution 2 if that f approximation X − 2 need a sample size 2 if ≥ we = 1.9, = 36, n = 120 . of at least n = 30 to achieve 5 normality of the sampling distribution. Chapter 2 73 2 A in real a estate city €8,000. 3 agent have a Given nd the The gestation a a average than b the 64 the 2 60 the average has days. probability gestation a mean apar tments standard were price Three studio and apar tments dogs 2 €30,000 of doesn’t value female deviation selected at exceed of dogs 63 €32,000. days are of random and selected during that: period of selected dogs will last longer gestation period of selected dogs will last less period of selected dogs will last within days; average days for prices days; average than c Find the of the of studio that period that value 15 deviation period. the mean that probability standard this claims of gestation the expected value. 2 4 A population We take a sample the of size Find the probability b Find the minimum Svetlana jumps and normally ≤ sample a of X 4 is given. population. ≤ 7 .6 ) size n given such that n P(5 ≤ that the with distributed a that average of mean jumps made probability specializing deviation Their Blanka than athletes normally standard 9 cm. competition the are are distributed deviation greater P(5 .4 the = = X 10. ≤ 7) is at QUESTIONS and Blanka’s Find from and exercise EXAM-STYLE 2.02 m n 6 0.95. Review Blanka = parameters a least 1 with ve the average of a the Svetlana’s of and high mean 1.98 m independent. jumps height with 4 cm. value are in height of Svetlana’s value jumps and a made Blanka’s of are standard During Svetlana jump. a four jumps. jumps was jumps. 
2 Let $X \sim \mathrm{Po}(m)$ such that $\mathrm{Var}(2X) = (E(X))^2 - 5$.
a Show that m = 5.
b Hence find $P(X \ge […])$.
Another random variable Y, independent of X, has a Poisson distribution such that $\mathrm{Var}(3Y) = 18$.
c Find […].
Let Z = 3X […] 4Y.
d Find the mean and variance of Z.
e State with a reason whether or not Z has a Poisson distribution.

3 A fish shop sells three types of fish: bass, bream and cod. The weights of these fish may be assumed to be normally distributed. The mean values and standard deviations, in grams, are given in the table below.

| Fish | Mean | Standard deviation |
|---|---|---|
| Bass | 320 | 2 |
| Bream | 400 | 20 |
| Cod | 350 | 5 |

a Find the probability that the total weight of one bass and one cod is less than 670 g.
b Find the probability that […] is less than 450 g.
c Find the probability that the weight of one bream is less than twice the weight of […].
d Nicholas buys […] and Alex buys […]. Calculate the probability that the total weight of Nicholas' fish exceeds the total weight of the fish Alex has bought.

4 a A random variable X follows a Poisson distribution. Given that […], find E(3X […]) and Var(X).
b A random variable Z follows a binomial distribution. Given that n = 27 and […] = 6, find the possible values of […] and calculate Var(2Z + 12) […].

5 Prices of second-hand family cars advertised on a website have a mean value of €12,000 and a standard deviation of €3,200.
a A car is selected at random. Find the probability that the price exceeds €13,000.
b Given that 30 cars are selected at random from the website, find the probability that the average price of those 30 cars will be less than €11,500.
c Vladimir is buying a second-hand family car and he is willing to spend between €10,500 and €12,500. How many cars should he randomly select in order to have a probability of at least 85% that the average price falls within the desirable range?

6 a Let X be a random variable. By expanding the expression $E\left((X - E(X))^2\right)$, show that $E(X^2) \ge (E(X))^2$.
b Given two independent random variables such that $X \sim \mathrm{Geo}(p)$ and $Y \sim \mathrm{Geo}(q)$, where p + q = 1, show that $\mathrm{Var}(X + Y) = E(X + Y)\,(E(X + Y) - 3)$.

Chapter summary

Linear transformation of a single variable

Given that X is a random variable with finite parameters and $a, b \in \mathbb{R}$, then
i) $E(aX + b) = aE(X) + b$
ii) $\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$

Linear transformation of two variables

Given that X and Y are two random variables with finite parameters and $a, b \in \mathbb{R}$, then
i) $E(aX + bY) = aE(X) + bE(Y)$
ii) $\mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y)$, provided that X and Y are independent.

Independent random variables

Given that independent random variables $X_1, X_2, X_3, \ldots, X_n$, $n \in \mathbb{Z}^+$, all have equal expected value and equal variance, $E(X) = \mu$ and $\mathrm{Var}(X) = \sigma^2$, then

$$E(X_1 + X_2 + \ldots + X_n) = n\mu \quad \text{and} \quad \mathrm{Var}(X_1 + X_2 + \ldots + X_n) = n\sigma^2.$$

Independent normal random variables

If we take two independent normal random variables $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$, then the linear combination $aX + bY$, $a, b \in \mathbb{R}$, is going to be a normal variable also. The parameters are $\mu = a\mu_1 + b\mu_2$ and $\sigma^2 = a^2\sigma_1^2 + b^2\sigma_2^2$, so we can write

$$aX + bY \sim N\left(a\mu_1 + b\mu_2,\; a^2\sigma_1^2 + b^2\sigma_2^2\right).$$

Sampling Distribution of the Mean

$$E(\bar{X}) = \mu \quad \text{and} \quad \mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}$$

Normal population:

$$X \sim N(\mu, \sigma^2) \Rightarrow \bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$

The standard deviation of the sample mean, $\frac{\sigma}{\sqrt{n}}$, is also known as the standard error.

The Central Limit Theorem

When sampling from any population with the finite parameters $\mu$ and $\sigma^2$, the distribution of the sample mean $\bar{X}$ is approximately Normal if the sample size n is large enough. The mean of the distribution of the sample means is equal to the population mean, and the variance of the distribution of the sample means is the variance of the parent population divided by the sample size:

$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right).$$
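The sampling distribution of the mean summarised above can be evaluated directly. The sketch below uses Python's standard-library `statistics.NormalDist` in place of the GDC's normCdf. The helper names (`prob_between`, `min_sample_size`) are illustrative assumptions, and the search for the minimum n assumes the interval contains μ, so that the probability grows with n.

```python
from statistics import NormalDist


def sample_mean_dist(mu, sigma, n):
    """Distribution of the sample mean: N(mu, sigma^2 / n)."""
    return NormalDist(mu, sigma / n ** 0.5)


def prob_between(mu, sigma, n, lo, hi):
    """P(lo <= sample mean <= hi) for samples of size n."""
    dist = sample_mean_dist(mu, sigma, n)
    return dist.cdf(hi) - dist.cdf(lo)


def min_sample_size(mu, sigma, lo, hi, target, n_max=10_000):
    """Smallest n with P(lo <= sample mean <= hi) >= target, or None."""
    for n in range(1, n_max + 1):
        if prob_between(mu, sigma, n, lo, hi) >= target:
            return n
    return None


if __name__ == "__main__":
    # Team heights: mu = 166 cm, sigma = 6 cm, teams of 5 and 8 girls
    print(round(prob_between(166, 6, 5, 165, 172), 6))
    print(round(prob_between(166, 6, 8, 165, 172), 6))
    # Population with mu = 2, sigma^2 = 1: smallest n for probability 0.9
    print(min_sample_size(2, 1, 1.5, 2.5, 0.9))
```

The first two values agree with the GDC results 0.632632 and 0.678985 quoted in the worked example, and the last call returns 11, the value obtained there by rounding 10.8 up.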
3 Exploring statistical analysis methods

CHAPTER OBJECTIVES:

7.3 Unbiased estimators and estimates; comparison of unbiased estimators based on variances; $\bar{X} = \sum_{i=1}^{n} \frac{X_i}{n}$ as an unbiased estimator for $\mu$; $S^2 = \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{n - 1}$ as an unbiased estimator for $\sigma^2$.
7.5 Confidence intervals for the mean of a normal population.
7.6 Null and alternative hypotheses, $H_0$ and $H_1$. Significance level. Critical regions, critical values, p-values, one-tailed and two-tailed tests. Type I and II errors, including calculations of their probabilities. Testing hypotheses for the mean of a normal population.

Before you start

You should know how to:

1 Given the following table, find the mean and the variance of the continuous variable X. Answers are obtained by the GDC.

| x | Frequency |
|---|---|
| 0 ≤ x < 10 | 3 |
| 10 ≤ x < 20 | 2 |
| 20 ≤ x < 30 | 2 |
| 30 ≤ x < 40 | 8 |
| 40 ≤ x < 50 | 6 |

2 Given that X ~ B(10, 0.3) find:
a P(X = 5) = 0.103;
b P(X ≤ 6) = 0.989;
c P(1 ≤ X ≤ 3) = 0.621.

3 Given that Y ~ Po(2) find:
a P(Y = 0) = 0.135;
b P(2 ≤ Y ≤ 5) = 0.577;
c P(Y ≥ 4) = 0.143.

Skills check

1 Given the following table, find the mean and the variance of the variable X.

| x | Frequency |
|---|---|
| 0 ≤ x < 2 | 22 |
| 2 ≤ x < 4 | 37 |
| 4 ≤ x < 6 | 46 |
| 6 ≤ x < 8 | 5 |

2 Given that X ~ B(n, p) find:
a P(X = 5) if n = […] and p = $\frac{1}{2}$;
b P(3 ≤ X < 8) if n = 10 and p = […].

3 Given that Y ~ Po(m) find:
a P(Y = 0) if m = 0.4;
b P(3 ≤ Y < 8) if m = 7.

Biased information?

Statistics are often presented in an attempt to add credibility to arguments, proposals, theories, ideologies and advertisements. For many different reasons statistics can be misleading, and advertisements may entice us into decisions that we regret later. By learning about statistical methods and paying attention to how data was obtained, we can make better decisions about the information put in front of us.

Due to advancements in digital technology, computers and calculators can process huge amounts of data, which is extremely useful. However, our task is still to decide whether the information is relevant and balanced. Is the sample a good representation of the entire population? Do different samples give the same values?

Roughly, a statistic is a quantity calculated from sample data that tells us something about the properties of the population from which the sample was drawn. An estimator is a special type of statistic: a way of using sample data to approximate a population parameter. In this chapter we are going to study estimators in detail and analyse, in terms of their properties, how accurately they model the population's parameters.

3.1 Estimators

Suppose that you have to perform a statistical analysis for a company. You have lots of data, but no information about the entire population. How can you start? You can draw conclusions about the population by taking samples from it.

Let's say that you want to estimate a specific parameter, $\theta$, of the population. This parameter has an exact value that you do not know, so you use the data from a sample to obtain a value that approximates it. A random variable T that is used to estimate the parameter $\theta$ is called an estimator, and a specific value of that random variable is called an estimate.

Look at the two normal distributions shown below. Consider these two random variables, S and T, as estimators for the parameter $\mu$.

[Figure: probability density curves of the two estimators S and T, with the value $\mu$ marked.]

Notice that T has a symmetrical shape with respect to the mean value $\mu$, so $E(T) = \mu$, whilst $E(S) \ne \mu$. In this case we say that T is an unbiased estimator for the mean, whilst S is obviously biased and hence should not be used for estimation of the parameter.

In general, a random variable T is called an unbiased estimator for the population parameter $\theta$ if $E(T) = \theta$.

Definition: An estimator of a population parameter (such as the mean, $\mu$, or the variance, $\sigma^2$) is a random variable that depends on sample data and that provides an approximation to this unknown parameter. A specific value of that random variable is called an estimate.

Let us now consider two random variables, $T_1$ and $T_2$, which are both unbiased estimators of the same parameter $\mu$, so that $E(T_1) = \mu$ and $E(T_2) = \mu$, and consider which one might be a better estimator.

[Figure: probability density curves of the two unbiased estimators $T_1$ and $T_2$, both centred at $\mu$.]

By looking at the graphs of both unbiased estimators $T_1$ and $T_2$, notice that $T_1$ has a smaller spread than $T_2$. Thus, the standard deviation (or variance) of $T_1$ is smaller than the standard deviation (or variance) of $T_2$. For a good estimation, it is essential to use a random variable with a small standard deviation.

Definition: Given two unbiased estimators $T_1$ and $T_2$ of the population parameter, we say that $T_1$ is a more efficient estimator than $T_2$ if $\mathrm{Var}(T_1) < \mathrm{Var}(T_2)$.

Example 1

Given two distributions of the sample mean taken from the same normal population, show that the better estimator is the one with the larger sample.

$X_1, X_2, \ldots, X_n \sim N(\mu, \sigma^2),\ n \in \mathbb{Z}^+ \Rightarrow \bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$ (Take n observations and find the parameters of the sample mean.)

$Y_1, Y_2, \ldots, Y_m \sim N(\mu, \sigma^2),\ m \in \mathbb{Z}^+ \Rightarrow \bar{Y} \sim N\left(\mu, \frac{\sigma^2}{m}\right)$ (Take m observations from the same population and find the parameters of the sample mean.)

Assume that $\bar{X}$ is the better estimator:

$\mathrm{Var}(\bar{X}) < \mathrm{Var}(\bar{Y}) \Rightarrow \frac{\sigma^2}{n} < \frac{\sigma^2}{m} \Rightarrow \frac{1}{n} < \frac{1}{m} \Rightarrow n > m$ (Use the definition of a better estimator.)

Therefore the better estimator comes from the larger sample.
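The comparison of efficiency can be checked numerically. The following sketch, with illustrative parameter choices (a standard normal population, sample sizes 5 and 20, and 2000 repetitions, none of which come from the text), estimates the variance of the sample mean for two sample sizes and compares it with $\sigma^2/n$.

```python
import random
import statistics


def variance_of_sample_means(n, mu=0.0, sigma=1.0, repetitions=2000, seed=0):
    """Estimate Var(sample mean) from repeated normal samples of size n."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(repetitions)
    ]
    return statistics.pvariance(means)


if __name__ == "__main__":
    for n in (5, 20):
        estimate = variance_of_sample_means(n)
        print(f"n = {n}: simulated variance = {estimate:.4f}, sigma^2/n = {1 / n:.4f}")
```

The sample mean based on the larger sample has the smaller variance, so in the sense of the definition above it is the more efficient estimator.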
Unbiased estimators for the mean and variance of a normal random variable

Two well-known statistics are the mean and the variance. When creating statistical scenarios we will need to determine unbiased estimators for the mean and the variance, i.e. we will need an unbiased estimator for the mean value of the set of normal random variables, and an unbiased estimator for the variance of the set of normal random variables.

Definition: For normal random variables $X_i \sim N(\mu, \sigma^2)$, $1 \le i \le n$,

$\bar{X} = \sum_{i=1}^{n} \frac{X_i}{n}$ is an unbiased estimator for $\mu$;

$S^2 = \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{n - 1}$ is an unbiased estimator for $\sigma^2$.

A well-defined definition?

We must show that the above definitions for unbiased estimators of $\mu$ and $\sigma^2$ are well defined. To do so, we must show the following:

1 $E(\bar{X}) = \mu$
2 $E(S^2) = \sigma^2$

1 Since all of the samples are taken from the population, $E(X_i) = \mu$, $i = 1, 2, \ldots, n$, so

$$E(\bar{X}) = E\left(\sum_{i=1}^{n} \frac{X_i}{n}\right) = \frac{1}{n} E\left(\sum_{i=1}^{n} X_i\right) = \frac{1}{n} \sum_{i=1}^{n} E(X_i) = \frac{1}{n} \times n\mu = \mu.$$

2 By now we have found out that $E(\bar{X}) = \mu$ and $\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}$, but also $E\left((X_i - \mu)^2\right) = \sigma^2$, $i = 1, 2, \ldots, n$. Therefore

$$\sum_{i=1}^{n} E\left((X_i - \mu)^2\right) = n\sigma^2.$$

Writing $X_i - \mu = (X_i - \bar{X}) + (\bar{X} - \mu)$ and expanding gives

$$\sum_{i=1}^{n} (X_i - \mu)^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 + 2(\bar{X} - \mu) \sum_{i=1}^{n} (X_i - \bar{X}) + n(\bar{X} - \mu)^2.$$

The middle term is zero because $\sum_{i=1}^{n} (X_i - \bar{X}) = 0$, and $E\left((\bar{X} - \mu)^2\right) = \mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}$. Taking expected values of both sides,

$$n\sigma^2 = E\left(\sum_{i=1}^{n} (X_i - \bar{X})^2\right) + n \cdot \frac{\sigma^2}{n} = E\left(\sum_{i=1}^{n} (X_i - \bar{X})^2\right) + \sigma^2$$

$$\Rightarrow E\left(\sum_{i=1}^{n} (X_i - \bar{X})^2\right) = n\sigma^2 - \sigma^2 = (n - 1)\sigma^2.$$

$S^2$ is an unbiased estimator of the population variance because

$$E(S^2) = E\left(\sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{n - 1}\right) = \frac{1}{n - 1} E\left(\sum_{i=1}^{n} (X_i - \bar{X})^2\right) = \frac{1}{n - 1} (n - 1)\sigma^2 = \sigma^2.$$

Thus, the above definitions for unbiased estimators of $\mu$ and $\sigma^2$ are well-defined. In examples 2 and 3, we will look at how to use these definitions.

Example 2

Show that the unbiased estimate of the variance can be found by using the formula

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - n(\bar{x})^2}{n - 1}.$$

Hence show that $s^2 = \frac{n}{n - 1}\sigma^2$, where $\sigma^2$ here denotes the variance of the sample data calculated with divisor n.

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{\sum_{i=1}^{n} \left(x_i^2 - 2x_i\bar{x} + (\bar{x})^2\right)}{n - 1}$$ (Expand.)

$$= \frac{\sum_{i=1}^{n} x_i^2 - 2\bar{x} \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} (\bar{x})^2}{n - 1}$$ (Use the distributive property.)

$$= \frac{\sum_{i=1}^{n} x_i^2 - 2\bar{x} \times n\bar{x} + n(\bar{x})^2}{n - 1} = \frac{\sum_{i=1}^{n} x_i^2 - n(\bar{x})^2}{n - 1}$$ (Simplify the expression by using the definition of the mean.)

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - n(\bar{x})^2}{n - 1} = \frac{n}{n - 1}\left(\frac{\sum_{i=1}^{n} x_i^2}{n} - (\bar{x})^2\right) = \frac{n}{n - 1}\sigma^2$$ (Simplify by using the definition of variance.)

Example 3

Given the following set of data: 22, 24, 23, 22, 26, 26, 27, 25, 24, 24, 26, 25, 26, 27, 28, find:

a an unbiased estimate of the mean;
b an unbiased estimate of the variance.

Method I

a $\bar{x} = \frac{\sum x_i}{n} \Rightarrow \bar{x} = \frac{375}{15} = 25$ (The unbiased estimate of the mean is the mean value of the sample itself.)

b $s^2 = \frac{\sum_{i=1}^{n} x_i^2 - n(\bar{x})^2}{n - 1} \Rightarrow s^2 = \frac{9421 - 15 \times 25^2}{14} = \frac{46}{14} = 3.29$ (Use the short formula for the unbiased estimate of the variance.)

Method II

Use the GDC for the calculation. Enter the data in a list and run One-Variable Statistics. In order to find the unbiased estimate of the population variance, square the standard deviation $s_x$.

GDC output: $\bar{x} = 25$, $\sum x = 375$, $\sum x^2 = 9421$, $s_x = 1.81265$, $s_x^2 = 3.28571$.

$s^2 = 3.29$
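The two methods of the worked example can be mirrored in code. This is a minimal sketch using the fifteen data values of the example (their totals are Σx = 375 and Σx² = 9421). The function name `unbiased_estimates` is an illustrative choice, and `statistics.variance` from the Python standard library plays the role of the GDC, since it also divides by n − 1.

```python
import statistics

data = [22, 24, 23, 22, 26, 26, 27, 25, 24, 24, 26, 25, 26, 27, 28]


def unbiased_estimates(xs):
    """Return (mean, s^2) using the short formula with divisor n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    s_squared = (sum(x * x for x in xs) - n * mean ** 2) / (n - 1)
    return mean, s_squared


if __name__ == "__main__":
    mean, s_squared = unbiased_estimates(data)
    print(mean, round(s_squared, 5))            # short formula (Method I)
    print(round(statistics.variance(data), 5))  # library check (Method II)
```

Both routes give the same value, 46/14 ≈ 3.29, which is a quick way to confirm that the short formula and the definition with divisor n − 1 agree.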
Exercise 3A

1 Calculate the unbiased estimate of the mean and variance for the following sets of data:
a {2, 4, 6, 8, 10, 12, 14, 16, 18, 20}
b {21, 24, 36, 28, 30, 22, 25, 26, 38, 32, 34, 29, 37, 33, 31, 30}
c {1, 4, 7, 10, ..., 133}

2 The distribution of broken eggs in 40 boxes (where each box contains 10 eggs) is given in the following table:

| $x_i$ | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| $f_i$ | 22 | 12 | 4 | 2 |

Find the unbiased estimates of the mean and standard deviation of broken eggs.

3 The following table displays the travel times, T (minutes), taken by a group of 70 students to travel to school.

| T | 0 ≤ t < 10 | 10 ≤ t < 20 | 20 ≤ t < 30 | 30 ≤ t < 60 | 60 ≤ t < 120 |
|---|---|---|---|---|---|
| Frequency | 6 | 13 | 26 | 17 | 8 |

Find:
a an unbiased estimate for the mean of T;
b an unbiased estimate for the standard deviation of T.

3.2 Confidence intervals for the mean

Let's analyze a real-life scenario:

A large poultry farm raises broilers (young chickens) so that 1,000,000 broilers are released per year. After six weeks, broilers are ready to be released from the farm. We wish to estimate the mean weight of the six-week-old broilers. It is difficult to measure the weight of the whole population of broilers, so a sample of a certain size is weighed. The mean weight of a six-week-old broiler from this sample is found to be 2.0 kg.

Using this single sample mean, how confident can we be that the population mean lies within certain limits (in kilograms) of the sample mean of 2.0 kg? Can we quantify such a confidence that the mean lies within the prescribed interval? Our ability to answer such a question depends on the information available to us, as the calculations to find the confidence interval depend on how much we know about the population in question.

Let's assume that the data satisfies the conditions of the Central Limit Theorem. Using this theorem, we are able to approximate the distribution of the sample mean by a normal distribution.

If the data satisfies the conditions of the Central Limit Theorem, we are able to calculate probabilities of given events related to the real-life problem by the normal approximation, and so to estimate the mean value of the population we're studying.

There are two types of situation that you may be faced with when calculating confidence intervals for the mean of a population:

Case I: The population standard deviation is known;
Case II: The population standard deviation is not known, and we have to estimate it from the sample.

Case I: Confidence interval for μ when σ is known

First we are going to investigate the case when the population standard deviation is known. If a random variable X follows a normal distribution such that $X \sim N(\mu, \sigma^2)$ then, for any value n, $n \in \mathbb{Z}^+$, the sample mean is also normal and $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.

Let's take the level of confidence to be $1 - \alpha$, where $\alpha$ is called the level of significance. Based on the value of the sample mean, we can calculate an interval [a, b] in which the population mean lies. The most commonly used values of $\alpha$ are 10%, 5% and 1% ($\alpha$ = 0.1, 0.05 and 0.01).

Sir Ronald Aylmer Fisher (1890–1962) was an English statistician, evolutionary biologist, geneticist and eugenicist, and is considered "the father of statistics". In the early 20th century he made immense contributions to the development of many standard statistical methods, in particular experimental design, the analysis of variance, likelihood, hypothesis testing and significance levels. He was one of the founders of biostatistics and has also been called one of the most important biologists since Darwin.

We need to find the interval [a, b] such that $P(a \le \mu \le b) = 1 - \alpha$. In other words, we are $100(1 - \alpha)\%$ sure that the population mean $\mu$ lies within the boundaries a and b.
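For a normally distributed sample mean, the symmetric interval that contains probability $1 - \alpha$ can be found numerically. The sketch below uses `statistics.NormalDist.inv_cdf` from the Python standard library for the critical value. The function names are illustrative, and the standard deviation (0.3 kg) and sample size (100) in the usage lines are hypothetical values chosen only to show the call, since the broiler scenario does not state them.

```python
from statistics import NormalDist


def z_critical(alpha):
    """Value z with P(-z <= Z <= z) = 1 - alpha for standard normal Z."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def z_interval(xbar, sigma, n, alpha):
    """Interval for the population mean when sigma is known."""
    half_width = z_critical(alpha) * sigma / n ** 0.5
    return xbar - half_width, xbar + half_width


if __name__ == "__main__":
    for alpha in (0.1, 0.05, 0.01):
        print(alpha, round(z_critical(alpha), 3))
    # Hypothetical inputs: sample mean 2.0 kg, sigma = 0.3 kg, n = 100
    print(z_interval(2.0, 0.3, 100, 0.05))
```

The three critical values printed for α = 0.1, 0.05 and 0.01 are the standard 1.645, 1.960 and 2.576.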
condence inter val α y μ Acceptance region: 1 – α Rejection region: α Rejection 2 α region: 2 x Chapter 3 85 To nd the boundaries need to conver t it to the standard normal μ x z distribution we = σ n α ⎛ 1 Φ ⎜ 1 − ⎞ ⎟ ⎝ 2 = z ⇒ P −z α ≤ ( ⎠ z = 2 2 −z − α ⎞ ≤ z ≤ α ⎜ 1 ) μ x ⎛ P z α 2 ⇒ ≤ α = α α 1 − ⎟ σ 2 ⎜ 2 n ⎝ ⎛ ⇒ P ⎜ ⎜ is in this given case by the the z ≤ z α ⎟ 2 ⎠ n = × z ≤ μ ≤ x + × n z α ⎟ 2 ⎠ n 2 inter val for the 1 − α ⎞ σ − condence mean = value 1 − α of the population formula: σ − σ × z , x ⎤ + × α z α n ⎣ are × α following z ≤ σ x x of μ 2 ⎢ values − n ⎡ The x α ⎝ So ⎞ σ × ⎛ P ⎠ σ − ⎝ ⇒ ⎟ n 2 standard 2 values for each α . ⎥ ⎦ The most common cases of α 2 condence inter vals are listed in the table σ ⎡ z below: σ ⎤ α x α − × ( 0.95 ) − 1.645 x − 1.960 though illustrate the understand 86 Exploring we use their can of = use both 2.576 ⎦ ⎤ 1.645 n σ , x a ⎤ ⎥ n σ x GDC to methods analysis − 2.576 σ , x ⎦ ⎤ + 2.576 ⎥ n obtain for methods ⎦ + 1.960 ⎢ meaning. statistical + n ⎣ Even x ⎢ ⎡ ( 0.995 ) ⎥ σ , σ = 1.960 1 Φ 2 ⎥ ⎣ 0.0 α n ⎡ ( 0.975 ) z ⎢ 1 Φ × n ⎣ 0.05 + σ x = 1.645 x 2 ⎡ 1 Φ , n ⎣ 0. z α ⎢ 2 condence nding n inter vals condence ⎦ easily , inter vals to we will better Jerzy Neyman studied France, and University four of a 83.6 cm, afterwards period from together to a with statistics, Moldovan born wor ked Ukraine, in 1934–38, F isher one of whilst and which = a He England, working Pearson, was with a of a random standard the male Inter pret sample deviation of population your of 00 at Poland, the Neyman made contribution 5 cm. in the men is Find to taken. the countr y , the 90% giving Identif y I 183.6, σ the = = The mean height condence your is inter val answer correct found for to to be the two answer. 5 x statistician. inter vals. countr y places. Method and the was  height decimal In London condence cer tain mean USA. 
contributions Example In Ukraine, the College major theor y in (1894–1981) the mean standard value of the sample and nd er ror. 0.5 100 Identif y α = the signicance and apply σ ⎡ f or mula ⇒ level the 0. [83.6–.645 × 0.5, 83.6 + .645 × 0.5] x , [82.7775, ⇒ [82.8, Method x + ⎤ 1 .645 ⎢ ⎥ n ⎣ ⇒ σ 1 .645 n ⎦ 84.4225] 84.4] II Use a GDC to nd the condence inter val. In the Scratchpad Statistics “T itle” “z 182.778 “CUpper” 184.422 select “x” menu, z “ME” 0.822427 “n” 100. σ 5. Inter val, method Z as σ n: are mean 90% lies When confident within entering population The GDC deviation and then set the Data input ‘Stats’. 5 183.6 100 84.4] C We Inter vals, Inter val x: [82.8, Condence 183.6 1/99 ⇒ choose Inter val” “CLower” uses this the the | 0.9 population ok Cancel 0/99 inter val. statistics standard the that Level: into deviation number of your and statistics not (n) GDC, the to you must sample calculate enter standard the the deviation. sample’ s standard itself. Chapter 3 87 Thus far, from have statistics standard the we which deviation, condence calculate condence the and proceed at doing this Example After as a follows: ∑ rst night, 2.0, 4, raw mean, the we from are you In standard the mean given which data. condence for we be case 2 ., this to must etc.) We will look 5. calculate within worms 0.5, sample the the have 0.8, came 95% sand, x surfaced 2., from 0.4, a sand. 0.9, inter val inter pret your Their 2.2, population condence and on of lengths, 0.9, that the .9, follows mean a measured .2, length of cm, were .6. distribution the with worm answer. = σ 11.3, Identif y = 12 nd the the mean standard value of the sample and er ror. 0.05 Identif y 2 11.3 − 1.960 2 × , 11.3 + 1.960 12 signicance 12 x f or mula Method and apply the σ 1 .960 , x + ⎦ ⎤ 1.960 ⎢ ⎥ n ⎣ [0.7, level σ ⎡ ⎥ ⎣ the ⎤ × ⎢ ⇒ and normal in 2 i ⎡ ⇒ asked you deviation, inter val. mean, calculate may this the only I = = (i.e. sometimes from (i.e. 
calculate 12 α data sample, However, Example that population Method the of inter vals  Assuming x in rainy variance size inter vals statistics to condence represent and inter val). calculate then calculated n ⎦ 2.43] II Use a GDC to nd the condence inter val. Scratchpad First, “CLower” 10.1684 “CUpper” 12.4316 enter list. Then, := in lengths the of the Statistics menu, Sn–1X” Condence 0.636753 “n” 12. σ 2. then z set the Inter vals, Data select input σ 2.43] Frequency We are mean 88 95% lies Exploring confident within this statistical that the population inter val. analysis methods C List: Level: 2 | a 1 0.95 ok z a data choose Inter val, method Inter val List: [0.7, as 1.13159 2/99 ⇒ wor ms 11.3 “x” “ME” “SX the Cancel as and ‘Data’. When planning sample size inter val shows at a how a 99% more x ⇒ −1 ≤ ⇒ x − are 1 you order from the required may to actually obtain mean. sample a The need specic to calculate the condence following example size. of that 2 cm. the for Find sample interior how mean constr uction many beams doesn’t must dier from and be the length sampled the beams’ of so beams that mean we μ − x x Rewrite ≤ 1 Use ≤ 1 μ ≤ − ≤ x + the as an 99% inequality. condence inter val σ ⎡ 1 x σ 2 .576 , + x 2 ⇒ 2 576 × = ⎤ 2 .576 ⎥ n ⎣ 2 = would by ⎢ σ has  cm. 1 ⇒ μ the in manufactured deviation condent ≤ distance calculate beams than μ − sur vey , choose  standard be statistical must certain to Example Wooden a you n ⎦ 1 n ⇒ We n = 4.514 should Exercise 1 2 Given take the = 5, b x = − 11 , c x (1 – the σ = σ of n = 2 the population nd the The sample sample standard deviation size. size must be a positive 10 4.3, 2, = 327, A n = 230 a and for – α )% condence inter val for the α = the mean nd the if: α {321 , 325, 330, 324, 325, 326, 317, 318, 329, 310, 314, 318, 327, 322, 328}, grams (g), came n 5.5. 
elements Find the doesn’t 70, 75, from a normal Determine a 95% is taken value dier clementines are: = 0.1; σ = 0.04 α and σ = = 0.01 ; 4.5 0.05. mean 8 and deviation σ, = buys 2 standard A σ = with c Nayla if: 0.1 {0.1 , 0.12, 0.15, 0.18, 0.13, 0.12, 0.09, 0.11 , 0.13, 0.21 , 0.15}, of mean 0.05; = = sample 9}, (1 A with 8, a b = 7, α population inter val 6, nd {1, the integer. 0.01; 22 and from 5, = = α 4, = α , and α A sample 3, n n, and a A and beams. x , σ , condence deviation 4 1 , sample and 3 = 2854, )% least values σ x Given at Use = 20.376 3B a = ⇒ n 77, and 71, n from so 68, 80, inter val normal we population would population them. 85, with a that the weighs population condence of from 72. Their We standard for the be by weights, may mean 95% mean of weight has a standard condent more shown assume deviation that that than that 2. in these clementines 3.5 g. of clementines. Chapter 3 89 The 5 measurement assumed to of follow a n independent normal random distribution measurements with the mean may value be μ 2 and for In the the mean is a the mean b the sample order length to of σ variance found value better a = 9. of size Given to be the that the 90% condence inter val nd: 14.2,17.4 sample; n understand condence the inter val, inuence we are of the going to sample look at size the on the following investigation. Investigation Let’s consider same mean samples value of dierent = 100. x The sizes, samples all of come which from a have the population 2 with standard Find a a 95% values of i n = 10 ; n = iii n = 50 ; iv n = 150 II: can real-life so it is of you 64 inter val n given conclude unlikely for the mean for the following that: there is about that we Even hardly the sample will if any Justify when is constantly know we relationship sizes? for μ populations population. 
distributed, and interval situations, ver y obser ved size inter vals Condence In = 25; lengths Case σ condence sample ii What b deviation the assume situation between your unknown grow or decay , parameters that it when the answer. is of the A normally we are degree is cer tain the that about population parameters. Thus, we often need of can var y, the parameter to of take like to a Before  illustrated certainty large be working as our sample accurate with we to small estimate for as or that estimates. practical possible large the larger in samples give However, reasons. our it We a is greater dene often would, conclusions, dicult however, standard deviation σ , we a concept that we will use to make our we of If we σ is have unknown a sample we of must size n, rst estimate since we it need of the of the of , we say that we have = n 90 Exploring statistical analysis methods sample and we an have estimation standard − 1 deviation population, data can then var y available calculate − 1 degrees last one must prescribed by whilst the n pieces be previous data. − 1 data in order an obtain the standard of deviation freedom. a example, must to estimation n are For estimate: from to have size calculated when samples. population if the When without population we estimating. n rst v, data them. Investigation degree of to changing estimate freedom, number we require. To estimate the population standard deviation σ we use n estimate of the population standard unbiased 1 = σ s deviation an . n The Central distribution question of still William quality titled name The is not as early the Despite hard of work the called size exceed doesn’t t-distributed deviation X Z by σ of his new a small with At the end he by his formalized. of of a numbers; 1935, brewer y continued paper strictly result, he the a samples random Guinness venture, employer As results of in published t-distribution method. the this the work of smaller t-distribution variable estimate − to in London. 
publish samples, after we Gosset) use t-distribution when sample of the T, we approximate standard the standard deviation, s: μ , = s n n the variable T freedom and Student’s t-distribution distribution the y-axis the y-axis. normal in of statistician papers. control the distributed? he statistical but the 30. random the ⇒ T use charge distribution σ Let’s form empirical a because quality Monte-Carlo involved X = where the as 1908, scientic impor tant and Student’s μ − the take any In approximate distribution, worked ‘Student’ can samples brewer y. ensure discovered to (commonly a the to normal 1937) a we papers. approximate For as used a that smaller − of publishing mathematical Ireland the statistical (1876 well-known Gosset by are pseudonym from application left How depar tment initially of suggests sample Gosset under beer . Gosset To Sealy was combination an large employees t-test brewed a Theorem remains: control ‘t-test’ forbade Limit we graph. and a follows write T GDC They to distribution ∼ t (n graphs achieving look and a are t-distribution are ver y similar bell-shaped maximum at the some with n − 1 degrees of − 1) . to the cur ves; value when probability normal symmetrical the density t-distributions standard with cur ve about crosses function dierent of the standard degrees of freedom. Chapter 3 91 The for graphs the show larger degrees t-distribution insignicant that better as this of distribution freedom (i.e. approximates under the normal is aligned for the larger data with the standard samples), since the tail but areas for normal smaller are a bit distribution samples larger and the not as distribution. 
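The comparison described above can also be checked numerically. The following is a minimal sketch, assuming only the standard formula for the t-distribution density; the helper name `t_pdf` is hypothetical and simply mirrors the GDC function tpdf. It evaluates the densities at the peak (x = 0) and in the tail (x = 3) for 1, 5 and 10 degrees of freedom.

```python
import math
from statistics import NormalDist


def t_pdf(x: float, nu: int) -> float:
    """Density of Student's t-distribution with nu degrees of freedom."""
    coeff = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return coeff * (1 + x * x / nu) ** (-(nu + 1) / 2)


standard_normal = NormalDist(0, 1)

# Compare the peak (x = 0) and a tail point (x = 3) for several degrees of freedom.
for x in (0.0, 3.0):
    parts = [f"normal = {standard_normal.pdf(x):.4f}"]
    parts += [f"t({nu}) = {t_pdf(x, nu):.4f}" for nu in (1, 5, 10)]
    print(f"x = {x}: " + ", ".join(parts))
```

As the number of degrees of freedom grows, the peak rises towards the normal peak and the tail density shrinks towards the normal tail, which is why small samples need the wider t-based intervals.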
[Figure: GDC graph of the standard normal density f1(x) = normpdf(x, 0, 1) together with the t-distribution densities f2(x) = tpdf(x, 1), f3(x) = tpdf(x, 5) and f4(x) = tpdf(x, 10).]

Since the standard normal distribution and t-distribution curves are similar in shape, we can adapt the formula used to find the confidence interval of the mean of a normal distribution to find a formula for the confidence interval of the mean of a t-distribution.

Before the introduction of GDC technology, tables were used to find the critical values of the z-distribution and t-distribution. Here is an example of a t-distribution table.

p = P(X ≤ t)
ν    0.9      0.95     0.975     0.99      0.995     0.9995
1    3.078    6.314    12.706    31.821    63.657    636.619
2    1.886    2.920    4.303     6.965     9.925     31.599
3    1.638    2.353    3.182     4.541     5.841     12.924
4    1.533    2.132    2.776     3.747     4.604     8.610
5    1.476    2.015    2.571     3.365     4.032     6.869

Nowadays calculators have a built-in "invt" feature, i.e. a function with two variables that calculates the critical t-value:

$t_c = \mathrm{invt}\left(1 - \frac{\alpha}{2},\ \nu\right)$

So the confidence interval can be calculated by using the formula:

$\left[\bar{x} - t_c \times \frac{s}{\sqrt{n}},\ \bar{x} + t_c \times \frac{s}{\sqrt{n}}\right]$

Regardless of the fact that the t-distribution is very close to the normal distribution for a large sample size, calculators can give us critical t-values for any sample size we choose. Therefore, in this book we are going to use the t-distribution whenever the population standard deviation is unknown.

Example 7

In a sample of five 100 g chocolates the mean value of the energy level was found to be $\bar{x} = 2540$ kJ, whilst the estimate for the population standard deviation was $s = 120$ kJ. Calculate the 99% confidence interval for the mean value of the energy level of 100 g of chocolates.

Method I

$n = 5 \Rightarrow \nu = n - 1 = 4 \Rightarrow t_c = 4.604$ (use a GDC to find the critical t-value).
Use the formula $\bar{x} \pm t_c \times \frac{s}{\sqrt{n}}$:

$2540 \pm 4.604 \times \frac{120}{\sqrt{5}} \Rightarrow \mu \in [2293, 2787]$ kJ

GDC check: invt(0.995, 4) = 4.60409, which gives the lower limit 2292.92 and the upper limit 2787.08.

Method II

Use the GDC t-interval stats input method and input the given statistics: x̄ = 2540, Sx = 120, n = 5, C Level = 0.99.

GDC output: "Title" t Interval; "CLower" 2292.92; "CUpper" 2787.08; "x̄" 2540; "ME" 247.082; "df" 4; "SX := Sn–1X" 120; "n" 5.

$\Rightarrow \mu \in [2293, 2787]$ kJ, given correctly to the nearest kJ.

Let's try to see how to solve the problem of the mean weight of broilers.

Example 8

A large poultry farm grows broilers. A sample of 20 broilers is taken. The mean weight of the sample was found to be 2.10 kg. It is assumed that the broilers come from a normal population with the unbiased estimate of the standard deviation of 0.3 kg. Calculate the 90% confidence interval for the mean weight of a broiler.

Method I

$n = 20 \Rightarrow \nu = n - 1 = 19 \Rightarrow t_c = 1.729$ (use a calculator to find the t-value).

Use the formula $\bar{x} \pm t_c \times \frac{s}{\sqrt{n}}$:

$2.10 \pm 1.729 \times \frac{0.3}{\sqrt{20}} \Rightarrow \mu \in [1.98, 2.22]$ kg

GDC check: invt(0.95, 19) = 1.72913, which gives the lower limit 1.98401 and the upper limit 2.21599.

Method II

Use the GDC t-interval stats input method and input the given statistics: x̄ = 2.10, Sx = 0.3, n = 20, C Level = 0.9.

GDC output: "Title" t Interval; "CLower" 1.98401; "CUpper" 2.21599; "x̄" 2.1; "ME" 0.115994; "df" 19; "SX := Sn–1X" 0.3; "n" 20.

$\Rightarrow \mu \in [1.98, 2.22]$ kg

When the actual data is given, we need to use the list and statistic features on the GDC, as shown in the following example.

Example 9

In a box of six eggs, the weights (measured in grams) of each egg were: 62, 63, 65, 62, 66, 66. Calculate the 95% confidence interval for the mean weight of an egg.

Method I

Find the mean value of the sample: $\bar{x} = \frac{\sum x_i}{n} \Rightarrow \bar{x} = \frac{384}{6} = 64$

Next, find the unbiased estimate of the population standard deviation.
6 2 2 x ∑ s n ( ) x 24594 i = ⇒ n s 24576 = =1 1 897 standard n = 6 ⇒ = 5 ⇒ t = 2 of the population deviation. = n and − 1 then we 571 look 1 64 estimate 5 at the tables or use a calculator to nd 897 ± × 2 571 the t-value. 6 s Use μ ∈ ⇒ [ 62, the f or mula ± x × t c 66 ] g n Method II Use *Unsaved 1.1 “T itle” “t where Inter val” “CLower” 62.0088 “CUpper” 65.9912 the GDC data a method list. 64. t Inter val 5. “df” := in input 1.99116 “ME” “SX stored data *Unsaved 1.1 “x” is t-inter val List: | a List: 1 1.89737 Sn–1X” Frequency 6. “n” C Level: 0.95 ok Cancel 3/99 {62, 63, 65, 62, 66, 66} μ ∈ ⇒ [ 62, 66 ] g 2/99 Example A 0 random sample of ten inde pendent obser vations are taken 10 from a nor mal 10 2 population. The sample gave the x results = 527 and x i i =1 Find a 90% ∑ x condence x inter val for the = 28,157 . i i =1 population mean. 527 i = ⇒ x = n = First 52.7 we must nd the unbiased estimate of 10 the mean of the sample and then the unbiased 2 2 ∑ s x n ( x ) estimate of the population standard deviation. i = n 1 2 28157 − 10 ⇒ s = × 52 7 = 6 5328 Now, use the GDC t-inter val stats input method 9 and input the given statistics. *Unsaved 1.1 “T itle” “t Inter val” “CLower” 48.9131 “CUpper” 56.4869 “x” *Unsaved 1.1 t Inter val x: 52.7 “ME” 3.78694 “df” 9. Sx: n: “SX := Sn–1X” “n” 6.5328 10. C Level: 52.7 6.5328 10 0.9 ok Cancel 1/99 0/99 μ ∈ [ 48.9, 56.5 ] Chapter 3 95 Exercise 1 Given the inter val 2 3C values for = 15, the a x b x = − 23, c x = 3478, Given the s the mean = 1.2, s = s set x , = A n = 15 5.8, n 429, of s, n, α , and nd (1 – α )% condence if: = n data α and 32 = α and = 310 nd 0.1; and the (1 0.01; = α = 0.05 . α )% – condence inter val for if: = {1, b A = {0.1, 0.12, 0.15, 0.18, 0.13, 0. 12, 0. 09, 0. 11, 0. 13, 0. 21, 0. 15} = 3, Nadim W e 5, 6, 7, 8, 9} and = and α = . 
0.1 buys 12 mandarins given in grams may assume Determine a that 99% (g): the and 66, weighs 70, 75, mandarins condence them 84, 90, came interval for with 77, from the the 71, a following 68, 80, normal mean 85, 72, 63. population. weight of the mandarins. 4 Eight independent population with random the measurements unbiased estimate of are taken from population a normal variance 2 s to 5 = 2.25. be Given [12.345, the mean the condence box a with cod market at are standard a Find bought b Find a bought c value 15 is market 95% at on condence is at a sh market. Weights the mean is found and the of The cod mean bought unbiased weight at the estimate of the 95 g. inter val for the mean weight of cod inter val for the mean weight of cod market. market. the signicance inter val. statistical for inter val. 536 g. condence the the distributed is inter val sample; of condence the 99% at the bought normally Comment Exploring cod the of condence nd: level deviation a the 14.355 ] b of 96 that a A = 0 321 , 325, 330, 324, 325, 326, 317, 318, 329, 310, 314, 318, 327, 322, 328 and results 4, 0.01; A A 2, α a c 3 of mean analysis methods level and the length of the 05; Condence Matched interval pairs population. matched are In of product, two satisfy the Matched when be in from in one samples in before and is sor t after of the on used same is the We or same might compare the compare not take the two product two is samples means dierent of a group of the samples variable against group of measure treatment, to a circumstances. would used same whether whether to content We medical the and the production. dierent iron analysis and would determine An people the same these form example whether suer sets not both of matched or might group two our who itself a pair. new dr ug eective. Alter natively , of be from know factories. factor y , of taken suppose determine condition. 
measurements This to pairs to example, each two the want dierent also in samples we standards can measuring blood two order same pairs For from measured a study , dier. manufactured matched dierent our pairs for people might they same we who follow measure have would Matched experimental of of a group one are and group (such the drug control the group. This the experimental the effects of the used on a as a lot gender , on to the in age, sets we We match of weight, without the a of study weight. diet to a group We see whether results from experiments. might the etc.). effect control monitor and other experimental minimize which, two lose the to the pair. elements the to after biological group. analysis order and population, the had and pairs in matched control done group, a diet before goal, our with has is matched cer tain their drug characteristics impact a use weights form pairs effect elements their achieved group the might We would group, of determine groups: compare group all based then we the on cer tain with inuences might an compare compared other group, T o two the on ascribe to dr ug. Chapter 3 97 Example A group two of new days iron  patients dr ugs: when tests the at dr ug a A hospital and inuence were of conducted Patient dr ug dr ug and suer B. A the from First, had chronic they wor n results b c d e deciency . treated they g/dl ) (in a were o, iron with were are f dr ug treated shown in g h Dr ug A 60 58 47 80 35 55 53 40 Dr ug B 63 55 42 76 29 6 5 4 a Find b Calculate the dierences the 90% between the condence results inter val obtained of the by mean dr ug of A They treated A and, after with dr ug B. the and are following dr ug dierences in a with few Ser um table: B. par t a a d = −3 A − B 3 5 4 6 −6 2 − Subtract the results of drug B from those i of Method b ∑ d d 10 i ⇒ d = = 1 n Calculate the mean of di erences. Calculate the unbiased 25 8 2 2 ∑ d n ( estimate of the ) d 136 i = ⇒ s d 12 = 5 = 4 200 standard deviation. 
d n n A. I = s drug = 8 ⇒ 1 = 7 7 ⇒ t = 1 895 Find the degrees of freedom and use a c GDC to nd the characteristic value of t. 4.200 1.25 ± s × 1.895 d Use the f or mula d ± × 8 t c n ⇒ μ ∈ d Method [ − 1.564, 4.064 ] II Store data Apply 1.1 the into lists and t-condence nd inter val *Unsaved 1.2 the “T itle” “t di erence –1.56353 “CUpper” 4.06353 1.1 1.25 *Unsaved 1.2 A “x” list. Inter val” “CLower” B a C D b 2.81353 “ME” =a[]-b[] 7. “df” “SX := Sn–1X” 4.20034 1 60 63 –3 58 55 3 47 42 5 80 76 4 35 29 6 8. “n” 2 3 1/99 4 5 ⇒ μ ∈ d 98 [ − 1.564, Exploring 4.064 ] statistical analysis C methods =a b the di erences. with data in In the and previous positive there such is a a reasonable Given and we can and the we between do 90% condence cannot the another conclusive say at eects this of dr ugs hypothesis results. You inter val level test will of A that lear n contains both condence and can B. Sometimes give more negative that us about in more this in 6. Exercise 1 so dierence case Example example, values, 3D two nd sets the (1 of data – )% in the following condence tables, inter val for calculate the mean the of dierences the dierences: a Set A 5 7 23 7 8 20 9 6 6 2 Set B 8 5 23 9 5 8 20 5 8 22 α = 0 01; b Set C 98 03 02 88 96 05 0 93 02 06 99 85 Set D 2 5 04 96 0 2 23 02 08 09  03 α = 0 ; 1 c Set E 0.55 0.7 0.66 0.58 0.82 0.77 0.9 .02 Set F 0.62 0.68 0.74 0.69 0.78 0.65 0.8 0.95 α 2 Bob = and levels of 0 Rick (Hgb) that They 05 are in the laborator y same oxygen-transpor t obtained Blood two sample the blood technicians samples of metalloprotein following results in is 12 and they male measure patients. between 138 and haemoglobin The normal level 175 g/l. 
g/l:  2 3 4 5 6 7 8 9 0  2 Bob 44 53 70 83 25 95 48 77 60 55 70 35 Rick 4 6 73 74 9 04 35 75 64 58 67 42 a Find b Calculate the obtained dierences a by 95% Bob between condence and the results inter val of obtained the mean by of Bob and Rick. dierences in results Rick. Chapter 3 99 3.3 Hypothesis Setting In up order not yet that W e a and to testing formulate been new wish proved dr ug to set to up parameter. The only for and such our In general, of testing that the hypothesis The stated be test, ght test a always the study we are of are be essential usually this a a Higher mean topic to a or us to with we each been a the has a but been current whether about studies. proposed claim this has made dr ug. claim is tr ue. population includes normal oriented hypotheses statistical than syllabus random be contradictor y provides better determine Level will discussing has stated claim of of suppose works with part theor y example, hypothesis accepted is an infection star ts directly hypothesis is For population hypotheses can a tr ue. Mathematics when statements such to and testing testing hypotheses help Hypothesis as testing distribution, thus. consider other. arguments as two The to process why a cer tain rejected. called the null hypothesis and we denote it by H 0 and the alternative hypothesis states the opposite and is denoted by H 1 Let’s consider again the example of broilers y from f(x) section 3.2. Suppose that data from previous years x suggests that population weight of the is mean 2 kg. broilers We this weight wish of to year, the broiler estimate and to do the so Acceptance mean we take Rejection a sample found to of be a cer tain size. The sample mean region null Rejection 2.0 kg. x hypothesis always states no value Critical that the mean weight of value change, Two-tail i.e. region is Critical The region test broilers y this year is also 2 kg. 
We write this as $H_0: \mu = x_0$.

The alternative hypothesis can have different statements depending on the type of test we wish to perform. There are three types of hypothesis testing and three types of alternative hypotheses:

i Two-tail test ($H_1: \mu \neq x_0$). The mean weight is not equal to 2 kg.
ii One-tail upper test ($H_1: \mu > x_0$). The mean weight is more than 2 kg.
iii One-tail lower test ($H_1: \mu < x_0$). The mean weight is less than 2 kg.

[Figures: density curves showing the acceptance region, the rejection region(s) and the critical value(s) for the two-tail test, the one-tail upper test and the one-tail lower test.]

We must also decide which significance level, α, we require in order to conclude that a certain hypothesis is valid, with (1 − α)% certainty. The most common significance levels are 1%, 5% and 10%. The significance level is directly related to the confidence level used in calculating confidence intervals (see the previous section), so if we are 95% sure that the parameter lies within the confidence interval, we test at the 5% significance level.

As when calculating confidence intervals, we use z-statistics when the variance is known and t-statistics when the variance is unknown (regardless of the sample size).

There are two ways of making the decision in a hypothesis test.

Critical value: the calculated z- or t-value is compared with the critical value found from the significance level. If the calculated value lies outside the acceptance region we reject the null hypothesis; otherwise we do not reject it.

p-value: the so-called p-value is the probability, given that the null hypothesis is true, of obtaining a value at least as extreme as the one calculated from the sample. If the p-value is smaller than the significance level, the calculated value lies in the rejection region and we reject the null hypothesis. If the p-value is greater than the significance level we cannot say 'we accept the null hypothesis', but rather that we 'fail to reject' it, or simply that there is not sufficient evidence to reject the null hypothesis.

There are four steps in hypothesis testing:

Step 1 State the null and alternative hypothesis;
Step 2 Set the criteria for a decision;
Step 3 Calculate the statistics;
Step 4 Decide upon the calculated statistics and the decision criteria.

Hypothesis testing for μ when the variance is known

As with calculating confidence intervals, when the variance is known we are going to use z-statistics. Let's look at an example for which we have already calculated the confidence interval (see Example 4).

Example 12

In a certain country it is believed that the mean height of the male population is 182 cm and the standard deviation is 5 cm. A random sample of 100 men was taken from the population and the mean height was found to be 183.6 cm.

a State the null and alternative hypotheses;
b Use a two-tail test at the 10% significance level to decide whether or not the claim is true.

a $H_0$: "The mean height is 182 cm" ($\mu = 182$)
$H_1$: "The mean height is not 182 cm" ($\mu \neq 182$)
(Step 1: State the null hypothesis that confirms the claim. State the alternative hypothesis.)

b Method I

$\mu = 182,\ \sigma_{\bar{x}} = \frac{5}{\sqrt{100}} = 0.5,\ \bar{x} = 183.6$

$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} \Rightarrow z = \frac{183.6 - 182}{0.5} = 3.2$ (Step 2: Calculate the z-value.)

Either

$\alpha = 0.1 \Rightarrow z_{\alpha/2} = \Phi^{-1}(1 - 0.05) = 1.645$ (Step 3: Find the z-critical value.)

$3.2 > 1.645 \Rightarrow z > z_{\alpha/2}$ (Step 4: Compare the z-critical value with the calculated value and make a decision.)

We reject the null hypothesis.
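The critical-value comparison for the height example can be sketched in a few lines. This is an illustration only: the function name `two_tail_z_test` is hypothetical, and the sketch relies on Python's standard `statistics.NormalDist` rather than on a GDC.

```python
import math
from statistics import NormalDist


def two_tail_z_test(mu0, sigma, sample_mean, n, alpha):
    """Return (z, critical value, reject?) for a two-tail z-test with known sigma."""
    standard_error = sigma / math.sqrt(n)             # sigma of the sample mean
    z = (sample_mean - mu0) / standard_error          # calculated z-value
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point
    return z, z_critical, abs(z) > z_critical


# Values taken from the height example: mu0 = 182, sigma = 5, n = 100, mean = 183.6.
z, z_critical, reject = two_tail_z_test(mu0=182, sigma=5, sample_mean=183.6, n=100, alpha=0.1)
print(f"z = {z:.3f}, critical value = {z_critical:.3f}, reject H0: {reject}")
```

Taking the absolute value of z lets one comparison cover both rejection regions of the two-tail test.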
Or

$P\big((180.4 > \bar{X}) \text{ or } (\bar{X} > 183.6) \mid \mu = 182\big) = 1 - P(180.4 < \bar{X} < 183.6 \mid \mu = 182) = 1 - P(-3.2 < Z < 3.2) = 0.00137$ (Step 3: For the corresponding calculated z-value find the p-value.)

0.00137 < 0.1, so we reject the null hypothesis at the 10% significance level. (Step 4: Compare the p-value with the significance level and make a decision.)

Method II

Use the z-test statistics feature on a GDC with µ0 = 182, σ = 5, x̄ = 183.6, n = 100 and Alternate Hyp Ha: µ ≠ µ0.

GDC output: "Title" z Test; "Alternate Hyp" µ ≠ µ0; "z" 3.2; "PVal" 0.001374; "x̄" 183.6; "n" 100; "σ" 5.

0.00137 < 0.1, so we reject the null hypothesis. (Compare the p-value with the significance level and make a decision.)

Due to the symmetrical properties of a two-tail test, the acceptance region is the confidence interval. This example highlights some important results:

● We found the 90% confidence interval for the population mean to be [182.8, 184.4] and we see that 182 is not in this interval. We therefore reject the null hypothesis.
● Your calculator gives you the calculated z-value "z" = 3.2, but it is much simpler to make a decision based upon the p-value when it is available.
● When we compare the p-value with each of the common significance levels (1%, 5%, 10%), we conclude that we reject the null hypothesis upon all three, since 0.00137 is smaller than every one of them.

Let's use another example for which we have already calculated the confidence interval (in Example 5).

Example 13

After a rainy night, 12 worms surfaced on the sand. Their lengths, measured in cm, were as follows: 12.0, 11.1, 10.5, 10.8, 10.4, 12.1, 10.9, 12.2, 10.9, 11.9, 11.2, 11.6. It is known that the worms came from a population that follows a normal distribution with the mean value of 10.5 cm and standard deviation of 2 cm. There is a belief that the worms are growing larger.

a State the null and alternative hypotheses;
b Use a one-tail upper test at the 5% significance level to decide whether or not the claim is true.

a $H_0$: "The mean length is 10.5 cm" ($\mu = 10.5$)
$H_1$: "The mean length is larger than 10.5 cm" ($\mu > 10.5$)
(Step 1: State the null hypothesis that confirms the claim. State the alternative hypothesis.)

b Method I

$\mu = 10.5,\ \sigma_{\bar{x}} = \frac{2}{\sqrt{12}} = 0.577,\ \bar{x} = 11.3$

$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} \Rightarrow z = \frac{11.3 - 10.5}{0.577} = 1.38564$ (Step 2: Calculate the z-value.)

Either

$\alpha = 0.05 \Rightarrow z_c = \Phi^{-1}(1 - 0.05) = 1.645$ (Step 3: Find the z-critical value.)

$1.38564 < 1.645 \Rightarrow z < z_c$ (Step 4: Compare the z-critical value with the calculated value and make a decision.)

We have no sufficient evidence to reject the null hypothesis.

Or

$P(\bar{X} > 11.3 \mid \mu = 10.5) = P(Z > 1.38564) = 0.0829$ (Step 3: For the corresponding calculated z-value find the p-value.)
σ ok Cancel 1/99 0/99 0.0829 > 0.05 we have no sucient evidence to Compare reject the null hypothesis at the 5% signicance level Condence through inter val region When one-tail is for we two upon levels the Exercise is the is 5%, we and 0.0829 > or to the lower), whereas Example 0%) level no 0.05 compared (upper p-value an and make results since a a believed that with a random the mean obtained condence acce ptance or rejection it’s 3 clear (0.0829 sucient and with 0.0829 the that < we 0.), reject whilst evidence > three to common the upon reject null the the hypothesis remaining null 0.0). sample μ value of n and obser vations the standard is taken from deviation σ the . 0 The sample : μ H = μ 0 = the : μ H 0 n a , has mean μ ≠ 1 μ 20, value test the of − x . Given claim at the 2 α = the α hypotheses signicance level if: 0 = 10, x = 12, σ = and 0.1; 0 n b = μ 25, = 2, = 1.9, σ x = 0.3 and α = 0.05; 0 n c = μ 119, = −235, x = σ −238, = 12.8 and α = 0.01 0 2 It is believed population that with a random the mean sample μ value of n and obser vations the standard is taken from deviation σ . 0 The sample : μ H = μ 0 a has , H 0 n = 20, the : μ < mean μ 1 μ value test the of − x . claim Given at the the α hypotheses signicance 0 = 10, x = 9, σ = 4 and α = 0.1; 0 b n = 50, μ = 21.4, x = 21.2, σ = 0.75 and α = 0.01; 0 c n = 119, μ = −235, x = − 238 , 0 104 Exploring statistical analysis the decision. 3E population with symmetric. in have be mean, not signicance levels (since the p-value (%, 0% cannot testing about testing compare signicance It hypothesis one-tail hypothesis 1 analysis symmetric signicance only inter val the level. methods σ = 12.8 and α = 0.01. level if: the signicance 3 It is the believed that population a random with the sample mean of n μ value obser vations and the is taken standard from deviation 0 . The : μ H sample = μ 0 a , = the μ > 1 μ 20, : μ H 0 n has mean test value the − x . 
of claim at Given the the hypotheses signicance α level if: 0 = 10, = 12, σ x = 5 α and = 0.05 ; 0 b n = μ 40, = 27.3, x −73, x 28.0, σ = = 3.6 and α = 0.1; 0 c n = μ 92, = − 71.6, σ = = 3.72 α and = 0.01 0 4 After a Their 25, rainy weights, 30, came 35, 27, from a mean value belief that a State b Use the an A eight 1.35, is 6 is State b T est An the the not the a of the following 288, 293, 301, 298, 299, 289, 304, is a State b Test a belief the at Hypothesis null the are at is level A box (in 302, 20 ml) 290, a the alter native for μ fat with 25, forest. 31, snails 35, in of that 3.5 g. 28, the forest with There 38, is the a population. level to decide in bottles 100 of a of 0.3 g. were is ml a In lactose a sample measured: belief more that free 1.43, the of 1.52, company fat. hypotheses. whether apple or juice. volume bottles 288, signicance from the distribution signicance fat were and testing of 288, that 10% of level of that normal not 1% pure 285, 302, a 22, deviation There drink have known deviation levels 100% follows: in tr ue. alter native claim is har vested hypotheses. the 1.55. free volumes 295, There 1.45, It as follows signicance 7.3 ml. 28. standard lactose they were snails the been standard claim produces that that the following 5% g, 33, test that and have alter native the 1.46, null the deviation and and in 34, har vested with eco-farm bottles 26 g 1.42, at 30, claims producing a in or drinks 1.38, 29, appropriate 1.4 g snails measured null company drink 15 population of the whether 5 period, of is not the They 300 ml taken claim package and for is a true. juice standard inspection and measured: 290, 300, 305, 300, 293, contain less 305. volume than stated. hypotheses. level when whether is or not the claim is tr ue. unknown We As the with calculations standard t-statistics, of deviation irrespective condence when of the inter vals, hypothesis sample if testing size. 
we we do not use know have found condence this inter val instance Example already the for in 7. Chapter 3 105 Example A  chocolate In a sample of ve 2540 kJ, be x a State b Test a H = company the at 00 g whilst null the claims and % that the chocolates the estimate alter native signicance energy the for level mean in value population cer tain of the 00 g energy standard chocolates level deviation was was s is 2500 kJ. found to = 120 kJ. hypotheses; level whether : “The energy level is 2500 kJ” : “The energy level is not (μ or not the = 2500 ) claim Step 1: is tr ue. State the null hypothesis that 0 conr ms H 2500 kJ” (μ ≠ the claim. State the alter native 2500 )  hypothesis Method b x t a two-tail test. I μ − f or 2540 = ⇒ t − 2500 = = s 0.7454 120 n Step 2: Calculate Step 3: Find Step 4: Compare the t-value. 5 Either n = 5 ⇒  = 4,  = 0. 01 ⇒ t = 4. 604 the t-critical value. c 0.7454 < 4.604 ⇒ t < t c the We have no hypothesis sucient at the X ) or % evidence to signicance reject the calculated the value t-critical and value make a with decision. null level. Or P (( 2460 > (X > μ 2540 ) = 2500 ) Step 3: For the cor responding = 1 − P ( 2460 < x <  1  P ( 0.74536  0.497 > reject the Method 0.0 we null 2540 t  have μ = hypothesis  the evidence % to signicant Step level. 4: nd the p-value. Compare signicant Use II t-test level stat the and p-value make f eature on a a with the decision. GDC. *Unsaved 1.1 *Unsaved 1.1 value 0.497 sucient at t 2500 ) 0.74536 ) no calculated t Test “T itle” “ Alternate “t Test” Hyp” “t” “µ ≠ µ0” µ0: 2500 x: 2540 0.745356 0.497471 “PVal” Sx: “df” 4. “x” 2540. n: “SX := Sn–1X” 5. “n” 120 | 5 120. Alternate Hyp: Ha: µ ≠ µ0 ok Cancel 1/99 0/99 0.497 > 0.0 reject the we null have no hypothesis sucient at the evidence % to signicant level. Compare the signicance 106 Exploring statistical analysis methods p-value level and with the make a decision. 
Due to the symmetrical properties of a two-tail test, the acceptance region is the confidence interval. This example highlights some important results:

● We found the 99% confidence interval for the energy level in 100 g chocolates to be [2293, 2787] and we see that 2500 lies within this interval. Therefore, we do not have enough evidence to reject the null hypothesis.
● Your calculator gives you the calculated t-value "t" = 0.745, but it is much simpler to make a decision based upon the p-value.
● When we compare the p-value with each of the common significance levels (1%, 5%, and 10%), we conclude that we have no evidence to reject the null hypothesis upon each of these three significance levels, since 0.497 is larger than every one of them.

In the following example we look at the problem we studied when the confidence interval was calculated in Example 9.

Example 15

In a box of six eggs, the weights (measured in grams) of each egg were as follows: 62, 63, 65, 62, 66, 66. The farmer claims that the mean weight of one of their eggs is 66 g. Test at the 5% significance level whether the eggs in the boxes have a mean weight less than 66 g.

$H_0$: "The mean weight is 66 g" ($\mu = 66$)
$H_1$: "The mean weight is less than 66 g" ($\mu < 66$)
(Step 1: State the null hypothesis that confirms the claim. State the alternative hypothesis for a one-tail lower test.)

Method I

$\bar{x} = 64,\ s = 1.897$

$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \Rightarrow t = \frac{64 - 66}{1.897 / \sqrt{6}} = -2.582$ (Step 2: Calculate the t-value.)

Either

$n = 6 \Rightarrow \nu = 5,\ \alpha = 0.05 \Rightarrow t_c = -2.015$ (Step 3: Find the t-critical value.)

$-2.582 < -2.015 \Rightarrow t < t_c$ (Step 4: Compare the t-critical value with the calculated value. Notice that the calculated value t lies outside the acceptance region, so we make a decision.)

We reject the null hypothesis at the 5% significance level.

Or

$P(\bar{X} < 64 \mid \mu = 66) = P(T < -2.582) = 0.02466$ (Step 3: For the corresponding calculated t-value find the p-value.)

0.02466 < 0.05, therefore we reject the null hypothesis at the 5% significance level.
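When the raw data is available, the sample statistics in Method I do not have to be worked out by hand. The following minimal sketch uses Python's `statistics.stdev`, which divides by n − 1 and therefore matches the unbiased estimate s used here. The critical value −2.015 is taken from the worked solution rather than computed, and the variable names are illustrative.

```python
import math
from statistics import mean, stdev

weights = [62, 63, 65, 62, 66, 66]  # egg weights in grams, from the example
mu0 = 66                            # claimed mean weight
t_critical = -2.015                 # lower-tail critical value for nu = 5, alpha = 0.05 (from the text)

n = len(weights)
x_bar = mean(weights)               # sample mean
s = stdev(weights)                  # unbiased estimate (divides by n - 1)
t = (x_bar - mu0) / (s / math.sqrt(n))

# One-tail lower test: reject H0 when t falls below the critical value.
reject = t < t_critical
print(f"mean = {x_bar}, s = {s:.5f}, t = {t:.5f}, reject H0: {reject}")
```

For a one-tail upper test the comparison would be reversed and the positive critical value used instead.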
null Step 4: Compare signicance level the p-value and make with a the decision. Chapter 3 107 Method 1.1 II Use *Unsaved 1.2 “T itle” “ Alternate the “t Test” Hyp” “µ “t” < t-test stat on a GDC p-value. µ0” –2.58199 1.1 “PVal” f eature *Unsaved 1.2 0.024657 t Test “SX “df” 5. “x” 64. := µ0: 1.89737 Sn–1X” List: 66 a 6. “n” Frequency Alternate List: Hyp: 1 Ha: µ < µ0 1/99 ok Cancel 0/99 0.024657 < hypothesis 5% As 0.05 that the signicance with cannot symmetric about 1 A (upper the one-tail Exercise is the null 66 g inter val compared t-testing for reject weight condence be hypothesis region mean we at the level. z-testing, variable therefore to or mean, t-testing the lower), whereas is not analysis results since an for a t-distributed obtained a through condence acceptance or one-tail inter val is rejection symmetric. 3F random sample population with of the n obser vations mean value μ . is taken The from sample the has the mean 0 value x of deviation and s. an unbiased Given the estimate of : μ H hypotheses the population μ = 0 claim n a at = the μ 10, signicance α = 5, x = 4.8, level s , H 0 : μ ≠ standard μ 1 test 0 if: = 1.3 α and = 0.01; 0 n b = μ 25, = 2, x = 1.9, s = 0.3 α and = 0.1; 0 n c μ = 7, = −36, x = −35.3, s = 0.523 α and = 0.05 0 2 It is believed from the that a random population with sample the of mean n obser vations value μ . The is taken sample 0 has a mean population : μ H μ = 0 n x of standard , H 0 level a value : μ the deviation μ < and 1 test unbiased is the s. estimate Given claim at the the α of hypotheses signicance 0 if: = 30, μ = 15, x = 14.2, s = 2.2 and α = 0.05 ; 0 b n = 10, μ = 122, x = 119.8, s = 2.32 and α = 0.01; 0 c n = 6, μ = 627, x = 622.8, s = 12.6 0 108 Exploring statistical analysis methods and α the = 0 .1 . the to nd It 3 is the believed that population a random with the sample mean of n obser vations value . The estimate of sample is taken has the from mean 0 value x of deviation and s. 
The sample has the mean value $\bar{x}$, and $s$ is the unbiased estimate of the population standard deviation. Given the hypotheses $H_0: \mu = \mu_0$, $H_1: \mu > \mu_0$, test the claim at the significance level $\alpha$ if:
a $n = 20$, $\mu_0 = 1$, $\bar{x} = 0.95$, $s = 0.335$ and $\alpha = 0.1$;
b $n = 8$, $\mu_0 = 25$, $\bar{x} = 26.4$, $s = 1.12$ and $\alpha = 0.05$;
c $n = 15$, $\mu_0 = 754$, $\bar{x} = 758.6$, $s = 14.2$ and $\alpha = 0.01$.

4 An ice-cream factory claims that the average volume of a particular ice-cream product is 120 ml. There are eight ice-creams in a box and their volumes in ml are as follows: 119, 123, 121, 120, 118, 116, 123, 122. State the hypotheses and test at the 1% significance level whether or not the factory advertised the correct volume.

5 A manufacturer claims that the life expectancy of some LED lamp is 30,000 hours. A random sample of six lamps is tested and the following data is obtained: 29,500; 28,350; 30,300; 30,250; 29,350; 29,600. State the hypotheses and test at the 10% significance level whether or not the manufacturer claims a longer life expectancy for LED lamps.

Note: In numbers with many digits a comma is used here after every three digits, making the number easier to read. In some other countries a space is used instead of a comma, and in some computer programming languages an underscore is used.

Significance testing for matched pairs

In an earlier example we studied how to use confidence intervals when data is obtained in matched pairs. In Example 16 we illustrate how to use significance testing to compare the same data.

Example 16

A hospital conducted a study of the influence of two drugs: drug A and drug B. A group of eight patients suffer from chronic iron deficiency. First, they are treated with drug A for two days and, after drug A has worn off, they are treated with drug B. The results of serum iron (in μg/dl) are given in the following table:

| Patient | a | b | c | d | e | f | g | h |
|---|---|---|---|---|---|---|---|---|
| Drug A | 60 | 58 | 47 | 80 | 35 | 55 | 53 | 40 |
| Drug B | 63 | 55 | 42 | 76 | 29 | 61 | 51 | 41 |

a Find the differences between the results obtained by drug A and drug B.
b State the hypotheses and test at the 10% significance level whether or not there is a difference between the effects of these two drugs.

Solution

a Subtract the results of drug B from the results of drug A.

| Patient | a | b | c | d | e | f | g | h |
|---|---|---|---|---|---|---|---|---|
| $d_i = A - B$ | −3 | 3 | 5 | 4 | 6 | −6 | 2 | −1 |

b Step 1: State the null hypothesis that confirms the claim. State the alternative hypothesis for a two-tail test.

$H_0$: "There is no difference in drug effect" ($\mu_d = 0$)
$H_1$: "There is a difference in drug effect" ($\mu_d \neq 0$)

Method I

Step 2: Calculate the t-value.

$\bar{d} = 1.25$, $s_d = 4.200$

$t = \dfrac{\bar{d} - \mu_d}{s_d / \sqrt{n}} = \dfrac{1.25 - 0}{4.200 / \sqrt{8}} = 0.842$

Either

Step 3: Find the t-critical value.

$n = 8 \Rightarrow \nu = 7$, $\alpha = 0.1 \Rightarrow t_c = 1.895$

Step 4: Compare the t-critical value with the calculated value and make a decision.

$0.842 < 1.895 \Rightarrow t < t_c$. We have no sufficient evidence to reject the null hypothesis.

Or

Step 3: For the corresponding calculated t-value find the p-value.

$P\big((\bar{d} < -1.25) \text{ or } (\bar{d} > 1.25) \mid \mu_d = 0\big) = 1 - P(-1.25 < \bar{d} < 1.25 \mid \mu_d = 0) = 1 - P(-0.842 < t < 0.842) = 0.428$

Step 4: Compare the p-value with the significance level and make a decision.

$0.428 > 0.1$, so we have no sufficient evidence to reject the null hypothesis at the 10% significance level.

Method II

Use a GDC to store the data into lists and find the differences. Apply the t-test with the difference data in list c.

[GDC screen: t Test with $\mu_0 = 0$, data in list c, frequency list 1, alternate hypothesis $H_a: \mu \neq \mu_0$. Output: $t = 0.841726$, PVal $= 0.427758$, df $= 7$, $\bar{x} = 1.25$, $s_x = s_{n-1} = 4.20034$, $n = 8$.]

Compare the p-value with the significance level and make a decision: $0.428 > 0.1$, so we have no sufficient evidence to reject the null hypothesis at the 10% significance level.
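The Method I calculation for Example 16 can also be checked without a GDC. The following is a minimal sketch using only the Python standard library. The variable names are illustrative, the drug B values are taken so that the differences agree with the list in part a, and the critical value 1.895 is copied from the worked solution rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Serum iron results for patients a to h (drug B values assumed
# consistent with the differences listed in part a).
drug_a = [60, 58, 47, 80, 35, 55, 53, 40]
drug_b = [63, 55, 42, 76, 29, 61, 51, 41]

# Matched pairs: work with the differences d_i = A - B only.
differences = [a - b for a, b in zip(drug_a, drug_b)]
n = len(differences)

d_bar = mean(differences)   # sample mean of the differences
s_d = stdev(differences)    # unbiased estimate (divisor n - 1)

# t-statistic under H0: mu_d = 0
t_value = (d_bar - 0) / (s_d / sqrt(n))

# Two-tail critical value for nu = 7, alpha = 0.1, taken from the text.
T_CRITICAL = 1.895
reject_h0 = abs(t_value) > T_CRITICAL

print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.3f}, t = {t_value:.3f}")
print("reject H0" if reject_h0 else "no sufficient evidence to reject H0")
```

The sketch compares the t-value with the tabulated critical value only. Finding the p-value itself needs the t-distribution, which the standard library does not provide.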
the signicance In Example contains could the the both not say eects more of The saw negative at this the acceptance and level two of we 90% positive dr ugs. and the condence values. condence However, determine We there inter val had was the concluded a signicance probability (in seconds) are that shown in it took the eight table players to A B C D E F G H 22 35 4 30 28 46 52 36 Cube 2 26 38 40 34 30 44 48 28 a 5% signicance between the nishing Six dar ts design. players Their are scores Player Dar t–old design Dar t–new Test at a kg, to old times testing (out the and Weight before Weight after at a 5% adver tising the that the we between gives dar ts 100) two that are dierent us mean lies in have given a in 92 00 97 89 9 90 95 98 99 93 92 at a his new design tness aqua club 30-day class class The Rubik’s not dierence there ight below . is a dierence dar t. claims aerobics programme. a of or a new table 85 whether is radical the F the two cubes. E level solve there D signicance is the not C after the on or B Student Test of whether A and taking join before level signicance trainer after decided in the personal weight design 0% between A that 4.064 ] below .  at dierence testing Cube Test 1.564, [ region. puzzles Player 3 that 3G times Cube 2 we information Exercise 1 , that classes. table members A group shows their will of 12 lose students weights, given period. A B C D E F G H I J K L 55 82 63 69 65 58 88 64 72 75 90 77 52 84 6 69 62 57 84 66 70 7 94 77 level whether or not the personal trainer’s fair. Chapter 3 111 3.4 In Type this make that in of the say We Type are II we discuss we can statements: that null to testing hypothesis as errors going statistical null false ● we them We ● and section ever y think I make have in fact tr ue a what or Type kind studied. 
● We say that we make a Type I error when we reject a true null hypothesis.
● We make a Type II error when we fail to reject a false null hypothesis.

A real-life example of these two types of errors can be seen in a court trial. Sometimes an innocent person is convicted, whilst other times a guilty person walks free. In a democracy, as everyone is presumed innocent at the beginning of a trial (our null hypothesis), we say that:

● Convicting an innocent person corresponds to a Type I error;
● Freeing a guilty person corresponds to a Type II error.

Note: The probability of making a Type I error is denoted by $\alpha$. The value of $\alpha$ is the area of the rejection region. The probability of making a Type II error is denoted by $\beta$. The value of $1 - \beta$ is called the power of the test.

| Decision based on collected data | Reality: null hypothesis is true | Reality: null hypothesis is false |
|---|---|---|
| Fail to reject null hypothesis | Good decision, $1 - \alpha$ | Type II error (failing to reject false), $\beta$ |
| Reject null hypothesis | Type I error (rejecting true), $\alpha$ | Good decision, $1 - \beta$ |

In medicine, statistical methods are used to ascertain the presence of a certain virus in the body. In this area a Type I error is known as a false positive and a Type II error is known as a false negative. For example, we perform a test to find out whether a patient is infected with a virus. A false positive means that the test shows that the patient is infected although the patient actually hasn't the virus. A false negative means that the test shows that the patient is not infected whilst the patient actually has the virus.

A false positive will cause the patient a lot of worry, but eventually, due to the lack of symptoms combined with the results of different tests, we will conclude that there is nothing wrong with the patient's health. On the other hand, a false negative gives a false assurance to the patient, and a viral infection that is not treated can have serious consequences for their health. Thus, in medical testing, a Type I error is less serious than a Type II error.

Suppose we roll a die 20 times and do not obtain a six. Our null hypothesis is that the die is fair, and the alternative hypothesis is that the die is biased. Even though the probability of obtaining no sixes in 20 rolls of a fair die is fairly small ($p = 0.026$), it is certainly possible.

● A Type I error might occur when the null hypothesis is true (i.e. the die is fair), but we reject the null hypothesis and conclude that the die is biased.
● A Type II error might occur when the null hypothesis is false (i.e. the die is biased), but we accept the null hypothesis and conclude that the die is indeed fair.

Example 17

There are two coins. One coin is fair and one coin is biased so that the probability of obtaining a "head" with this coin is $p = \frac{2}{3}$. We take one of these two coins and flip it four times. The variable $X$ denotes the number of "heads" on the fair coin and the variable $Y$ denotes the number of "heads" on the biased coin.

a Find the probability distribution tables for both coins.

The null hypothesis $H_0$ is "the selected coin is fair" whilst the alternative hypothesis $H_1$ is "the selected coin is biased". We decide that we are going to reject the null hypothesis if we obtain four "heads".

b Find the probability of obtaining a Type I error;
c Find the probability of obtaining a Type II error.
Solution

a Fair coin: use the binomial PDF

$P(X = x) = \binom{4}{x}\left(\frac{1}{2}\right)^{x}\left(\frac{1}{2}\right)^{4-x} = \binom{4}{x}\frac{1}{16}$

| $X = x$ | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| $P(X = x)$ | $\frac{1}{16}$ | $\frac{4}{16}$ | $\frac{6}{16}$ | $\frac{4}{16}$ | $\frac{1}{16}$ |

Biased coin: use the binomial PDF

$P(Y = y) = \binom{4}{y}\left(\frac{2}{3}\right)^{y}\left(\frac{1}{3}\right)^{4-y} = \binom{4}{y}\frac{2^{y}}{81}$

| $Y = y$ | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| $P(Y = y)$ | $\frac{1}{81}$ | $\frac{8}{81}$ | $\frac{24}{81}$ | $\frac{32}{81}$ | $\frac{16}{81}$ |

b A Type I error is when we reject the null hypothesis when it is actually true.

$\alpha = P(X = 4 \mid H_0) = \frac{1}{16}$

c A Type II error is when we fail to reject the null hypothesis when it is false.

$\beta = P(Y \neq 4 \mid H_1) = 1 - \frac{16}{81} = \frac{65}{81}$

Notice that we could have decided to reject the null hypothesis for a different event, for instance if we obtain no "heads". In that case the probability of making a Type I error will remain the same, $P(X = 0 \mid H_0) = \frac{1}{16}$, but the probability of making a Type II error will increase, $P(Y \neq 0 \mid H_1) = 1 - \frac{1}{81} = \frac{80}{81} > \frac{65}{81}$, as expected.

Since the probability of making a Type II error is greater when we reject the null hypothesis after obtaining no "heads", we can conclude that in this example it is better to reject the null hypothesis when we obtain four "heads" than when we obtain no "heads".

Example 18

Let's assume that $X \sim \text{Po}(30)$ and we test this null hypothesis against the hypothesis that the parameter is greater than 30. The acceptance region for the null hypothesis contains all the $x$ values less than or equal to 35.

a Find the probability of making a Type I error;
b It was found that the actual value of the parameter $m$ was 40. Find the probability of a Type II error;
c If we expand the acceptance region to all the $x$ values less than or equal to 38, find the probabilities for both Type I and II errors. What can you conclude?

Solution

a $H_0: X \sim \text{Po}(30)$

$\alpha = P(X \geq 36 \mid m = 30) = 1 - P(X \leq 35 \mid m = 30) = 0.157$

Use a GDC to find the probability.

b $H_1: X \sim \text{Po}(40)$

$\beta = P(X \leq 35 \mid m = 40) = 0.242$
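Parts a and b of Example 18 can be reproduced without a GDC by summing the Poisson probabilities directly. This is a minimal sketch using only the Python standard library. The function name `poisson_cdf` is an illustrative choice and is not a GDC or library command.

```python
from math import exp

def poisson_cdf(k, m):
    """Return P(X <= k) for X ~ Po(m) by summing the terms e^-m * m^i / i!."""
    term = exp(-m)          # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= m / i       # P(X = i) from P(X = i - 1)
        total += term
    return total

# Acceptance region for H0: all x <= 35.
# Type I error: H0 (m = 30) is true but X falls outside the acceptance region.
alpha = 1 - poisson_cdf(35, 30)

# Type II error: the actual parameter is m = 40 but X falls inside the region.
beta = poisson_cdf(35, 40)

print(f"alpha = {alpha:.3f}, beta = {beta:.3f}")
```

Changing the first argument from 35 to another boundary shows how the two error probabilities move in opposite directions when the acceptance region is changed.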
c Use a GDC to find all the probabilities.

$\alpha' = P(X \geq 39 \mid m = 30) = 1 - P(X \leq 38 \mid m = 30) = 0.0648$

$\beta' = P(X \leq 38 \mid m = 40) = 0.416$

GDC Scratchpad:
1−poissCdf(30, 0, 35) = 0.157383
poissCdf(40, 0, 35) = 0.242414
1−poissCdf(30, 0, 38) = 0.064844
poissCdf(40, 0, 38) = 0.416024

We notice that Type I and Type II errors are connected, so when we decrease a Type I error we increase a Type II error.

$\alpha > \alpha' \Rightarrow \beta < \beta'$

Example 19

Let's assume that $X \sim B\left(100, \frac{1}{4}\right)$ and we test this null hypothesis against the hypothesis that the probability $p \neq \frac{1}{4}$. The acceptance region for the null hypothesis contains all the $x$ values that are within 8 of the expected value of $X$.

a Find $E(X)$ and the probability of a Type I error;
b It was found that the actual value of the probability was $p = \frac{2}{5}$. Find the probability of a Type II error;
c If we narrow the acceptance region of the null hypothesis to all the $x$ values that are within 6 of the expected value, find the probabilities for both Type I and II errors. What can you conclude?

Solution

a Apply the formula for the binomial distribution, $E(X) = np$.

$E(X) = 100 \times \frac{1}{4} = 25$

$H_0: X \sim B\left(100, \frac{1}{4}\right)$

$\alpha = P\left((X \leq 16) \text{ or } (X \geq 34) \mid p = \frac{1}{4}\right) = 1 - P\left(17 \leq X \leq 33 \mid p = \frac{1}{4}\right) = 0.0487$

Use a GDC to find the probability.

b $H_1: X \sim B\left(100, \frac{2}{5}\right)$

$\beta = P\left(17 \leq X \leq 33 \mid p = \frac{2}{5}\right) = 0.0913$
c Use a GDC to find all the probabilities.

$\alpha' = P\left((X \leq 18) \text{ or } (X \geq 32) \mid p = \frac{1}{4}\right) = 1 - P\left(19 \leq X \leq 31 \mid p = \frac{1}{4}\right) = 0.132$

$\beta' = P\left(19 \leq X \leq 31 \mid p = \frac{2}{5}\right) = 0.0398$

GDC Scratchpad:
1−binomCdf(100, 0.25, 17, 33) = 0.048705
binomCdf(100, 0.4, 17, 33) = 0.091253
1−binomCdf(100, 0.25, 19, 31) = 0.13236
binomCdf(100, 0.4, 19, 31) = 0.039846

Again we notice that Type I and Type II errors are connected, but this time when we increase a Type I error we decrease a Type II error.

$\alpha < \alpha' \Rightarrow \beta > \beta'$

The calculations we performed in Examples 18 & 19 can be seen visually in the following two normal distribution graphs.

Note: Despite the distributions in Examples 18 & 19 not being normal, we can still approximate both binomial and Poisson distributions by a normal distribution.

Normal distribution graph for a one-tail test

[Figure: normal curves for $H_0$ and $H_1$ with one critical value; the region $\alpha$ lies under the $H_0$ curve beyond the critical value and the region $\beta$ lies under the $H_1$ curve on the other side.]

The diagram demonstrates that by moving the critical value vertical line:

● to the right, we decrease Type I error but at the same time we increase Type II error;
● to the left, we increase Type I error but at the same time we decrease Type II error.

Two-tail test

[Figure: normal curves for $H_0$ and $H_1$ with two critical values; each tail of the $H_0$ curve beyond a critical value has area $\frac{\alpha}{2}$ and the region $\beta$ lies under the $H_1$ curve between the critical values.]

Again, this diagram demonstrates that by moving the critical value vertical lines simultaneously (notice that when one critical value line is moved to the left, the other one is moved to the right, so that the critical values are symmetric about the mean of $H_0$) the following occurs:

● When we decrease Type I error, we simultaneously increase Type II error;
● When we increase Type I error, we simultaneously decrease Type II error.

Exercise 3H

1 We believe that a random variable, $X$, is normally distributed such that $X \sim N(5, 0.4^2)$. We test this null hypothesis against the hypothesis that the mean $\mu \neq 5$. The acceptance region for the null hypothesis is $\{4.2 \leq X \leq 5.8\}$.
a Find the probability of Type I error.
b It was found that the actual mean value was $\mu = 4.5$. Find the probability of this Type II error.
2 Let's assume that $X \sim B\left(50, \frac{1}{2}\right)$ and we test this null hypothesis against the hypothesis that the probability $p > \frac{1}{2}$. The acceptance region for the null hypothesis is $\{X \leq 30\}$.
a Find $E(X)$ and the probability of Type I error.
b It was found that the actual value of the probability was $p = \frac{4}{7}$. Find the probability of Type II error.

3 Let's assume that $X \sim \text{Po}(45)$ and we test this null hypothesis against the hypothesis that the parameter of the distribution is less than 45. The acceptance region for the null hypothesis is $\{X \leq 52\}$.
a Find the probability of Type I error.
b It was found that the actual value of the parameter was 40. Find the probability of Type II error.

Review exercise

EXAM-STYLE QUESTIONS

1 A box of 20 salmon was bought from a fish farm. The mean weight of a salmon was found to be 652 g, and the farm owner states that he produces salmon with a standard deviation of 114 g.
a Find the 95% confidence interval for the mean weight of salmon produced at the farm.
b Find the 99% confidence interval for the mean weight of salmon produced at the farm.
c Comment on the confidence level and the width of the interval.

2 In a medical laboratory the level of potassium in blood is measured by two types of biochemical analyzers. The range of potassium in the blood of a healthy person is between 270 and 390 mg/dl. The table below lists the levels of potassium, given in mg/dl, measured in the blood of 10 patients.

| Patient | A | B | C | D | E | F | G | H | I | J |
|---|---|---|---|---|---|---|---|---|---|---|
| Analyzer I | 235 | 352 | 40 | 280 | 34 | 325 | 428 | 388 | 272 | 30 |
| Analyzer II | 237 | 343 | 46 | 272 | 336 | 329 | 43 | 396 | 265 | 35 |

a Find the differences between the results obtained by the two analyzers.
b State the hypothesis and test at a 1% significance level whether or not there is a difference in measurement of the two types of biochemical analyzers.

3 Fifteen independent observations of a random sample are taken from a normal population. The sample gave the results $\sum_{i=1}^{15} x_i = 80$ and $\sum_{i=1}^{15} x_i^2 = 488$.
The sample gave the results ∑ x = 80 and ∑ i i =1 Calculate a given 4 Find c Inter pret be unbiased estimates of the = 488. i i =1 mean and variance for the obser vations. b The the x a 99% condence the meaning measurement assumed to of six follow a inter val of the for independent normal the population condence inter val random distribution mean. at the given measurements with the mean level. may value 2 μ and for the the variance mean is σ found a the mean value b the condence of = to 25. be the level of Given [47.2, that 55.2], the condence inter val nd: sample; the inter val. Chapter 3 117 5 An automotive their speedometer driving. tested set at If a A on the par t 1 claim 31.3, auto State The exact of straight tr ue, ten 1 km speed cars company at was racing which taken track claims the and with car they the that is were autopilot to seconds 31.4, would were 30.9, unbiased claims up the hypothesis measurement assumed long it take each car to travel and estimate as follows: 31.2, 30.8, 30.4, 30.5. of the mean and variance of the times. sets the in 30.3, the how track? magazine deliberately 6 is 32.1, measured c a the sample measured Calculate An of kilometer times 30.8, b shows random manufacturing 120 km/h. the The instr ument follow of a n that due to speedometers and test the to reasons show claim independent normal safety at random distribution a the higher the 5% company speed. signicance measurements with the unbiased may level. be estimate 2 of population for 7 the mean is a the mean b the sample A radar bicycle The found value size records lane. results s = to be variance [204, 216] that the 95% condence inter val nd: sample; speed, speed 150 Speed the Given n the The for of 144. of v, kilometres these bicycles Number in are bicycles recorded is in per hour, of normally the bicycles on distributed. following table. 
| Speed | Number of bicycles |
|---|---|
| $0 \leq v < 10$ | 19 |
| $10 \leq v < 20$ | 56 |
| $20 \leq v < 30$ | 47 |
| $30 \leq v < 40$ | 25 |
| $40 \leq v < 50$ | 3 |

a For the bicycles on the bicycle lane, calculate
i an unbiased estimate of the mean speed;
ii an unbiased estimate of the standard deviation of the speed.
b For the bicycles on the bicycle lane, calculate
i a 95% confidence interval for the mean speed;
ii a 90% confidence interval for the mean speed.
c Explain why one of the intervals found in part b is a subset of the other.

8 A population follows a normal distribution with the following parameters: $N(\mu, \sigma^2 = 2)$. Let's assume that we want to test the following hypotheses:
$H_0: \mu = 10$
$H_1: \mu < 10$,
using the mean of a sample of size 5.
a Find the appropriate critical regions corresponding to a significance level of
i 0.1; ii 0.05.
b If the actual population mean is 9.3, calculate the probability of making a Type II error when the level of significance is
i 0.1; ii 0.05.
c Explain the change in the probability of a Type I error related to the change in the probability of a Type II error.

Chapter 3 summary

Estimator and estimate

A random variable $T$ is called an unbiased estimator for the population parameter $\theta$ if $E(T) = \theta$. A specific value of that random variable is called an estimate of the population parameter.

Given two unbiased estimators $T_1$ and $T_2$, we say that $T_1$ is a more efficient estimator than $T_2$ if $\text{Var}(T_1) < \text{Var}(T_2)$.

$\bar{X} = \dfrac{\sum_{i=1}^{n} X_i}{n}$ is an unbiased estimator for $\mu$.
$S_{n-1}^2 = \dfrac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n - 1}$ is an unbiased estimator for $\sigma^2$.

$s_{n-1}^2 = \dfrac{\sum_{i=1}^{n} x_i^2 - n(\bar{x})^2}{n - 1}$ or $s_{n-1}^2 = \dfrac{n}{n - 1}\, s_n^2$

Confidence interval for mean $\mu$ of population

i) When the population standard deviation $\sigma$ is known:

$\left[\bar{x} - z_c \times \dfrac{\sigma}{\sqrt{n}},\ \bar{x} + z_c \times \dfrac{\sigma}{\sqrt{n}}\right]$

ii) When the population standard deviation $\sigma$ is unknown:

$\left[\bar{x} - t_c \times \dfrac{s_{n-1}}{\sqrt{n}},\ \bar{x} + t_c \times \dfrac{s_{n-1}}{\sqrt{n}}\right]$, where $t_c = \text{invt}\left(1 - \dfrac{\alpha}{2}, \nu\right)$ and $\nu = n - 1$

Confidence interval for matched pairs

Calculate the differences between the observations and then find the confidence interval for the mean of the differences.

Hypothesis testing

There are two types of hypotheses: the null hypothesis ($H_0$) and the alternative hypothesis ($H_1$).

There are three types of alternative hypothesis testing:

i) Two-tail test ($H_1: \mu \neq \mu_0$)
ii) One-tail upper test ($H_1: \mu > \mu_0$)
iii) One-tail lower test ($H_1: \mu < \mu_0$)

There are four steps in hypothesis testing:

Step 1 State the null and the alternative hypothesis;
Step 2 Set the criteria for decision;
Step 3 Calculate the test statistics;
Step 4 Decide upon calculated statistics and decision criteria.

We use z-statistics in hypothesis testing for $\mu$ when $\sigma$ is known.

We use t-statistics in hypothesis testing for $\mu$ when $\sigma$ is unknown, regardless of the sample size.

Significance testing for matched pairs

We calculate the differences between the observations and then use t-statistics in hypothesis testing, regardless of the sample size.
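The known-$\sigma$ case of the summary above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the numbers in the usage lines are made up for the sketch and are not taken from any exercise, and `statistics.NormalDist` from the standard library plays the role of the GDC's inverse normal and normal CDF commands.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def z_confidence_interval(x_bar, sigma, n, level):
    """Confidence interval for mu when sigma is known."""
    alpha = 1 - level
    z_c = STANDARD_NORMAL.inv_cdf(1 - alpha / 2)   # critical z-value
    half_width = z_c * sigma / sqrt(n)
    return x_bar - half_width, x_bar + half_width

def z_test_two_tail(x_bar, mu_0, sigma, n, alpha):
    """Steps 2 to 4 for H0: mu = mu_0 against H1: mu != mu_0."""
    z = (x_bar - mu_0) / (sigma / sqrt(n))               # Step 3: test statistic
    p_value = 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))      # both tails
    return z, p_value, p_value < alpha                   # Step 4: decision

# Hypothetical data: n = 25 observations, known sigma = 5.
low, high = z_confidence_interval(x_bar=50.0, sigma=5.0, n=25, level=0.95)
z, p_value, reject = z_test_two_tail(x_bar=52.0, mu_0=50.0, sigma=5.0, n=25, alpha=0.05)

print(f"95% CI: [{low:.2f}, {high:.2f}]")
print(f"z = {z:.2f}, p = {p_value:.4f}, reject H0: {reject}")
```

The t-based versions follow the same pattern with $t_c$ in place of $z_c$, but they need an inverse t-distribution, which is not part of the standard library.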
Type I and Type II errors

● We say that we make a Type I error when we reject a true null hypothesis.
● We make a Type II error when we fail to reject a false null hypothesis.

| Decision based on collected data | Reality: null hypothesis is true | Reality: null hypothesis is false |
|---|---|---|
| Fail to reject null hypothesis | Good decision, $1 - \alpha$ | Type II error (failing to reject false), $\beta$ |
| Reject null hypothesis | Type I error (rejecting true), $\alpha$ | Good decision, $1 - \beta$ |

One-tail test

[Figure: normal curves for $H_0$ and $H_1$ with one critical value and the regions $\alpha$ and $\beta$.]

Two-tail test

[Figure: normal curves for $H_0$ and $H_1$ with two critical values, two tail regions of area $\frac{\alpha}{2}$ and the region $\beta$.]

4 Statistical modeling

CHAPTER OBJECTIVES:

7.7 Introduction to bivariate distributions; covariance and (population) product moment correlation coefficient $\rho$; proof that $\rho = 0$ in the case of independence and $\pm 1$ in the case of a linear relationship between $X$ and $Y$; definition of the (sample) product moment correlation coefficient $R$ in terms of $n$ paired observations on $X$ and $Y$, and its application to the estimation of $\rho$; informal interpretation of $r$, the observed value of $R$; scatter diagrams; the use of the t-statistic to test the null hypothesis $\rho = 0$; knowledge of the facts that the regression of $X$ on $Y$ ($E(X) \mid Y = y$) and of $Y$ on $X$ ($E(Y) \mid X = x$) are linear; least-squares estimates of these regression lines; the use of these regression lines to predict the value of one of the variables given the value of the other.

Before you start

You should know how to:

1 Use the facts of the algebra of expectation and variance, e.g. if $X$ and $Y$ are independent random variables, then
$E(2X - Y) = 2E(X) - E(Y)$;
$\text{Var}(2X - Y) = 4\text{Var}(X) + \text{Var}(Y)$.

2 Find the mean and variance of a data set, e.g. use the following formulae to find the mean and variance of the variables $X$ and $Y$:

$\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n}$; $\text{Var}(X) = \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n} = \dfrac{\sum_{i=1}^{n} x_i^2}{n} - \bar{x}^2$

| X | Y |
|---|---|
| 5.7 | −2.1 |
| 7.1 | −3.3 |
| 2.3 | 0.9 |
| 3.9 | 1.2 |

$\bar{x} = \dfrac{5.7 + 7.1 + 2.3 + 3.9}{4} = 4.75$

$\dfrac{\sum x^2}{n} = \dfrac{5.7^2 + 7.1^2 + 2.3^2 + 3.9^2}{4} = 25.85$

$\text{Var}(X) = 25.85 - 4.75^2 = 3.2875 \approx 1.813^2$

Similarly, $\bar{y} = -0.825$, $\text{Var}(Y) \approx 3.707 \approx 1.93^2$.

On a GDC, you can enter the data in two different lists, and then select '2-Variable Statistics' from the Stats calculation menu, and see the values for both variables on the same screen.

3 Use the t-statistic to calculate probabilities, e.g. given that $X \sim t(\nu = 6)$, use the GDC to find the following probabilities:
a $P(X \leq 1.2) = 0.862$
b $P(X \geq -0.52) = 0.689$
c $P(0.3 \leq X \leq 2.5) = 0.364$

Skills check

1 If $X$, $Y$, and $Z$ are three independent standard normal random variables, find:
a $E(2Z - 3Y + 2X)$
b $\text{Var}(2Z - 3Y + 2X)$
c $E(XYZ)$

2 a Using the formulae, find the mean and variance of $X$ and $Y$, as shown in the table.

| X | Y |
|---|---|
| 33 | −2 |
| 49 | −44 |
| 50 | −23 |
| 42 | −39 |

b Using a GDC, confirm your answers in part a.

3 Given that $X \sim t(\nu)$, use the GDC to find the following probabilities if:
a $\nu = 2$, $P(-0.4 \leq X \leq 0.8)$;
b $\nu = 10$, $P(X \leq 1.83)$;
c $\nu = 7$, $P(-1.75 \leq X)$;
d $\nu = 20$, $P(-1.14 \leq X \leq 1.14)$.
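The mean and variance calculation in item 2 of "Before you start" can be checked in Python as well as on a GDC. The following is a minimal sketch, assuming the four data pairs (5.7, −2.1), (7.1, −3.3), (2.3, 0.9) and (3.9, 1.2); the helper name `mean_and_variance` is illustrative.

```python
from math import sqrt
from statistics import pvariance

def mean_and_variance(values):
    """Mean and variance with divisor n, using Var = (sum of x^2)/n - mean^2."""
    n = len(values)
    mean = sum(values) / n
    mean_of_squares = sum(v * v for v in values) / n
    return mean, mean_of_squares - mean ** 2

x_values = [5.7, 7.1, 2.3, 3.9]
y_values = [-2.1, -3.3, 0.9, 1.2]

x_mean, x_var = mean_and_variance(x_values)
y_mean, y_var = mean_and_variance(y_values)

# Cross-check against the library routine, which uses the
# sum of squared deviations divided by n.
assert abs(x_var - pvariance(x_values)) < 1e-9
assert abs(y_var - pvariance(y_values)) < 1e-9

print(f"x: mean = {x_mean:.3f}, variance = {x_var:.4f}, sd = {sqrt(x_var):.3f}")
print(f"y: mean = {y_mean:.3f}, variance = {y_var:.4f}, sd = {sqrt(y_var):.3f}")
```

Both forms of the variance formula give the same value; the shortcut form only needs the sums of $x$ and $x^2$, which is why it is convenient for hand calculation.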
Bivariate distributions

Today it is generally accepted that smoking can lead to lung cancer. In the USA, it is estimated that about 90% of lung cancer deaths in males and about 75% in females are due to smoking. However, about 150 years ago it was in fact a relatively rare disease, comprising only 1% of all malignant tumors seen in autopsies. An increase was seen throughout the 20th century, particularly after World War I, mostly in males. A slower but steady increase in female lung cancer incidence was equally observed.

Initially it was thought that air pollution caused by increasing industrialization (including the automobile), and also exposure to toxic gases released during the war, were factors causing the increase in lung cancer. However, other countries with few such effects from either industrialization or the war were also showing an increase in lung cancer incidence.

In attempting to discover the cause of this worldwide increase in the disease, a common factor emerged: the worldwide increase in the use of tobacco. However, the tobacco companies, and also smokers in the general public, were not ready to accept such a connection without strong scientific evidence.

In this chapter we will examine ways of handling data associated with two variables, in order to determine the nature of the relationship between them.
A B C D E F G H I J 4 37 38 39 49 47 42 34 36 48 36 20 3 24 37 35 42 26 27 29 graphically students scisyhP 20 10 x 0 30 Mathematics 40 displays recorded 30 modeling and students 40 Statistical measured distribution the called perform 50 20 is a Y, i y 10 ) form an x-value also of shows diagram ten Y has and obser vations. scores data variables, X variables widely student’s score scatter of the order score following contains paired Mathematics test From plotting n consider in compare Physics Physics. set Student 124 in covariance relationship obser vations graph well variables. graph The of i obtained who test one related i For if measures related) to increases related variables degree change related. now i in at two the if a Correlation values. X related eects negatively negatively obser ved, 4.1 variable look how describes negatively are will indicate positively one positively variables increases. which in 50 in the the table Mathematics above. and Horizontal and Mathematics creating As the four of and correspond indicates follow that the gradient. there in are cross (3) at have in this a the been mean score added say trend of therefore to for the quadrants that ‘above ‘above in Mathematics in these average’ Physics. Physics points, that us st relationship and the and to scores positive Mathematics 3rd shows scores average’ is the correspond average’ ‘below general We that system, usually ‘below to performance lines Physics points coordinate Mathematics Physics, and diagram, quadrants. majority translated in ver tical (4) it two If scores a new scores in a therefore student’s we have variables average’ diagram between would this usually The exams. of allow a line to positive have a positive cor relation Let us test and now a consider Histor y the test, scores and Student K X-Mathematics Y-Histor y The ten score score following students scatter in a L ten students in corresponding M both a scatter N O P Mathematics diagram. 
| Student | K | L | M | N | O | P | Q | R | S | T |
|---|---|---|---|---|---|---|---|---|---|---|
| X: Mathematics score | 4 | 22 | 4 | 45 | 22 | 34 | 5 | 27 | 7 | 26 |
| Y: History score | 54 | 7 | 0 | 20 | 26 | 2 | 35 | 5 | 40 | 25 |

The following scatter diagram graphically displays the scores of the ten students recorded in the table above.

[Figure: scatter diagram of History score ($y$) against Mathematics score ($x$), both axes from 0 to 50.]

Again, a vertical and a horizontal line passing through the mean point (2, 25) are shown. Here the majority of points are in the 2nd and 4th quadrants. This shows us that 'above average' scores in Mathematics usually correspond to 'below average' scores in History, and 'below average' scores in Mathematics usually correspond to 'above average' scores in History.

The diagram therefore indicates that there is a negative relationship between a student's performance in the Mathematics exam compared with the History exam. If we allow a line to follow the general trend of the points, it would have a negative gradient. We therefore say that these two variables have a negative correlation.

Below is a table showing ten students' scores in a Mathematics exam and a Photography exam, followed by a scatter diagram of this data.

| Student | K | L | M | N | O | P | Q | R | S | T |
|---|---|---|---|---|---|---|---|---|---|---|
| X: Mathematics score | 20 | 22 | 30 | 45 | 22 | 45 | 5 | 27 | 45 | 26 |
| Y: Photography score | 20 | 48 | 0 | 20 | 2 | 38 | 50 | 35 | 8 | 25 |

[Figure: scatter diagram of Photography score ($y$) against Mathematics score ($x$), both axes from 0 to 50.]

Here we cannot see a positive or a negative relationship between a student's test score in Mathematics compared with their test score in Photography. It is difficult to see a line following the general trend of the points. We therefore say that these two variables have no clear correlation.

The first two graphs indicated a possible correlation between the two variables, but they did not reliably tell us the strength of the correlation. We therefore need to calculate the correlation coefficient, which assesses the degree of correlation between the two variables.
modeling time the the the scatter axes origin diagrams will will be be the the that lines mean showed drawn point. through This new scatter diagram shows the deviation d = x x − (the horizontal x distance to the Mathematics vertical score axis) and the of the Mathematics = deviation d y y − score (the from the vertical mean distance to y the horizontal For student axis) A, of the therefore, Physics d = 4 score − 4 = from 0, the and d x hence, student We this do for A is of the = 36 Physics − 3 = score. +5; y represented each mean on the students A graph by through the point (0, 5). J. y G E scisyhP A C F (x,y) J I H D B Mathematics We have already correlation quadrants. the origin, seen exists Since note if in the most we that rst of have x the two redrawn most of scatter points the are the diagram d products d x i.e. in the st quadrant both and d d x product will be positive, and diagrams either are in with will that the st the be a positive and mean negative, Likewise, points we are products, d d 3rd more half or positive, positive, and so their y in the 3rd quadrant, and both d d of , the seen in will their the product that 2nd a also negative and therefore will be 4th be y positive. correlation quadrants, exists and if most most of of the the negative. y scatter less so have either x The and as y x are 3rd diagram uniformly products d x showed no distributed d will be correlation, in all 4 positive, since quadrants. and about the In points this half are case, will about be y negative. The sum of these the cor relation the size of To the of the products of coecient. correlation deviations The only coecient from issue the with depends means this on is the technique the units of basis is of that measure variables. circumvent deviations this from problem, the mean, we and take the divide by sum the of the square products root of of the the Chapter 4 127 product of the coecient scales. 
In doing so, the coefficient becomes independent of the variables' measurement scales. Using one of the above data sets, you can confirm that this trick does make the value of $r$ independent of the units of measure!

Definition: The observed value of the sample linear correlation coefficient is defined as

$r = \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2}} = \dfrac{\sum d_x d_y}{\sqrt{\sum d_x^2 \sum d_y^2}}$

Values of $r$ near 0 indicate a weak correlation between $X$ and $Y$, and values nearer $\pm 1$ indicate a strong positive or strong negative correlation.

Correlation and causation

It is important to interpret the distinction between correlation and causation. A correlation between two random variables, $X$ and $Y$, does not imply causation, i.e. that a change in one variable causes a change in the other. If one variable appears to change as the other changes, this could be a mere coincidence, or it could be due to the dependence of both variables on other factors. We must therefore be wary of asserting causation just because there appears to be a correlation. Finding the strength of the relationship between the variables is the job of the statistician; establishing causation is the job of a specialist in the respective field of study. Causation will be discussed further later in the chapter.

Note: On 28 January 1986, the Space Shuttle Challenger broke apart 73 seconds into its flight. All seven crew members died, among them Christa McAuliffe, who had been invited to be the first teacher astronaut. Millions of people watched the launch. The cause of the disaster was found to be the failure of an O-ring seal in the right solid rocket booster (O-rings help prevent hot gases from escaping from the joints of the segments of the solid rocket boosters). The leading factor in the O-ring failure was the exceptionally low temperature (about 31 °F) at the time of launch. After this disaster, the correlation between O-ring erosion and temperature at the time of launch became known to be of fundamental importance during the launch phase of a space shuttle.

In Example 1 below, we will use the GDC to calculate the linear correlation coefficient, $r$.
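The definition of $r$ above translates directly into a short calculation. The following is a minimal sketch in Python, not a replacement for the GDC method used in the examples. It uses the Mathematics and Physics scores of students A to J, where student A's Mathematics score of 41 and student C's Physics score of 31 are assumptions made for this sketch, and the function name `correlation_coefficient` is illustrative.

```python
from math import sqrt

def correlation_coefficient(xs, ys):
    """Observed sample linear correlation coefficient r."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    d_x = [x - x_bar for x in xs]   # deviations from the mean of x
    d_y = [y - y_bar for y in ys]   # deviations from the mean of y
    sum_products = sum(dx * dy for dx, dy in zip(d_x, d_y))
    sum_sq_x = sum(dx * dx for dx in d_x)
    sum_sq_y = sum(dy * dy for dy in d_y)
    return sum_products / sqrt(sum_sq_x * sum_sq_y)

# Scores for students A to J (two entries assumed, see the lead-in).
mathematics = [41, 37, 38, 39, 49, 47, 42, 34, 36, 48]
physics = [36, 20, 31, 24, 37, 35, 42, 26, 27, 29]

r = correlation_coefficient(mathematics, physics)

# Changing the unit of measure of one variable (here: scores out of
# 1000 instead of out of 100) leaves r unchanged.
r_rescaled = correlation_coefficient([10 * x for x in mathematics], physics)

print(f"r = {r:.3f}, r after rescaling = {r_rescaled:.3f}")
```

The second call illustrates the point made in the text: dividing by the square root of the product of the sums of squares removes the dependence on the measurement scale.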
Example 1

The following table shows the prevalence of smoking, as a percent, in some states of the USA and the incidence of smoking-attributable deaths.

State        X-Prevalence of smoking    Y-Smoking-attributable death rate
Alaska       34                         42
California   2                          18
D.C.         26                         14
Florida      22                         20
Georgia      15                         41
Indiana      44                         44
Kentucky     50                         5
Missouri     38                         45
New York     2                          7
Rhode I.     28                         27
Texas        22                         28
Utah

Data taken from: http://www.cdc.gov/tobacco/data_statistics/state_data/data_highlights/2006/pdfs/datahighlights06table5.pdf

a Draw a scatter diagram to illustrate this data. Include axes with origin at the mean point.
b Discuss the correlation between X and Y.
c Use a GDC to calculate r.

a Find the mean point $(\bar{x}, \bar{y})$ and use the GDC to plot the points, including the mean point, and draw axes through the mean point.

[Figure: GDC scatter diagram of death rate against smoking prevalence, with axes drawn through the mean point]

b There is a positive correlation between X and Y, since most of the points are in the 1st and 3rd quadrants. (Determine where most of the points lie after drawing axes through the mean point.)

c Use a GDC to calculate the coefficient, r.

[GDC output, LinRegMx: RegEqn m*x+b; m = 0.7377; b = 9.7671; r² = 0.62231; r = 0.78887]

From the GDC, r = 0.789, hence there is probably a positive correlation.
Sampling distributions

So far, in the examples we have considered, the variables are observed, or experimental, values. If what we have considered is meant to make inferences about the entire population (in order to create mathematical models), then we need to define the relationship between the two random variables, X and Y, in the entire population, using the correlation coefficient. Since we are considering the correlation between two random variables X and Y, we must analyze their sampling distributions.

We have n paired observations. Each sample will give an observed value r of the sample linear correlation coefficient. In order to consider the distribution of r, we consider the many possible samples, where each sample gives a different r-value. For each of the sample values $x_i$, the distribution of the random variables $X_i$ would be the same as the distribution of the random variable X, and similarly the distribution of the random variables $Y_i$ would be the same as the distribution of the random variable Y. In the same way, all the sample means $\bar{x}$ and $\bar{y}$, calculated for all the possible samples, follow the sampling distributions $\bar{X}$ and $\bar{Y}$, and the values of r calculated for all possible samples of size n form the sampling distribution R. Each sample correlation coefficient r can be used as an estimate of the population correlation coefficient ρ (rho).

The product moment correlation coefficient R from a sample of n paired observations (x, y) on X and Y is calculated as

\[ R = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i-\bar{X})^2 \sum_{i=1}^{n}(Y_i-\bar{Y})^2}} \]

If we take the numerator of R, we can multiply out the brackets to obtain

\[ \sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y}) = \sum_{i=1}^{n}\left(X_iY_i - \bar{X}Y_i - \bar{Y}X_i + \bar{X}\bar{Y}\right) \]

Since $\bar{X}$ and $\bar{Y}$ are constants, this expression is equal to

\[ \sum_{i=1}^{n}X_iY_i - \bar{X}\sum_{i=1}^{n}Y_i - \bar{Y}\sum_{i=1}^{n}X_i + n\bar{X}\bar{Y} \]

Since $\bar{X} = \frac{1}{n}\sum_{i=1}^{n}X_i$ and $\bar{Y} = \frac{1}{n}\sum_{i=1}^{n}Y_i$, therefore,

\[ \sum_{i=1}^{n}X_iY_i - \bar{X}\sum_{i=1}^{n}Y_i - \bar{Y}\sum_{i=1}^{n}X_i + n\bar{X}\bar{Y} = \sum_{i=1}^{n}X_iY_i - 2n\bar{X}\bar{Y} + n\bar{X}\bar{Y} = \sum_{i=1}^{n}X_iY_i - n\bar{X}\bar{Y}. \]
Hence, $\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y}) = \sum_{i=1}^{n}X_iY_i - n\bar{X}\bar{Y}$, and an alternative formula for R is therefore

\[ R = \frac{\sum_{i=1}^{n}X_iY_i - n\bar{X}\bar{Y}}{\sqrt{\left(\sum_{i=1}^{n}X_i^2 - n\bar{X}^2\right)\left(\sum_{i=1}^{n}Y_i^2 - n\bar{Y}^2\right)}} \]

We can classify the strength of the correlation between the random variables X and Y by using the following general classification:

● R = ±1 indicates a perfect positive/negative correlation.
● 0.5 ≤ R < 1 indicates a strong positive correlation.
● −1 < R ≤ −0.5 indicates a strong negative correlation.
● 0.1 ≤ R < 0.5 indicates a weak positive correlation.
● −0.5 < R ≤ −0.1 indicates a weak negative correlation.
● −0.1 < R < 0.1 indicates no correlation.

As stated earlier for the values of r, values of R near 0 indicate a weak correlation between X and Y, and values of R nearer ±1 indicate a strong positive or strong negative correlation.

Of course, if R = 0, X and Y may not have a linear correlation, but they could have a different correlation, e.g. quadratic, exponential, sinusoidal, etc. These kinds of correlation, however, are not part of this course.

Example 2

Using the table below, calculate the product moment correlation coefficient, R, of the two random variables X (percentage economic growth rate) and Y (percentage of Standard and Poor's 500 returns rate). The Standard and Poor's 500 is a stock market index based on the value of the shares issued by 500 publicly traded companies. The returns rate is the percentage you would have earned had you invested money in all 500 companies. Draw a scatter diagram, use one of the formulae above to find r, and check your result using a GDC. Interpret your result.

X % of economic growth      1.9    2.6    3.9    3.2
Y % of S&P 500 returns      7.7    12.6   13.7   9.8

Make sure that the mean point is included in the scatter diagram.
[Figure: GDC scatter diagram of S&P returns against economic growth, showing the points (1.9, 7.7), (2.6, 12.6), (3.9, 13.7), (3.2, 9.8) and the mean point (2.9, 10.95)]

Method I

\[ \sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y}) = (1.9-2.9)(7.7-10.95) + (2.6-2.9)(12.6-10.95) + (3.9-2.9)(13.7-10.95) + (3.2-2.9)(9.8-10.95) = 5.16 \]

\[ \sum_{i=1}^{n}(x_i-\bar{x})^2 = (1.9-2.9)^2 + (2.6-2.9)^2 + (3.9-2.9)^2 + (3.2-2.9)^2 = 2.18 \]

\[ \sum_{i=1}^{n}(y_i-\bar{y})^2 = (7.7-10.95)^2 + (12.6-10.95)^2 + (13.7-10.95)^2 + (9.8-10.95)^2 = 22.17 \]

Using the formula for r,

\[ r = \frac{5.16}{\sqrt{2.18 \times 22.17}} \approx 0.742 \]

Method II

(You will notice that the GDC gives you all the values necessary to evaluate r manually.)

[GDC output, TwoVar: $\bar{x}$ = 2.9; Σx = 11.6; Σx² = 35.82; sx = 0.852447; σx = 0.738241; n = 4; $\bar{y}$ = 10.95; Σy = 43.8; Σy² = 501.78; sy = 2.71846; σy = 2.35425; Σxy = 132.18; r = 0.74223]

r = 0.742

There is a strong positive correlation between the two random variables. (Interpret the r value.)
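As a cross-check of Example 2, the sketch below evaluates r in two ways: from the sums of deviations (Method I) and from the raw sums Σxy, Σx² and Σy² used in the alternative formula for R. The variable names are illustrative only; the data are the four pairs of the example.

```python
import math

growth = [1.9, 2.6, 3.9, 3.2]      # X, % economic growth
returns = [7.7, 12.6, 13.7, 9.8]   # Y, % S&P 500 returns
n = len(growth)
x_bar = sum(growth) / n
y_bar = sum(returns) / n

# Method I: sums of products and squares of deviations.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(growth, returns))
s_xx = sum((x - x_bar) ** 2 for x in growth)
s_yy = sum((y - y_bar) ** 2 for y in returns)
r_deviations = s_xy / math.sqrt(s_xx * s_yy)

# Alternative formula: raw sums, as listed in the GDC output.
sum_xy = sum(x * y for x, y in zip(growth, returns))
sum_x2 = sum(x * x for x in growth)
sum_y2 = sum(y * y for y in returns)
r_sums = (sum_xy - n * x_bar * y_bar) / math.sqrt(
    (sum_x2 - n * x_bar ** 2) * (sum_y2 - n * y_bar ** 2)
)

print(round(r_deviations, 3), round(r_sums, 3))  # both round to 0.742
```

The raw-sum form is convenient when only summary statistics are available, while the deviation form mirrors the definition.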
Although it was the French physicist Auguste Bravais who first developed the concept of the correlation coefficient in the mid 1800s, it was the English mathematician Karl Pearson (1857–1936) who formalized it. He borrowed the concept of 'moments' from Physics, hence the name 'Pearson Product Moment Correlation Coefficient'. Pearson founded the world's first university statistics department at University College London in 1911.

The correlation coefficient r or R is the most common method for computing the correlation between two variables that are linearly related. Any correlation, however, must be treated with great care. For example, in the UK in the early 1930s there was a statistically significant correlation between an increase in radio licenses and an increase in mental illness. It would be absurd to assume that one is the cause of the other; here, there are probably many other factors at work. Either X and Y are unrelated, in which case we call this a spurious correlation, or both may be affected by a third factor. For example, there could be a correlation between smoking and high blood pressure; however, they could both be related to a third variable (stress level), which could actually be causing both the high blood pressure and the desire to smoke.

Hence, great caution should always be taken before asserting that a change in one variable causes a change in the other. Many other factors should be considered, and other statistical tests performed. This is why, as stated earlier, the job of the statistician is finding correlation and reporting the findings; asserting causation should be left to the specialists in their respective fields.

Exercise 4A

For the sets of data below, draw a scatter diagram, including axes through the mean point. Calculate R using a GDC, and interpret the value.

1 The opinions of a random sample of ten people were sought regarding how satisfied they are with the way a daily newspaper (X) and television (Y) keep them informed of world news and affairs. Their levels of satisfaction are shown in the table below, scored from 1 to 5, with 1 being the lowest and 5 being the highest level of satisfaction.

X    1 3 2 1 2 5 4 4 2
Y    1 1 2 4 3 4 2 1 1 1

2 The lengths in mm (X) and widths in mm (Y) of certain leaves are given in the table below.

X    00 5 40 49 88 32 52 44 2 28
Y    33 38 40 5 36 40 5 43 32 42

3 The percentages scored in the same subject on two different tests are shown in the table below. Let X be the percentage scored in Test 1, and Y the percentage scored in Test 2.
X 55 35 66 82 9 79 48 52 7 88 Y 58 65 52 35 36 42 60 55 50 38 Statistical modeling 4.2 Covariance Another statistical between two implies, the measure random used variables covariance tells us to is determine called whether the or if there is covariance. not the a relationship As the name variables var y together. A ● positive one the A ● variable other the Zero In other of this formula tend covariance words, for paired to with indicates be paired by each higher than than that higher with lower the and that sum i.e. of the is thus the no the of the of of dividing obtain − X )(Y i r, the sample of the dierent two in unit random this issue by deviations that is The data. −Y used. )] For X the and individual take the values’ average of the Cov (X, Y ) XY = XY dimensionless, this and value the when covariance This a to is same we the quantity facilitates it dependent represents the correlation creating units. that indicate reason, Y, the a not i.e. have a if distribution coecient product of between the strong relationship a on r of addresses the − a standard and comparison + of sets. covariance relationship, between numerator ∑ means variables, of not might normalizing the is This unit, variables data of of n however, independent dierent values values i =1 one is of of covariance: i or covariance, units of n [( X = relationship average average n The values values correlation products numerator equivalent we there i =1 Y ) average than than n ∑ Cov(X, average variable. the mean, is n indicates nd the This r of we from sum. be that higher variable. variance deviations to covariance variable other the tend indicates variable. negative one ● covariance indicates but it does whether not tell there us is a anything positive about or the negative strength of the relationship. If we use the rst denition given for R, n ∑ (X − X )(Y i −Y ) i i =1 i.e. 
If we use the first definition given for R, i.e.

\[ R = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\left(\sum_{i=1}^{n}X_i^2 - n\bar{X}^2\right)\left(\sum_{i=1}^{n}Y_i^2 - n\bar{Y}^2\right)}} \]

and divide the numerator and denominator by n, the sums become averages. Replacing these averages by expected values, and replacing $\bar{X}$ by E(X) and $\bar{Y}$ by E(Y), we obtain

\[ \rho = \frac{E[(X - E[X])(Y - E[Y])]}{\sqrt{E[(X - E[X])^2]\,E[(Y - E[Y])^2]}} \]

The denominator is equal to $\sqrt{\mathrm{Var}[X]\,\mathrm{Var}[Y]}$. The numerator looks like a variance, except that rather than the square of a deviation, it contains the product of the deviation of X and the deviation of Y. It is called the covariance, Cov(X, Y), and in fact ρ (the Greek letter rho) is the population product moment correlation coefficient. Hence,

\[ \rho = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} \]

Definition: For each set of n paired observations of random variables X and Y with a joint distribution, the covariance, Cov(X, Y), is a measure of the mean value of the product of the two variables' deviations from their respective means, i.e.

\[ \mathrm{Cov}(X, Y) = E[(X - E(X))(Y - E(Y))] = E[(X - \mu_x)(Y - \mu_y)], \quad \text{where } \mu_x = E(X),\ \mu_y = E(Y). \]

Starting from this definition we can derive an alternative one:

\[ \mathrm{Cov}(X, Y) = E[(X - \mu_x)(Y - \mu_y)] = E[XY - \mu_y X - \mu_x Y + \mu_x\mu_y] \]
\[ = E(XY) - \mu_y E(X) - \mu_x E(Y) + \mu_x\mu_y = E(XY) - \mu_x\mu_y - \mu_x\mu_y + \mu_x\mu_y = E(XY) - \mu_x\mu_y \]

Properties of covariance

Symmetry: Cov(X, Y) = Cov(Y, X), and Var(X) = Cov(X, X). The proof is left as an exercise.

If X and Y are independent random variables, then Cov(X, Y) = 0.

Proof: For two independent events X and Y, E(XY) = E(X)E(Y) = $\mu_x\mu_y$, and hence Cov(X, Y) = E(XY) − $\mu_x\mu_y$ = 0.

The converse, however, is not necessarily true: if Cov(X, Y) = 0, it does not follow that X and Y are independent. For example, let U be the random variable 'the number appearing on the first throw of a die', and let V be the random variable 'the number appearing on the second throw of the die'. If X = U + V and Y = U − V, then Cov(X, Y) = 0; however, the random variables X and Y are not independent. Students should verify the above as an exercise.

From the above properties, we can establish the following theorems:

Theorem 1: If X and Y are independent variables, then ρ = 0. The proof is left as an exercise.
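The two-dice example above can be checked by listing all 36 equally likely outcomes. The sketch below is one way to do this, using exact fractions; the helper name `expectation` and the choice of the events X = 12 and Y = 0 are illustrative assumptions, not part of the text.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (u, v) of two throws of a fair die.
outcomes = list(product(range(1, 7), repeat=2))
p = Fraction(1, 36)


def expectation(g):
    """Expected value of g(U, V) over the 36 outcomes."""
    return sum(p * g(u, v) for u, v in outcomes)


e_x = expectation(lambda u, v: u + v)               # E(X), X = U + V
e_y = expectation(lambda u, v: u - v)               # E(Y), Y = U - V
e_xy = expectation(lambda u, v: (u + v) * (u - v))  # E(XY)
cov_xy = e_xy - e_x * e_y                           # E(XY) - E(X)E(Y)

# Independence would require P(X = 12 and Y = 0) = P(X = 12) * P(Y = 0).
p_joint = sum(p for u, v in outcomes if u + v == 12 and u - v == 0)
p_x12 = sum(p for u, v in outcomes if u + v == 12)
p_y0 = sum(p for u, v in outcomes if u - v == 0)

print(cov_xy, p_joint, p_x12 * p_y0)  # prints: 0 1/36 1/216
```

The covariance is exactly zero, yet the joint probability 1/36 differs from the product 1/216, so X and Y cannot be independent.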
Theorem 2: If X and Y have a linear relationship, i.e. Y = mX + c with m ≠ 0, then ρ = ±1.

Proof: Y = mX + c ⇒ E(Y) = mE(X) + c. Hence, $\mu_y = m\mu_x + c$ and $\mathrm{Var}(Y) = m^2\,\mathrm{Var}(X)$. Therefore,

\[ E(XY) = E(mX^2 + cX) = mE(X^2) + c\mu_x \]

and, using the formula for ρ and making the substitutions above,

\[ \rho = \frac{E(XY) - \mu_x\mu_y}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} = \frac{mE(X^2) + c\mu_x - \mu_x(m\mu_x + c)}{\sqrt{\mathrm{Var}(X)\,m^2\,\mathrm{Var}(X)}} = \frac{m\left(E(X^2) - \mu_x^2\right)}{|m|\,\mathrm{Var}(X)} = \frac{m\,\mathrm{Var}(X)}{|m|\,\mathrm{Var}(X)} = \frac{m}{|m|} = \pm 1 \]

Practically speaking, we can only estimate ρ for the population by using the sample product moment correlation coefficient R, for a sample drawn from the population. Hence, R is an estimate for ρ.

Example 3

Given that X ~ N(0, 1) and Y = 2X, show that the covariance of the joint distribution of X and Y, Cov(X, Y), is 2.

Cov(X, Y) = E(XY) − E(X)E(Y) = E(2X²) − E(X)E(2X)    (Use the definition.)
E(2X²) = 2E(X²)    (Use properties of expectation algebra.)
Var(X) = E(X²) − [E(X)]² ⇒ E(X²) = Var(X) + [E(X)]²
Hence, E(X²) = 1 + 0 = 1    (Substitute Var(X) = 1 and E(X) = 0.)
Hence, Cov(X, Y) = E(2X²) − E(X)E(2X) = 2E(X²) − E(X) · 2E(X) = 2 − 0 = 2    (Evaluate.)

Exercise 4B

Prove the following:

1 Cov(X, Y) = Cov(Y, X)
2 Cov(X, X) = Var(X)
3 Cov(aX, Y) = a Cov(X, Y)
4 Cov(X, bY) = b Cov(X, Y)
5 Cov(X₁ + X₂, Y) = Cov(X₁, Y) + Cov(X₂, Y)
6 Cov(X, Y₁ + Y₂) = Cov(X, Y₁) + Cov(X, Y₂)
7 Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y)
8 If X and Y are independent variables, then ρ = 0.
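Theorem 2 and the result Cov(X, 2X) = 2Var(X) behind the worked example have direct sample analogues, which the following sketch illustrates numerically. It is not a proof, and the function names and the five data values are hypothetical. It uses the form Cov(X, Y) = (Σxy)/n − x̄ȳ, together with Var(X) = Cov(X, X).

```python
import math


def covariance(xs, ys):
    """Cov(X, Y) as the mean of the products minus the product of the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y


def correlation(xs, ys):
    """Correlation as Cov(X, Y) / sqrt(Var(X) Var(Y)), with Var(X) = Cov(X, X)."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


xs = [1.0, 2.5, 4.0, 7.0, 9.5]  # hypothetical values, not from the text

# If Y = 2X, the covariance is twice the variance of X.
doubled = [2 * x for x in xs]
print(covariance(xs, doubled), 2 * covariance(xs, xs))

# If Y = mX + c, the correlation is +1 for m > 0 and -1 for m < 0.
rising = [3 * x + 2 for x in xs]
falling = [-0.5 * x + 4 for x in xs]
print(correlation(xs, rising), correlation(xs, falling))
```

Up to floating-point rounding, the first line prints two equal numbers and the second prints values of 1 and −1.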
4.3 Hypothesis testing

Introduction to the bivariate normal distribution

This introduction serves to aid understanding of how the correlation coefficient is used in hypothesis testing. The bivariate normal distribution itself will not be formally assessed.

Before we can determine if there is a significant correlation between two random variables, we must first define what we mean by a bivariate normal distribution, in order to perform an appropriate test with an appropriate statistic.

You are already very familiar with the normal distribution for a single continuous random variable. We will now consider a joint distribution of two random variables X and Y.

Let X and Y be normally distributed variables, i.e. $X \sim N(\mu_x, \sigma_x^2)$ and $Y \sim N(\mu_y, \sigma_y^2)$. Then, the bivariate normal distribution is defined by the joint probability density function:

\[ f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2(1-\rho^2)}\left[ \left(\frac{x-\mu_x}{\sigma_x}\right)^2 - 2\rho\left(\frac{x-\mu_x}{\sigma_x}\right)\left(\frac{y-\mu_y}{\sigma_y}\right) + \left(\frac{y-\mu_y}{\sigma_y}\right)^2 \right] \right\} \]

For standardized variables, i.e. $z_x = \frac{x-\mu_x}{\sigma_x}$ and $z_y = \frac{y-\mu_y}{\sigma_y}$, the bivariate normal PDF becomes

\[ f(z_x, z_y) = \frac{1}{2\pi\sqrt{1-\rho^2}} \exp\left\{ -\frac{z_x^2 - 2\rho z_x z_y + z_y^2}{2(1-\rho^2)} \right\} \]

You know that the shape of the normal distribution of a single random variable is that of a 2D bell-shaped curve. The bivariate normal distribution has a 3D bell-shaped surface.

[Figure: 3D bell-shaped surface of a bivariate normal probability density function, plotted for x and y from −3 to 3]

You have already seen that if X and Y are two independent random variables, then Cov(X, Y) = 0 and hence ρ = 0. For X and Y having a bivariate normal distribution, the converse is also true. Furthermore, we can now state the following.

Theorem 3: For X and Y with a bivariate normal distribution, X and Y are independent if and only if ρ = 0.

The first part, ⇒, was already seen in Theorem 1.
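The joint density above can be evaluated directly, and doing so gives a numerical feel for Theorem 3. The following sketch is an illustration under assumed parameter values (means 0 and 1, standard deviations 1 and 2); the function names are hypothetical. It compares the joint density with the product of the two single-variable normal densities, first for ρ = 0 and then for ρ = 0.6.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def bivariate_normal_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Joint density of the bivariate normal distribution, for |rho| < 1."""
    z_x = (x - mu_x) / sigma_x
    z_y = (y - mu_y) / sigma_y
    q = (z_x * z_x - 2 * rho * z_x * z_y + z_y * z_y) / (1 - rho * rho)
    norm = 2 * math.pi * sigma_x * sigma_y * math.sqrt(1 - rho * rho)
    return math.exp(-0.5 * q) / norm


# Assumed parameters and evaluation point (illustration only).
mu_x, mu_y, sigma_x, sigma_y = 0.0, 1.0, 1.0, 2.0
x, y = 1.0, 2.0

product_of_marginals = normal_pdf(x, mu_x, sigma_x) * normal_pdf(y, mu_y, sigma_y)
joint_rho_zero = bivariate_normal_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, 0.0)
joint_rho_06 = bivariate_normal_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, 0.6)

print(joint_rho_zero, product_of_marginals)  # equal up to rounding when rho = 0
print(joint_rho_06)                          # differs when rho is not 0
```

A single evaluation point cannot prove independence, but it shows the factorization that the proof of the second part of the theorem relies on.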
For ⇐, we have the definition of the joint probability density function of X and Y, i.e.

\[ f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2(1-\rho^2)}\left[ \left(\frac{x-\mu_x}{\sigma_x}\right)^2 - 2\rho\left(\frac{x-\mu_x}{\sigma_x}\right)\left(\frac{y-\mu_y}{\sigma_y}\right) + \left(\frac{y-\mu_y}{\sigma_y}\right)^2 \right] \right\} \]

Using ρ = 0 and the laws of exponents,

\[ f(x, y) = \frac{1}{\sigma_x\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu_x}{\sigma_x}\right)^2} \times \frac{1}{\sigma_y\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{y-\mu_y}{\sigma_y}\right)^2} = f(x) \times f(y) \]

Hence, any probabilities calculated from the joint distribution of X and Y will result in

\[ P(x_1 \le X \le x_2,\ y_1 \le Y \le y_2) = P(x_1 \le X \le x_2) \times P(y_1 \le Y \le y_2), \]

and therefore X and Y are independent.

t-Statistic for dependence of X and Y

For a bivariate normal distribution of X and Y, the correlation coefficient can be used to determine, at a given level of significance, whether X and Y have a linear correlation, by determining if ρ = 0. We will use Theorem 4 to make this determination.

Theorem 4: If X and Y have a bivariate normal distribution such that ρ = 0, then the sampling distribution $R\sqrt{\dfrac{n-2}{1-R^2}}$ has the student's t-distribution with (n − 2) degrees of freedom.

This means that for any random sample of n independent paired data from the joint normal distribution (X, Y) with ρ = 0, the observed value r of the sample product moment coefficient R has the property that $t = r\sqrt{\dfrac{n-2}{1-r^2}}$, i.e. it is distributed as student's t-statistic with degrees of freedom v = n − 2.

To perform a hypothesis test with H₀: ρ = 0 and H₁: ρ ≠ 0 on a random sample of n independent pairs $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ from a bivariate normal distribution (X, Y) we:

● Calculate r, the observed value of the sample product moment correlation coefficient R;
● Calculate the t-statistic, $t = r\sqrt{\dfrac{n-2}{1-r^2}}$;
● Calculate the critical values for the indicated level of significance, or calculate the p-value.
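The steps listed above can be sketched in Python using only the standard library. The two-tailed p-value is obtained here by integrating the student's t density numerically with Simpson's rule; this is an assumption made to keep the sketch self-contained, not the method a GDC uses internally. The function names and the sample values r = 0.6, n = 12 are hypothetical.

```python
import math


def t_statistic(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)) for an observed correlation r."""
    return r * math.sqrt((n - 2) / (1 - r * r))


def t_pdf(t, df):
    """Density of the student's t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def two_tailed_p_value(t, df, steps=2000):
    """P(T <= -|t|) + P(T >= |t|), using Simpson's rule on [0, |t|]."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area_zero_to_a = total * h / 3
    return 1 - 2 * area_zero_to_a


# Hypothetical sample: r = 0.6 observed from n = 12 pairs, tested at the 5% level.
r, n, alpha = 0.6, 12, 0.05
t = t_statistic(r, n)
p = two_tailed_p_value(t, n - 2)
print(t, p, "reject H0" if p < alpha else "do not reject H0")
```

Comparing the p-value with the significance level is equivalent to comparing |t| with the critical value, so only one of the two needs to be computed.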
Example 4

Test, at the 10% level of significance, whether the data from Example 2 shows evidence of significant correlation between the random variables X (percentage economic growth rate) and Y (percentage of Standard and Poor's 500 returns rate).

Step 1: State the null and alternative hypotheses.
H₀: ρ = 0; H₁: ρ ≠ 0

Step 2: Find the value of the test statistic.
\[ t = r\sqrt{\frac{n-2}{1-r^2}} \Rightarrow t = 0.742\sqrt{\frac{4-2}{1-0.742^2}} \approx 1.565 \]

Step 3: Using a GDC, calculate the p-value.
For H₀, T ~ t(2):
p = P(T ≤ −1.565) + P(T ≥ 1.565) = 0.258

[GDC output: tCdf(−∞, −1.565, 2) + tCdf(1.565, ∞, 2) = 0.258054]

Alternatively, using the 'Linear Regression t test' in the Stats tests menu of the GDC, we obtain the following, which confirms the answers above:

[GDC output, LinRegtTest: Alternate Hyp β & ρ ≠ 0; RegEqn a+b*x; t = 1.56634; PVal = 0.25777; df = 2; a = 4.08578; b = 2.36697; s = 2.23119; SESlope = 1.51115; r² = 0.550906; r = 0.74223]

Step 4: Reading the p-value from the results, compare it with the significance level and state the conclusion.
Since 0.258 > 10%, we do not have enough evidence to reject H₀. Hence, there is no evidence of significant correlation between the random variables X and Y.

Note: Although the r value seems relatively high, with so few data values the correlation is not significant.

Exercise 4C

1 Use the data in Example 1 to determine at the 1% significance level whether the random variables show a significant correlation.

2 Below is a table of fathers' heights and sons' heights, in inches. Determine, at the 5% level of significance, if there is evidence of significant correlation between the heights.

X-Father's height, Y-Son's height:
68 69 70 72 74 69 64 67 64 62 73 7
73 72 72 72 66 64 7 70 70 70 67

3 Using the data below, determine, at the 10% significance level, if there is evidence of a significant correlation between the random variables X (height of a supermodel, in inches) and Y (weight of a supermodel, in pounds).

Weight    04 6 2 26 7 3 24 26 27
on joint In If x, In is other shares case, the bought of there was then to a the diagrams showed draw Y and the var y and X. and Y of Y a estimate words, value will and the of might stock the Y Y For according is line linear distribution stock, variable, sets whether used conditional par ticular independent so, could distribution this data scatter we = of whether and variables, X, price number is on pairs diagrams variables, focuses the stock the Hence, and Y if regression the amount inches) the 73 Y X between level, 72 of example, in signicance 72 line X, 10% Y 72 regression of the supermodel, regression. that at correlation of given the 66 line regression and 64 a given variables X pounds). through or signicant 7 was t, random inches. is 70 correlation best in there 70 between of the if 70 of correlation heights, level, 67 correlation a of sons’ between evidence (height Height 4.4 below , signicant variables a height data and signicance height the is fathers’ 5% correlation X-Father’s Y-Son’s of the be is to the dependent variable. Chapter 4 141 One method of the regression the least calculating line is to y nd y of the the sum of ver tical points, squares of the distances i.e. the the distances minimized, as area the x from of the is gure On (x, below on squares Y ouTube you can y) illustrates. 
The sum of the squares of the vertical distances from each point to the line is obtained, and the line producing the least sum is the regression line of Y on X.

Using the principle of least squares, the regression line of Y on X is given by the following formulae:

\[ y - \bar{y} = \left(\frac{\sum_{i=1}^{n}\left[(x_i-\bar{x})(y_i-\bar{y})\right]}{\sum_{i=1}^{n}(x_i-\bar{x})^2}\right)(x - \bar{x}) \quad\text{or}\quad y - \bar{y} = \left(\frac{\sum_{i=1}^{n}x_iy_i - n\bar{x}\bar{y}}{\sum_{i=1}^{n}x_i^2 - n\bar{x}^2}\right)(x - \bar{x}) \]

The derivation of these regression formulae is beyond the level of this course.

Notice that the gradient of the Y on X regression line is the quotient of Cov(X, Y) and Var(X).

Hence, if the scatter diagrams show a correlation between the two variables, then we can draw a line of best fit, or line of regression, of Y on X, E(Y | X = x). This is used to estimate Y given that the values of X are accurate.

The mathematician Francis Galton first discussed the idea of linear regression. In a famous study, he measured the mean height of parents and the mean height of their children, both in inches. The following table includes some of the values from his initial research.

mean height of parents (X)      72.5   70.5   68.5   66.5   64.5
mean height of children (Y)     72.2   69.5   68.2   67.2   65.8

[Figure: scatter diagram of mean height of children (Y) against mean height of parents (X), with the regression line y = 0.755x + 16.8625]
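The first least squares formula above can be applied directly to the five pairs of mean heights in the table. The sketch below is one possible implementation; the function name `regression_y_on_x` is an illustrative choice. It computes the gradient as the sum of products of deviations divided by the sum of squared deviations of x, and then the intercept from the fact that the line passes through the mean point.

```python
def regression_y_on_x(xs, ys):
    """Gradient and intercept of the least squares regression line of Y on X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    gradient = s_xy / s_xx
    intercept = y_bar - gradient * x_bar  # the line passes through the mean point
    return gradient, intercept


parents = [72.5, 70.5, 68.5, 66.5, 64.5]   # mean height of parents (X), inches
children = [72.2, 69.5, 68.2, 67.2, 65.8]  # mean height of children (Y), inches

gradient, intercept = regression_y_on_x(parents, children)
print(round(gradient, 3), round(intercept, 4))
```

The rounded values agree with the line y = 0.755x + 16.8625 shown on the graph.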
From the table of values and the regression line, Galton noticed that, on average, the heights of adults with tall parents were higher than the heights of adults with short parents. More interestingly, he noticed that, on average, adult offspring of tall parents are shorter than their parents, whereas adult offspring of short parents are taller than their parents.

For Galton's full table of values see http://www.math.uah.edu/stat/data/Galton.html

Using a GDC to calculate r, we obtain r = 0.98, indicating a strong positive correlation between the variables X and Y.

[GDC output, TwoVar: $\bar{x}$ = 68.5; Σx = 342.5; Σx² = 23501.3; sx = 3.16228; σx = 2.82843; n = 5; $\bar{y}$ = 68.58; Σy = 342.9; Σy² = 23539.8; sy = 2.43557; σy = 2.17844; Σxy = 23518.9; r = 0.980272]

It is important to see that the line of best fit, or line of linear regression, can be used to estimate unknown values contained within the interval of the given data (interpolation). For estimating values outside this interval (extrapolation), there must be an assumption that the pattern continues, which is not always an appropriate assumption.

As you have already seen, there are different ways to write the formulae for Cov(X, Y), Var(X), and Var(Y). Depending on the kind of information given, some formulae will be more appropriate to work with than others. Instead of being given the data directly, we may be given summary statistics and asked to find the line of regression. In this case, we use the formula that fits the given summary statistics best.
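When only summary statistics are available, the raw-sum form of the regression formula can be evaluated directly. The sketch below is a minimal illustration. The function name and the summary statistics are hypothetical; they correspond to the five pairs (1, 2), (2, 4), (3, 5), (4, 4), (5, 5) and are not taken from the text.

```python
def regression_from_summary(n, sum_x, sum_y, sum_x2, sum_xy):
    """Regression line of Y on X from n, sum x, sum y, sum x^2 and sum xy."""
    x_bar = sum_x / n
    y_bar = sum_y / n
    gradient = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
    intercept = y_bar - gradient * x_bar  # the line passes through the mean point
    return gradient, intercept


# Hypothetical summary statistics (illustration only).
gradient, intercept = regression_from_summary(
    n=5, sum_x=15, sum_y=20, sum_x2=55, sum_xy=66
)
print(gradient, intercept)  # approximately 0.6 and 2.2
```

Note that Σy² is not needed for the Y on X line; it would be needed only for the correlation coefficient or for the X on Y line.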
For example, if we are given only the following summary statistics:

n = 10; Σx = 103.8; Σy = 100.9; Σx² = 1983.08; Σy² = 1050.67 and Σxy = 906.82;

and asked to find the regression line of Y on X for these values, the formula best suited is

\[ y - \bar{y} = \left(\frac{\sum_{i=1}^{n}x_iy_i - n\bar{x}\bar{y}}{\sum_{i=1}^{n}x_i^2 - n\bar{x}^2}\right)(x - \bar{x}) \]

Using the given values, the gradient of the line of regression is:

\[ m = \frac{\sum_{i=1}^{n}x_iy_i - n\bar{x}\bar{y}}{\sum_{i=1}^{n}x_i^2 - n\bar{x}^2} = \frac{906.82 - 10 \times \frac{103.8}{10} \times \frac{100.9}{10}}{1983.08 - 10 \times \left(\frac{103.8}{10}\right)^2} = -0.1551\ldots \]

Hence, y − 10.09 = −0.155(x − 10.38) ⇒ y = −0.155x + 11.7

Most of the time, however, we should have the data to work with directly, as the following example shows.

Example 5

The table below shows the fuel consumption in litres/100 km (Y) and the average speed in km/h (X) of a typical Euro 4 passenger car when driven on a test circuit.

X (Average speed in km/h)                  10   15   20   30   35    40    45   50
Y (Fuel consumption in litres/100 km)      13   10   9    7    6.6   6.2   6    6

a Justify that it makes sense to find the line of best fit, or linear regression, for the variables in the table.
b Use a GDC to find the equation of the linear regression, and check your answer using the formula.
c Use the equation to estimate the fuel consumption of a typical Euro 4 passenger car when travelling at an average speed of 25 km/h.
d Does it make sense to use the table above to find the fuel consumption of a car travelling at 80 km/h?
e Interpret your answers for this table of values.

a [GDC output, TwoVar: Σx = 245; Σx² = 8975; sx = 14.5006; σx = 13.5641; n = 8; $\bar{y}$ = 7.975; Σy = 63.8; Σy² = 553; sy = 2.51268; σy = 2.3504; Σxy = 1719; r = −0.9209...]

Since r ≈ −0.921, this indicates a strong negative correlation, hence we can draw a line of best fit.

b From the GDC above, we obtain the following values:
n = 8; Σxy = 1719; $\bar{x}$ = 30.625; $\bar{y}$ = 7.975; Σx² = 8975.
\[ m = \frac{\sum x_iy_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2} = \frac{1719 - 8 \times 30.625 \times 7.975}{8975 - 8 \times 30.625^2} \approx -0.1595\ldots \]

Hence,

\[ y - \bar{y} = \left(\frac{\sum x_iy_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2}\right)(x - \bar{x}) \Rightarrow y - 7.975 = -0.1595(x - 30.625) \]

Hence, y = −0.160x + 12.9 (to 3 s.f.)

[GDC output, LinRegMx: RegEqn m*x+b; m = −0.159575; b = 12.862; r² = 0.848066; r = −0.920905]

From the GDC, we see that the regression line is y = −0.160x + 12.9.

c y = −0.1596(25) + 12.86 = 8.87, hence the fuel consumption at a constant speed of 25 km/h would be about 8.9 litres/100 km. (Use the equation and substitute for x to find y.)

d No, since 80 lies outside the data domain. (Check if the value lies within the data domain.)

e Since these speeds are averages, they indicate driving in congested urban areas where there is more traffic, traffic lights, and congestion in general. The car is therefore in a continual 'stop-start' mode, which causes increased fuel usage. (Attempt a reasonable explanation, assuming that the data is correct.)

Example 6

The table below continues for the following speeds of the same test cars on the same test circuits.

X-Average speed in km/h                  50   55   60   65   70   75    80
Y-Fuel consumption in litres/100 km      6    6    6    6    6    6.1   6.1

Describe and interpret the relationship between the two variables.

The line joining the points is just about parallel to the x-axis, hence its gradient is approximately 0. Driving at the speeds allowed on priority roads, i.e. between 50 and 80 km/h, the fuel consumption is almost constant, at about 6 litres/100 km.
Example 7

The table below continues for the following speeds of the same test cars on the same test circuits.

X-Average speed in km/h                  80    85    90    95    105   110   115   120
Y-Fuel consumption in litres/100 km      6.1   6.3   6.4   6.4   6.6   6.7   6.8   6.9

a Justify that it makes sense to find the line of best fit, or linear regression, for the variables in the table.
b Use a GDC to find the equation of the linear regression line.
c Use the equation to estimate the fuel consumption of a car travelling at an average speed of 100 km/h.
d Interpret your answers for this table of values.
e Determine, at the 10% significance level, if there is a correlation between the speed and the fuel consumption of the test cars for average speeds between 10 and 50 km/h, and between 80 and 120 km/h.

a Use a GDC to find r.

[GDC output, TwoVar: Σx² = 81500; sx = 14.6385; σx = 13.6931; n = 8; $\bar{y}$ = 6.525; Σy = 52.2; Σy² = 341.12; sy = 0.271241; σy = 0.253722; Σxy = 5247.5; r = 0.989426]

Since r ≈ 0.989, this indicates a strong positive correlation, hence we can find the regression line.

b Use the GDC to find the line of best fit.

[GDC output, LinRegMx: RegEqn m*x+b; m = 0.018333; b = 4.69167; r² = 0.978964; r = 0.989426]

The equation of the line of best fit is: y = 0.0183x + 4.69 (to 3 s.f.)

c y = 0.01833(100) + 4.692 = 6.53, hence the fuel consumption at 100 km/h is about 6.5 litres/100 km. (Substitute for x and evaluate y.)

d The speeds in the table indicate highway speeds with little traffic interruption, so it makes sense that as the speeds increase, so will the fuel consumption. (Attempt a reasonable explanation, assuming that the data is correct.)

e For the data 10 ≤ X ≤ 50:

Step 1: State the null and alternative hypotheses.
H₀: ρ = 0; H₁: ρ ≠ 0

Step 2: Find t.
\[ t = r\sqrt{\frac{n-2}{1-r^2}} \Rightarrow t = -0.921\sqrt{\frac{8-2}{1-0.921^2}} = -5.79106 \]
Step 3: Find the p-value and confirm using the stats test menu of the GDC.
p = P(T ≤ −5.791) + P(T ≥ 5.791) = 0.00116

[GDC output: tCdf(−9.E999, −5.791, 6) + tCdf(5.791, 9.E999, 6) = 0.001161]

Step 4: State the conclusion.
0.00116 < 0.1, hence we reject H₀, since there is significant evidence of correlation between the random variables X and Y.

For the data 80 ≤ X ≤ 120:

Step 1: State the null and alternative hypotheses.
H₀: ρ = 0; H₁: ρ ≠ 0

Step 2: Find t.
\[ t = r\sqrt{\frac{n-2}{1-r^2}} \Rightarrow t = 0.989426\sqrt{\frac{8-2}{1-0.989426^2}} = 16.71 \]

Step 3: Find the p-value, and confirm using the stats test menu of the GDC.
p = P(T ≤ −16.71) + P(T ≥ 16.71) = 0.000003

[GDC output: tCdf(−9.E999, −16.71, 6) + tCdf(16.71, 9.E999, 6) = 0.000003]

Step 4: State the conclusion.
0.000003 < 0.1, hence we reject H₀, since there is significant evidence of correlation between the random variables X and Y.

To calculate the regression line of X on Y we find the least sum of the squares of the horizontal distances from the points to the line, i.e. the area of the squares is minimized. In other words, the linear regression of X on Y focuses on the conditional probability distribution of X given Y, rather than on the joint probability distribution of X and Y.

[Figure: points (x, y) with squares drawn on their horizontal distances to the regression line of X on Y]

The formula for the X on Y regression line is

\[ x - \bar{x} = \left(\frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(y_i-\bar{y})^2}\right)(y - \bar{y}) = \left(\frac{\sum_{i=1}^{n}x_iy_i - n\bar{x}\bar{y}}{\sum_{i=1}^{n}y_i^2 - n\bar{y}^2}\right)(y - \bar{y}) \]

Notice that the gradient of the X on Y regression line is the quotient of Cov(X, Y) and Var(Y).

It is important to remember that both the Y on X regression line and the X on Y regression line pass through the mean point $(\bar{x}, \bar{y})$, as discussed at the beginning of this chapter in drawing scatter diagrams.
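The relationship between the two regression lines can be illustrated with the speed and fuel consumption data for 10 to 50 km/h used earlier in this section. The sketch below uses illustrative variable names and is not a GDC procedure. It computes both gradients, checks that each line passes through the mean point, and shows a standard consequence of the two formulae: the product of the two gradients equals r².

```python
speeds = [10, 15, 20, 30, 35, 40, 45, 50]  # X, average speed in km/h
fuel = [13, 10, 9, 7, 6.6, 6.2, 6, 6]      # Y, litres/100 km

n = len(speeds)
x_bar = sum(speeds) / n
y_bar = sum(fuel) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(speeds, fuel))
s_xx = sum((x - x_bar) ** 2 for x in speeds)
s_yy = sum((y - y_bar) ** 2 for y in fuel)

gradient_y_on_x = s_xy / s_xx  # Cov(X, Y) / Var(X)
gradient_x_on_y = s_xy / s_yy  # Cov(X, Y) / Var(Y)
r_squared = s_xy ** 2 / (s_xx * s_yy)


def y_on_x(x):
    """Estimate y from x using the Y on X regression line."""
    return y_bar + gradient_y_on_x * (x - x_bar)


def x_on_y(y):
    """Estimate x from y using the X on Y regression line."""
    return x_bar + gradient_x_on_y * (y - y_bar)


print(y_on_x(x_bar), x_on_y(y_bar))  # both lines pass through the mean point
print(gradient_y_on_x * gradient_x_on_y, r_squared)
```

The two lines coincide only when r = ±1, so the choice of line depends on which variable is being estimated.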
X-Test 65 88 83 92 50 67 00 00 73 90 83 94 Y-Test2 52 57 78 76 30 67 96 74 65 87 78 89 The teacher was absent a Justify test, b would for that and the a nd Determine two results, if like rst linear this to use test, these but results scored regression can for 52% be repor t on used the to card second estimate grades. One student, however, test. this student’s result on the rst value. there is signicant correlation at the 5% signicance level between the variables. a Since = r = 0.837, there positive cor relation, nd line is a hence strong we can =TwoVar(‘xvalues, ‘yvalues, 1): 2 4 50 30 Σx 5 67 67 sx 6 96 σx 7 74 n 83465. := sn–... := the of best t. 15.4123 σn... 14.7561 12. 8 73 65 y 9 90 87 Σy 10 83 78 Σy 11 94 89 sy 12 83 78 σy 70.75 849. 2 63693. := sn–... := 13 Σxy 14 r 15 MinX 18.1565 σn... 17.3835 72266. 0.837269 50. C12 Find E B the X on Y line by letting Y xvalues = be =LinRegMx(‘yvalue 1 65 52 2 88 57 3 92 76 m 4 50 30 b 5 67 67 r 6 100 96 r 7 100 74 Res... 8 73 65 9 90 87 10 83 78 11 94 89 12 83 78 T itle... Linear Regression... the the independent variable, and X dependent. m*x+b 0.71072 31.7999 2 E1 x = =“Linear Regression 0.707(52) test’s result 0.701019 0.837269 {–3.75732506032... (mx+b)” + 3.8 would = have 68.8, been hence about the rst Substitute f or y to nd x. 69%. Chapter 4 149 b H ρ : = 0; H 0 : ρ ≠ t = 0 Step 1: State alter native n − 2 t = the null and  hypotheses. 12 − 2 ⇒ r 0 .837 ≈ 2 4.84 2 1 − r Step 2: Find t. Step 3: Find the 1 − (0.837 ) p-value using a *Unsaved 1.1 GDC. tCdf(–9.E999, –4.84, 10)+tCdf(4.84, 9.E999, 10) 0.000681 | 1/99 p = P (T The ≤ −4.84 ) + P (T p-value 0.00068 ≥ is 4.84 ) less = 0.000681 than 5%, so we reject H , Step 4: State the conclusion 0 hence there random is a signicant variables X and correlation between the Y Alter natively B xs E D use = the stats 88 57 3 83 78 4 92 76 Alternate... 
Alternatively, use the 'LinReg t-test' under the stats tests menu of your GDC, and you can see r, t, p-value, and the equation of the regression line all in one screen.

=LinRegtTest('ys, 'xs, 1, 0):
Title: Linear Reg t Test
Alternate Hyp: β & ρ ≠ 0
RegEqn: a+b*x
t = 4.8422
PVal = 0.000679
df = 10
a = 31.7999
b = 0.71072
s = 8.83862
SESlope = 0.146776
r² = 0.701019
r = 0.837269

Since p-value = 0.000679 < 0.05, we reject H₀.

Exercise 4D

1 The scores shown in the table below are the data gathered on 10 students on a physical fitness test, with higher scores indicating better physical fitness. The table also includes the height in cm, the weight in kg, and the age in years of the students.

Student   S-Score   A-Age   H-Height   W-Weight
1         62        10      145        44.9
2         94        11      150        42.1
3         81        8       139        30.2
4         62        9       148        41.4
5         58        8       130        41.8
6         86        10      150        38.4
7         59        11      154        54.1
8         72        11      140        38.6
9         54        11      153        52.4
10        60        11      120        38.6

A regression line is to be used to predict S by fitting S with one of the variables A, H, or W.

a By finding r between S and A, S and H, and S and W, determine which variable should be used with S to find the equation of the regression line.
b Find the equation of the regression line.
c Determine at the 5% level of significance if the correlation between S and the variable you chose is significant.

2 The body mass in grams and the heart mass in mg of ten one-year-old mice are given in the table below.

Body mass (g):    30    37    38    32    36    32    33    38    44    38
Heart mass (mg):  136   156   150   140   155   157   143   160   170   144

Justify the use of linear regression to estimate the heart mass of a one-year-old mouse whose body mass is 35 g, and find this value.

3 The table below shows blood pressure measurements, in millimetres of mercury (mm Hg), from a random sample of 10 adults. Blood pressure consists of two measurements: systolic (pressure when the heart muscle contracts) and diastolic (pressure when the heart is at rest between beats). Assume that the values form a bivariate distribution with correlation coefficient ρ.

X - Systolic:   120   125   130   135   140   145   150   155   160   165
Y - Diastolic:  80    90    92    98    100   103   105   108   110   110

a State suitable hypotheses.
b Determine the product moment correlation coefficient for this data and state its p-value.
c Interpret the p-value in the context of the problem.
d Justify finding the line of best fit of Y on X, and use it to estimate the value of Y when an adult has a systolic value of 138 mm Hg.

4 An experiment is performed five times, measuring the temperatures needed, Y, to dissolve a certain substance that weighs X grams in a chemical compound. The summary statistics are:

Σx = 182; Σy = 200; Σy² = 9850; Σxy = 8390

Find the regression line of X on Y, and use it to find the weight of the substance, x grams, that will dissolve in a litre of the chemical at 90 degrees.

5 The summary statistics for a given data set are as follows:

n = 10; Σx = 30; Σx² = 220; Σy = 86; Σy² = 1588; Σxy = 580

Find the equations of the regression lines
a Y on X
b X on Y.

Review exercise

EXAM-STYLE QUESTIONS

1 It has been determined that the temperature in car tires, Y (in °C), varies with the speed of the car, X (in km/h), according to a linear regression model. The summary statistics are given as:

n = 8; Σx = 440; Σx² = 28,400; Σy = 606; Σy² = 49,278; Σxy = 37,000

a Find the correlation coefficient, and determine if there is evidence of a significant correlation at the 5% level.
b The car speeds in the data vary from 20 km/h to 90 km/h. Using an appropriate regression line, estimate the temperature of a car tire driving at an average speed of 55 km/h.
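When only summary statistics are given, both regression lines and r follow directly from the sums. The sketch below is a hedged illustration: the helper `lines_from_sums` and its dictionary keys are names invented here, and the inputs are the sums for the fuel-consumption worked example (Σx = 800 is computed from that table), so no exercise answer is given away.

```python
import math

def lines_from_sums(n, sum_x, sum_x2, sum_y, sum_y2, sum_xy):
    """Regression lines y = a + b*x (Y on X) and x = c + d*y (X on Y),
    plus r, computed from summary statistics only."""
    x_bar, y_bar = sum_x / n, sum_y / n
    s_xy = sum_xy - n * x_bar * y_bar
    s_xx = sum_x2 - n * x_bar ** 2
    s_yy = sum_y2 - n * y_bar ** 2
    b = s_xy / s_xx                  # gradient of Y on X: Cov(X, Y)/Var(X)
    a = y_bar - b * x_bar
    d = s_xy / s_yy                  # gradient of X on Y: Cov(X, Y)/Var(Y)
    c = x_bar - d * y_bar
    r = s_xy / math.sqrt(s_xx * s_yy)
    return {"y_on_x": (a, b), "x_on_y": (c, d), "r": r}

# Sums for the speed / fuel-consumption example (n = 8)
result = lines_from_sums(8, 800, 81500, 52.2, 341.12, 5247.5)
a, b = result["y_on_x"]
c, d = result["x_on_y"]
print(round(b, 6), round(a, 5), round(result["r"], 6))
```

Note that the product of the two gradients, b·d, equals r², and that both lines pass through the mean point, which gives two quick consistency checks on any answer.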
Using temperature of a an 2 If Var(X ) = determine 3 Find a 4 the and Var(Y ) largest smallest positive The 15 the r value correlation independent = 7 for possible at that the random two random covariance will 10% give of X signicant signicance variables variables and X, Y, Z and evidence level and X Y, Y. when n each of = have 20. a mean 2 of 0 and a variance determine if the the value 5 The data st in c d ten time time and cer tain at two b + Y, V = following the V dr ug + Z, and variables variables and X in each W and = X − Z, determine case are W absorbed dierent by times, the and adult the body was following 20. 55.0 39.6 24. 3.2 9.5 22.3 35. 4.2 . 8.7 52.3 4.2 26.5 29.3 .2 25. 33.2 the its your of An 11th product p-value the patient was rate of absor ption rate the next for rates to time, this the Show mutually Show a correlation coecient for this data, two if = and if of it 0, 1st be time. 25 the problem is two have found times a the 2nd Predict children and was what been the 0.532. times nd the for this dr ug Y a the same 1% in was on had a dr ug patient. correlation Use lines, but the correlation the regression time, underwent that positive dierent the of tested would was there two to time dierent the r 2nd context line. the group determine that the unable 19.8 dr ug, between in regression absor ption The moment p-value. equation level b a the that X 0.3 Inter pret between a of V = 44.5 state tests 6 and U of implies patients Determine and b U If recorded. 2nd a a percentage tested . correlation found independent: σ of absor ption coecient signicance the absor ption administered. X and X on Y are per pendicular. that if r = ±1, the regression lines of Y on X and X on Y are identical. 
c Given that the equation of the regression line of Y on X is y = a + bx, where $b = \dfrac{\mathrm{Cov}(X, Y)}{\mathrm{Var}(X)}$, and the regression line of X on Y is x = c + dy, where $d = \dfrac{\mathrm{Cov}(X, Y)}{\mathrm{Var}(Y)}$, show that $r = +\sqrt{bd}$ if b and d are both positive, and $r = -\sqrt{bd}$ if b and d are both negative.

d If the least squares regression line of Y on X is given by y = 12 + 0.19x, and the least squares regression line of X on Y is given by x = −4.4 + 0.77y, find r, the product moment correlation coefficient.

Chapter 4 summary

Definition: The observed value r of the sample linear correlation coefficient is defined as

$$r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2\,\sum_{i=1}^{n}(y_i-\bar{y})^2}} = \frac{\sum d_x d_y}{\sqrt{\sum d_x^2\,\sum d_y^2}}$$

The sample product moment correlation coefficient R, for n paired observations (x, y) on X and Y, is

$$R = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i-\bar{X})^2\,\sum_{i=1}^{n}(Y_i-\bar{Y})^2}}$$

An alternative formula for R is

$$R = \frac{\sum_{i=1}^{n}X_i Y_i - n\bar{X}\bar{Y}}{\sqrt{\left(\sum_{i=1}^{n}X_i^2 - n\bar{X}^2\right)\left(\sum_{i=1}^{n}Y_i^2 - n\bar{Y}^2\right)}}$$

We can classify the strength of the correlation between the random variables X and Y by using the following general classification:

● R = ±1 indicates a perfect positive/negative correlation.
● 0.5 ≤ R < 1 indicates a strong positive correlation.
● 0.1 ≤ R < 0.5 indicates a weak positive correlation.
● −1 < R ≤ −0.5 indicates a strong negative correlation.
● −0.5 < R ≤ −0.1 indicates a weak negative correlation.
● −0.1 < R < 0.1 indicates a very weak correlation, or no correlation.

The population product moment correlation coefficient is

$$\rho = \frac{E[(X - E[X])(Y - E[Y])]}{\sqrt{E[(X - E[X])^2]\;E[(Y - E[Y])^2]}} \quad\text{or}\quad \rho = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}$$

Definition: For a joint distribution, the covariance of the random variables X and Y, Cov(X, Y), is a measure of the mean value of the product of the two variables' deviations from their respective means for each set of n paired observations, i.e.

Cov(X, Y) = E[(X − E(X))(Y − E(Y))] or, Cov(X, Y) = E[(X − μₓ)(Y − μᵧ)], where μₓ = E(X), μᵧ = E(Y)

Cov(X, Y) = E(XY) − μₓμᵧ
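The definitions above translate directly into a few lines of code. This is a minimal sketch under stated assumptions: the data are the paired test percentages from the teacher example earlier in the chapter, covariance is computed with divisor n (the E(XY) − μₓμᵧ form), and the function names and the wording of the returned labels are choices made here, not the book's.

```python
import math

# Paired test percentages from the teacher example
x = [65, 88, 83, 92, 50, 67, 100, 100, 73, 90, 83, 94]
y = [52, 57, 78, 76, 30, 67, 96, 74, 65, 87, 78, 89]

def covariance(xs, ys):
    """Cov(X, Y) = E(XY) - mu_x * mu_y, using divisor n."""
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    return sum(a * b for a, b in zip(xs, ys)) / n - mu_x * mu_y

def correlation(xs, ys):
    """r = Cov(X, Y) / sqrt(Var(X) Var(Y)), with Var(X) = Cov(X, X)."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

def classify(r):
    """Apply the general classification of correlation strength."""
    size = abs(r)
    if size < 0.1:
        return "very weak or none"
    sign = "positive" if r > 0 else "negative"
    if size == 1:
        return "perfect " + sign
    if size >= 0.5:
        return "strong " + sign
    return "weak " + sign

r_tests = correlation(x, y)
print(round(r_tests, 6), classify(r_tests))
```

Because the divisor n cancels in the ratio, the same r is obtained whether the sample or population form of the variance is used.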
Properties of covariance

1 Symmetry: Cov(X, Y) = Cov(Y, X), and Var(X) = Cov(X, X).
2 For two independent random variables X and Y, Cov(X, Y) = 0.

Theorem 1: If X and Y are independent variables, then ρ = 0.

Theorem 2: If X and Y have a linear relationship, i.e. Y = mX + c, then ρ = ±1.

Theorem 3: For a bivariate normal distribution, X and Y are independent if and only if ρ = 0.

If X and Y have a bivariate normal distribution such that ρ = 0, then the sampling distribution $R\sqrt{\dfrac{n-2}{1-R^2}}$ has the Student's t-distribution with (n – 2) degrees of freedom.

Using the principle of least squares, the Y on X regression line is given by the following formulae:

$$y - \bar{y} = \left(\frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2}\right)(x-\bar{x}) \quad\text{or}\quad y - \bar{y} = \left(\frac{\sum_{i=1}^{n}x_i y_i - n\,\bar{x}\,\bar{y}}{\sum_{i=1}^{n}x_i^2 - n\,\bar{x}^2}\right)(x-\bar{x})$$

The gradient of the Y on X regression line is $\dfrac{\mathrm{Cov}(X, Y)}{\mathrm{Var}(X)}$.

The formula for the X on Y regression line is:

$$x - \bar{x} = \left(\frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(y_i-\bar{y})^2}\right)(y-\bar{y}) = \left(\frac{\sum_{i=1}^{n}x_i y_i - n\,\bar{x}\,\bar{y}}{\sum_{i=1}^{n}y_i^2 - n\,\bar{y}^2}\right)(y-\bar{y})$$

The gradient of the X on Y regression line is $\dfrac{\mathrm{Cov}(X, Y)}{\mathrm{Var}(Y)}$.

+∞ Answers Chapter 6 a The function b The modal 1 Mode (X ) Exercise = −1, 1 b Mode (X ) = m = 1 Median, µ 0.95 = m = σ a = σ .72 Mode (X ) Median, (x)dx = 1 = m 0 = b = doesn’t exist.
B a P(X = 2) = 0.24 b P(X = 3) = c P(X = 4) = 0.0625 d P(X = 5) = 0.104 0.000182 a P(X ≤ 4) = 0.684 b P(X > 6) = 0.000729 c P(5 d P(1< 2  Mode (X ) 0.695 value 2 2 2 f 1 1 Median, = if ∞ check µ well-dened  Ski lls a is Median, = m ≤ X ≤ 7) = 0.158 X ≤ 7) = 0.009 0 = 3 P(X 4 a P(X ≤ 3) = = 5) 0.980 = 0.0000377 b P(X ≥ 4) = 0.000512 5 a P(X = 4) = 0.0921 b P(X ≥ 7) = 0.377 0 3 µ = σ = µ = 0 σ = 0.342 4 0.487 Investigation c Mode (X ) = 3 a Median, m µ = 4.6 σ = 0.839 = 0.36 b 0.0081 c 0.409 4 Exercise 1 For C 1B, Question 1: 2 3 a b 2 + 2 2 a E(X ) = 1.67, Var(X ) = 1.11 b E(X ) = 7.14, Var(X ) = 43.9 c E(X ) = 2, d E(X ) = 1.14, 3 1 4 a f = ′(x) , x ≠ 2; 2 (2 f (x)dx x ) = −ln(2 − x) + c, x 1 f ′(x) = 3e ; f (x)dx = 2 For 3 x +1 3x + 1 b < e + B, Var(X ) = 2 Var(X ) Question = 0.155 2: c 3 ⎛ π c f = ′(x) − cos 2x ⎜ ⎟ 3 ⎝ f (x)dx ; = cos 4 E(X ) = 4, b E(X ) = 1.43, Var(X ) c E(X ) = 3.33, d E(X ) = 1.01, a E(X ) = 1.37 b Mario c 6 = 12 Var(X ) = 0.612 ⎠ ⎛ π 9 a ⎞ 2x = 7.78 ⎞ ⎜ ⎟ 3 ⎝ Var(X ) + c Var(X ) = 0.00916 ⎠ 2 5 x 2 d f ′(x) = 4x(x − 2); f ⎪ k = 3 = F (x ) b x + 7x = four shots to destroy the balloon. x = 0, 1 , D 1 a P(X = 2) = 0.16 b P(X 1 , x = 4) = c P(X = 9) = 0.235 d P(X = 32) 2 a P(X ≤ 4) = 0.508 b P(X > 6) c P(5 ≤ 3 a p 0.4 4 a X 0.188 2 = 0.089 > 2 = 0.109 10 0, ⎧ ⎪ ⎪ b < 0 24 ⎩ a make students Exercise x + 6 , ⎨ ⎪ a must 2 ⎪ 2 c A ⎪ a 4x 3 0, ⎧ 1 3 x 5 Exercise 4 (x)dx F (x ) = < x X ≤ 7) = 0.525 d P(8 < X ≤ 11) = 0.326 1 = b P(3 ≤ X ≤ 5) = 0.503 2 9x x , ⎨ = 1 , x 2, 3, 4 ~ ⎜ 2, NB b ⎝ ⎪ 1 , ⎩ > x 47 3 1 ⎛ 20 c ⎪ 128 32 4 4 5 a 0.313 6 a 0.264 b b 0.999999 0.0473 7 c P(X ≤ 2) = 10 ⎧ x 3 a P( X = x) = , x = 0, 1 , need 2 7 0, ⎩ Mode = P( X = x) = a 0.0829 Exercise ⎨ so , x = 1 , 3, 5, 7, ⎩ 0, 1 9 b 25 ⎪ b it is almost more cer tain than a that dozen he will students. 
0.589 E 1 x ⎪ a 1, inter view otherwise 2 ⎧ 4 to 6 ⎨ ⎪ b ≈ + 1 ⎪ m = G (t ) 1 = 3 t + 16 7 1 2 t + 4 1 3 t + 8 4 t + 4 16 otherwise x 0  2 3 4 i 5 a b = 1 0, ⎧ x 1 4 6 4 1 16 16 16 16 16 p < 0 i ⎪ π b F (x ) = ⎪ 2 2 cos x , 0 ≤ x ⎨ ≤ 3 2 ⎪ x π ⎪ 1 , x  2 3 k … k 3 ⎩ ⎛ 1 c P ( X ≥ ) 6 156 Answers 5 25 p π = 0.732 … i > … i 6 36 216 − 1 k −1 5 ⎞ ⎜ ⎟ ⎝ 6 ⎠ 1 5 = × k 6  not one one 6 … not 2 3 a P(X = 0) 8 = b P(X ≤ 1) 9 1 c P(X ≥ 3) 5 a E(2X 6 Var(5X 7 E(3X − 3) = 45 b Var(2X − 11) = 192 = 3 + 3) = 90 1 = d + 2) = 6 2 + 2, Var(3X + 2) = 3 k 27 3 Exercise 4 a E(X ) = b E(X ) = p, Var(X ) = p(1− r Var(X ) = Question E(X ) = Question Question Var(X ) = 1 E(X ) = 2 6, Var(X ) = 30 3 a E(X b E(2Y = , Var(X ) −Z ) c E(2Z −7X ) d E(X −Y + Z ) = e E(X +Y − Z ) = f E(3Z − Var(3Z 2 P(X = 2, = Var(X 22, + Y ) = 3, Var(2Y Var(2Z 20, − = 1.9 − Var(X 14, Z ) = 8.4 7X ) = 35.7 −Y Var(X + +Y Z ) − = Z ) 4.7 = 2X + 4Y ) = 4.7 10 4 4t 4 a Y ) = 2 6 + 3 1 E(X ) 1 2 p 1 2, B pq] rq , p 5 p)[= = 1) = b G (t) + 2 7 7 c The expected d The maximum number of shots is − 2X + 4Y ) = 49.6 2t = 2 a 3 Var(2X E(3X 4 Var(X + 5Y ) = 31 b Var(11Y − 7X ) = 703 t − 3Y ) = 72 72p 2. 20 number of shots is 5. 
− Y ) = 20 p Exercise F Exercise C 1 = 2 2 ⎛ 2 + 7t 1 a G (t) X + = ⎞ + 3t ⎜ Y ⎟ ⎝ 12 2 b P(X + Y ≤ 1) a X x 0  1 1 2 2 ⎠ 13 = c E(X + Y ) = P{X 9 = x} 6 3 2 ⎛ 2 a G (t) X = ⎞ t ⎜ 1 ⎟ 2 1 +Y ⎝ 6 − 7t + 2t E(X ) ⎠ = , Var(X ) = 2 b E(X + Y ) = 15 c Var(X + Y ) = 24 b 0.25(t−1) 3 a G (t) b 0.989 = e 0.15(t−1) , G X (t) = e 4 Since we have 6 independent G Y (t) = e coin, we add six instances ⎜ qt Review 2 ⎠ a a exercise E(Y ) = P(1 ≤ X ≤ X = 3, Var(Y x P{X 4) b P(X ≥ 2) ) = d 0 = ∈ [ 0.67, 6.67]  2 1 3 3 2 1 25 E(X ) = , Var(X ) = 9 3 c E(X ) 5 = d Var(X ) = 4 a G (t) = b 16 0.6(t – 1 e 0.12(t − 1), ), G X (t) = e 0.28(t – 1) G Y (t) = Since die, e we we have add 4 independent four instances b 1 a i c ii 0.264 0.105 c iii 0.0625 iv Eight a = 0 2 f b (x ) = ⎨ = ≤ z c 0 ⎩ E(X ) x ≤ 2 ⎪ b E(Y ) = a E(X + , b E(3X c Var(X X ) Var(Y ) = X ) = 9, Var(X 9, Var(3X ) = + X + X ) = 12 36 2 + X + X + 3X ) = 48 a a E(X + ) X = 144 + X + X + X ) = 10  Var(X + X + b E(Y Y + Y c E(X X + X + X ) = 5 check P(X = 2) = 0.311 = − 1 b P(1 ≤ X ≤ 3) = 0.786 , a = − , 2 2 + + X Var(X 1 1 a X otherwise 4 Chapter a same 9 + = Var(6X Ski lls the ⎧ x ⎪ a of variable 0.922 3 b rolls the 8 3 0.0117 of Z 4 2 y x} = 5 1 X 1 = 625 5 same 2 624 4 the ⎟ 1 ⎝ 3 of variable 3 c 2 the Z r r 1 of pt ⎛ 4 ips 0.05(t−1) , a = + + X ) X + = + X 15 X + + X Var(Y X + + X Y + + Y + Y + Y + Y + Y + Y ) Y = ) ) = 9 25 = 14 2 3 3 5 a X = x  2 3 4 i Exercise 1 a A E(3X ) = 15.9, Var(3X ) = P{X 10.8 = 1 1 1 1 4 4 4 4 x } i b E(X c E(4X + + 3) 1) = d E(2X − 5) e E(kX + 2 a E(3X + 3 E(2Y 4 a 8.3, Var(X + 3) = 1.2 5 = 22.2, Var(4X + 1) = 19.6 E(X ) = 2 = 5.6, Var(2X − 5) = 4.8 2 p) = 2) = 5.3k + p, 14 Var(kX b + p) Var(3X − = 1.2k 2) = Y = E(3 1) − = 2, 2Y ) Var(2Y = 1 − 1) b =  3 1 1 1 2 3 6 21.6 Y = y } i 3 Var(3 2 i P{ − y − 2Y ) = 8 5 E(Y ) = 3 Answers 157 b Z = z  2 3 4 6 1 5 8 24 8 9 2 1 5 1 1 1 1 6 24 8 12 24 24 Chapter  i P{Z Ski lls check z } = i 25 E(Z ) µ = x 2 a P(X 3 a P(Y = σ 3.62, = = 5) = 2.82 
0.03125 b P(3 ≤ X < 8) = 0.322 b P(3 ≤ X < 8) = 0.569 = 6 6 2 1 = 0) = 0.670 2 Exercise Exercise A D 2 1 1 a P(Y − Z −W < 0) = a x P(X c P(3X + Y + + Y Z > + Z W + d P(X − 3Z ≤ 2Y + e P(X − 4Z ≤ 3X − 2Y + = W ) 2Z 3W = = ) = 33 x b = = 29.75, s = x 67, 0.892 s = = 25.3 1518 2 2 x 3 a = 0.65, s = 0.864 2 x = 33.7 b s = 23.8 b [ b [0.104, b n 0.671 = ) 2 s 0.5 Exercise 0.303 f P(W a 0.551 b 0.162 3 a 0.159 b 0.201 c 0.564 4 a 0.0186 b 0.0000155 c 0.149 d 0.118 1 ≤ 0) ) 2 Exercise −Y > W 11, 2 c b = 0.5 0.994 1 2 B a [4.19, c [2819, a [3.90, c [320, 3 n 4 [72.3 g, = 5 a 5.81] 12.8, 9.20] 2889] 6.10] 0.167] 325] 30 77.2 g] E a P(1 b P(2 ≤ X ≤ 3) = x = 15.8 = 10 0.777 Investigation 2 0.053 3 a ≤ ≤ X 8) = 0.992 c P( X ≥ 0.8) = 0.639 a P( X ≥ 37) = iii 0.228 b b P(X + X + X + X + i X > 180) = At [95.0, 105.0] ii [96.9, [97.8, 102.2] iv [98.7, the a ≤ P( X 70) = 0.920 b P(T ≤ 800) = we P(1.5 b P(1.25 c P( X d P( X e P ≤ ≤ X ≤ ≤ X ≥ 2.5) < 397) = 0.834 3 a take, a = = the the condence sample inter val 1 a [14.45, c [3430, 2 a [1.94, c [319.6, a [67.6 b [ 25.81, b [0.112, 20.19] 3526] 0.421 3 ) 15.55] 0.923 0.0353 < X = 8.06] 0.159] 324.9] g, 82.5 g] 0.0983 2 X < < 2.2) 0.193 b 0.579 = 4 a 5 a [483.4, c For x = 13.35 0.00469 b b The condence level is 90%. 0.280 c the 588.6] same b set of [463.0, data, the 609.0] higher the signicant 0.917 the wider the condence inter val we get. 62 Exercise Review narrower C level 4 the 0.639 1 < 2 P(1.8 = 1.35) 0.48) 1 ( 2 101.3] larger F a f the get. 
Exercise 1 level, 0.782 we Exercise signicance 0.355 size 4 same 103.1] D exercise 1 1 0.796 2 b P(X d E(Z ) ≥ 6) = = 0.384 c P(X + Y < 5) = a [ 2.18, 1.98] c [ 0.0609, b [ 13.73, 7.44] 0.0859] 0.173 2 7 a d = Bob−Rick 3 −8 −3 9 6 −9 3 2 −4 −3 3 −7 i Var(Z ) = 77 b e The random variable Z has no [−4.27, Exercise distribution 3 a 0.00621 c N(720, d 0.299 since 7 = E(Z ) b ≠ 4.60] Poisson Var(Z ) = 1 0.0787 E 77 a Since the p-value is 0.000008 < 0.1 we reject the 2 2 4 a 20 or 5 a 0.377 20 2 + 12 ) = N(720, (4 34 ) null ) b 47 b 16 b 0.196 Answers Since the sucient c the 45 c 158 hypothesis 5% Since at p-value the is evidence p-value is > the 0.05 null level. we have no hypothesis level. evidence at signicance 1% reject 0.010566 sucient the signicance 0.095581 to signicance the 10% to reject level. > the 0.01 null we have no hypothesis at 2 a Since no the p-value sucient hypothesis b Since have the no Since the 3 a Since the at is the is p-value reject 0.029673 1% p-value > we the > to 0.005283 the we b the c we level. reject 4 H p-value the null 0.743755 evidence 10% Since reject we the Since null level. 0.05 the sucient the 0.01 < Since at signicance 0.036819 a null 0.01 < 1% 3 have level. reject signicance at is 0.1 signicance evidence hypothesis the to 10% p-value the null 0.131776 evidence at sucient hypothesis c is to the p-value is p-value evidence signicance : “The mean the to volume we have null no hypothesis level. the < 5% 0.115078 at 1% 0.1 0.004763 at sucient the reject signicance hypothesis > > reject 0.05 we reject signicance 0.01 the we null level. have no hypothesis level. is ( µ 120 ml.” = 120) 0 the null hypothesis at the 5% signicance level. 
H : “The mean volume is not ( µ 120 ml.” ≠ 120) 1 b Since the p-value sucient at c the Since the 4 a H 10% the null : is evidence reject signicance p-value is hypothesis “The 0.109391 to mean > we null have no Since hypothesis 1% the weight is < 1% 0.01 we reject signicance ( µ 26 g.” = : 0.283654 signicance a to level correct > reject and 0.01 the we null conclude volume of a have no hypothesis that the at the factor y par ticular ice-cream product. 5 H 0 H p-value evidence adver tised level. 26) the sucient level. 0.000153 at 0.1 the : “The mean life expectancy is 30,000 hours.” 0 “The mean weight is not ( µ 26 g.” ≠ ( µ 26) = 30, 000) 1 b Use the z-test. Since the p-value is 0.00001 H : “The mean life expectancy is less than 30,000 1 < 0.01 1% we reject signicance har vested 5 a H : the level snails “The mean null are hypothesis and not level conclude from of fat in at the that ( µ hours.” the Since the population. hypothesis the drink that the 30,000) p-value the is < at 0.094543 the 10% manufacturer < 0.1 we signicance claims a reject level longer the and life null conclude expectancy 0 ( µ 1.4 g.” H : “The = mean of 1.4) level of fat in the drink the LED lamps. is 1 Exercise more than 1.4 g.” ( µ > 1 b Use the z-test. Since the p-value is 0.335687 H > we have no sucient evidence to reject : “There hypothesis at the 5% signicance level the “There conclude that H mean : “The the company volume of claim juice is in a dierence dierence in in nishing nishing ( µ times.” ( µ times.” = ≠ 0) 0) d the t-test. Since the p-value 0.874063 > 0.05 we and no sucient evidence to reject the null hypothesis correct. at a is 1 have 6 no d : Use null is 0 H 0.05 G 1.4) the the 5% signicance level and conclude that there bottle 0 is is 300 ( µ ml.” = no dierence in nishing times on the two Rubik’s 300) Cubes. H : “The mean volume of juice in the bottle 1 2 is less than 300 ml.” ( µ < H 300) b Use the z-test. 
Since the p-value is : “There 0.1 we reject the null hypothesis 0.004612 “There at the level and conclude that the less volume than dierence dierence in in the the ( µ scores.” ( µ scores.” t-test. Since the p-value 0.085622 < 0.1 reject the and null hypothesis conclude that at players the 10% score a signicance better result stated. using the new type of dar t. F H : “There is no dierence in the weights.” ( µ 0 a 0) 0) bottles 3 1 = ≠ d the when Exercise a 1 level contain is 10% we signicance no d : Use < is 0 H Since the p-value 0.6382 > 0.05 we have no H : = 0) d “Students who join the programme drop some 1 sucient evidence to reject the null weight.”( µ hypothesis > 0) d at b the Since 5% the sucient at c the Since signicance p-value 0.10858 evidence 10% the level. to reject signicance p-value > Use 0.1 the we have null no have hypothesis at level. 0.013122 > we have no the no 0.01 the evidence to at signicance reject the null 2 a Since null b 1% the p-value hypothesis Since the the hypothesis 5% < 0.05 we signicance 0.007494 at the < 0.01 we reject Since the sucient p-value 1% signicance 0.225675 evidence to the 10% and before the > conclude and after 0.05 null that the we hypothesis there is programme. H a 0.0455 2 a E(X ) 3 a 0.133 b reject > 0.1 the we null = 0.773 signicance 25; α = 0.00130 b b 0.978 0.972 level. reject the exercise level. have a [602, c We 702] b notice that [586, 718] no a higher signicance level hypothesis means at weight 0.121576 reject the 1 c in level to hypothesis Review null signicance dierence p-value level. 0.027947 at p-value the evidence no 1 the Since sucient 5% Exercise sucient t-test. a wider condence inter val. level. Answers 159 3 2 a Dierence 2 9 6 8 5 4 5 8 7 r = –0.970; Exercise b H : “There is no dierence in there is a strong negative correlation 5 C potassium 0 (µ levels.” = 1 0) p = 0.00229. Since p < 0.01, we reject the null d H : “There is a ≠ 0) dierence in hypothesis. 
potassium There is evidence of signicant 1 (µ levels.” correlation between the two variables at the d Since the p-value 0.46297 sucient evidence at signicance the that the 1% there two is no types to reject level dierence of > 0.01 the null and in we we have hypothesis 2 p = level. 0.394. evidence conclude measurement biochemical 1% no is of not the analyzers. Since to reject evidence two p 0.05, the of variables > there null the not enough hypothesis, signicant at is 5% i.e. correlation there between level. 2 3 a x = 5.33 b [3.72, c In s = 3 4.38 = 0.0343. 6.94] of the sample of 15 cases, the mean obser vations will fall value taken within the of from correlation a 10% the [3.72, 5 a x = a 30 s c H : 51.2 b is 0.1, we reject evidence between the two of the null signicant variables at the condence D a Use W, because it has the strongest correlation, 95% average r 2 x b “The There < 6.94]. 1 4 p level. Exercise inter val Since hypothesis. 99% population p = 30.97 time is s = –0.546 0.298 ( µ 30 s.” = = b S c p = –1.07W + 114 30) 0 H : “The average time is more than = 0.102. Since p > 0.05, there is no evidence 30 s.” 1 to ( µ > reject the evidence d We null hypothesis, i.e. there is not 30) use t-test since the standard deviation of signicant correlation between is the two variables at the 5% level. unknown. Since the the null p-value 0.000163 hypothesis average time is and more < 0.05 conclude than 30 s, we 2 r = 3 a H b r sets up the the therefore speedometers = to : ρ = 0; a x = 7 a i b i c 8 n = ρ ≠ 0 : 1 p = 0.000008 There is ver y strong evidence to indicate a positive association between the random speed. 15.37, x = thus the sample 23.5 [21.8, 25.2] 24.9] ⊂ [21.8, condence inter val condence inter val. a i ]– ∞, b i 0.569 size = should ii s ii [22.0, 25.2], is a be y = X 4 x = 0.6y and Y the so a subset 5 a Y b X ]– ∞, ii 0.705 of a + 12.4; 66.4 on X: on Y y : = x 2.48x = from 10% to 5%, a Chapter II error increases is approximately 97. 
29.6 exercise 8.96[ Type the from 1.17; – 95% a I r = 0.975, p = sucient 0.000039; evidence of p a < 0.05, strong hence there positive error probability 0.569 to between the two random variables. of b Type + 0.380y relationship decreases y grams 90% of ii probability 10.8; 24.9] is When + 10.5 Review 9.19[ 0.623x 16. 1 a 150 mg 210 [22.0, c 82.8; the show d b + H 0.962; variables 6 1.91x 0 = a higher y reject that c company 0.756; 75.8 °C 0.705.  2 10.2 3 r 4 a = 0.299 1 Ski lls check , no b 0, no 2 –8 1 2 a 2E(Z ) b 4Var(Z c E(X a x = a 3E(Y ) – )E(Y ) 0.382 = ) + 9Var(Y 2E(X ) ) + 5 4Var(X ) a r b The )E(Z ) y 43.5, Var(X 3 – = 0.991; –31.75, Var(Y 0.951 c y ) = 98.69 0.938 d 0.732 = c 20 d p = p 0.905x 6 1 r = –0.382; 2 r = 0.794; 160 Answers there there is is a a weak strong negative positive correlation correlation d r = + 0.00620; A 0.383 2.78x10 suggests the correlation Exercise = p-value between 46.25, b = two a strong random relationship variables; 2.59 p < 0.01, between hence the there random is a strong variables. Index Page numbers in italics indicate the Answers section. 
