Math 5900 Final Project

The goal of this final project is to review the concepts that we discussed in the course as well as to assess your level of understanding. The project consists of 9 problems, many of which have multiple parts. Although you are allowed to discuss the problems with your peers, you are required to produce your own solutions to each problem.

Several problems will require you to use R. In these instances your answer should consist of the code you used to solve the problem as well as any output generated by R, including pictures, and should be organized using document preparation software (such as Word or LaTeX). I will provide some instruction on R-related questions if asked; on the theoretical questions, however, I will only reveal whether or not a proposed answer is correct.

The project is due July 22nd by 8pm in my math department mailbox in the JWB faculty lounge or slid under my office door at JWB 214. You may also email your project to me as an attachment.

Theoretical Problems

1. Suppose two fair six-sided dice are rolled.
   (a) Let X denote the sum of the die rolls. Compute the probability mass function of X.
   (b) Compute E(X) and Var(X).
   (c) Let A be the event that the sum of the die rolls is even, and let B be the event that the first die is larger than 3. Are A and B independent?

2. A game is played in which two fair six-sided dice are repeatedly rolled until the sum is 7 or 8 (an example from the dice game Craps). Let Y denote the number of rolls required for the game to finish.
   (a) Compute E(Y) and Var(Y). (Hint: Y is a member of a parametric family of random variables, so you can look these up in our course notes or on Wikipedia.)
   (b) Repeat these calculations if, instead of ending when a 7 or 8 is rolled, the game ends on a 7 or 9, and again if it ends on a 7 or 10.

3. A lottery game is played in which 6 names are drawn out of a bin containing 50 names. Suppose every subset of 6 names is equally likely to be drawn.
   Five members of the local Koala Club are participating in the lottery.
   (a) What is the probability that at least two people from the Koala Club have their names drawn?
   (b) Suppose that the first name drawn wins 100 dollars, the second name drawn wins 50 dollars, and the last 4 names drawn each win 25 dollars. What is the probability that the members of the Koala Club collectively win at least 100 dollars?

4. Marcel writes programs for a software developer. His programs are sometimes flawed, and are therefore checked by a second programmer after they are completed. If Marcel’s program contains a flaw, the probability that it is discovered by the second programmer is 88%. If the program is not flawed, the probability that the second programmer mistakenly identifies a flaw in the program is 11%. It is known that 8% of Marcel’s programs contain flaws. Suppose Marcel sends a program in for inspection.
   (a) What is the probability that the second programmer finds a flaw in Marcel’s program?
   (b) What is the probability that Marcel’s program actually contains a flaw, given that the second programmer found a flaw in the program?
   (c) What is the probability that Marcel’s program actually contains a flaw, given that the second programmer could not find a flaw in the program?

5. Suppose X is a continuous random variable with probability density function f(x) = cx^3 for 0 ≤ x ≤ 1 and f(x) = 0 otherwise, for some positive constant c.
   (a) Compute c so that the function above is a valid probability density function.
   (b) Compute the CDF F(x) for X. Use this to compute the probability that X > 0.75.
   (c) Compute E(X).
   (d) Compute Var(X).

R Problems

1. Read in the data set “billionaires.txt” from the course webpage. Fortune magazine published this summary data about all billionaires in the world in 1992, comprising data on over 200 wealthy individuals. Their wealth, age and geographic location (Asia, Europe, Middle East, United States or Other) are reported.
   Age is measured in years and wealth is measured in billions of dollars.
   (a) Make a pie chart of the location of billionaires in the data set. The pie chart should be clearly labeled.
   (b) Make a scatter plot of the age of the billionaires vs. their wealth. Clearly label the plot. Also fit a least squares regression line to the data.
   (c) Make a histogram and a boxplot of the wealth of the billionaires. Identify the location of any outlier billionaires.

2. Write a function in R called “DieRoll(n,d)” which takes two inputs, a positive integer n and a positive integer d, and outputs a simulated sequence of length n of fair die rolls from a die with d sides. Report the output of running “DieRoll(50,6)”, which simulates 50 rolls of a fair six-sided die.

3. Read in the data set “clouds.txt” from the course webpage. It is believed that seeding clouds with silver nitrate increases the amount of precipitation of clouds. A study was performed in which, of 52 clouds, 26 were randomly selected to be seeded with silver nitrate. The precipitation of the 52 clouds was then measured and reported.
   (a) What hypotheses do you suggest in order to test the claim that seeding clouds with silver nitrate increases the amount of precipitation?
   (b) Test these hypotheses using R and report the p-value of the test and your interpretation of the results. Also create a side-by-side boxplot of the two groups.
   (c) Are the assumptions of the test you used satisfied by this data set? Defend your argument with appropriate pictures or by referencing the appropriate theorems.

4. Write a function “divisibleby(numbers,divisor)” in R which takes two inputs, “numbers”, which is a vector of positive integers that are checked for divisibility, and “divisor”, which is a positive integer, and outputs those members of the vector “numbers” which are divisible by “divisor”. Report the results of running “divisibleby((1:100),7)”.
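
The two function specifications in R Problems 2 and 4 can be met in several ways. The following is only a minimal sketch of one possible shape, assuming base R and no input validation. The call to set.seed and the variable name rolls are illustrative additions that the problem statements do not ask for.

```r
# Hypothetical sketch using base R only; not the required or only solution.

# Simulate n rolls of a fair die with d sides.
DieRoll <- function(n, d) {
  # sample.int draws from 1..d; replace = TRUE makes the rolls independent
  sample.int(d, size = n, replace = TRUE)
}

# Return the members of `numbers` that are divisible by `divisor`.
divisibleby <- function(numbers, divisor) {
  # %% is the remainder operator; keep entries whose remainder is zero
  numbers[numbers %% divisor == 0]
}

set.seed(1)  # assumption: fixed seed only so the reported output is reproducible
rolls <- DieRoll(50, 6)
print(rolls)
print(divisibleby(1:100, 7))
```

Logical indexing keeps divisibleby vectorized, so no explicit loop is needed. Without a fixed seed, DieRoll returns a different sequence on every run.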