Final Project

Math 5900
The goal of this final project is to review the concepts that we discussed in the
course as well as to assess your level of understanding. The project consists of 9
problems, many of which have multiple parts. Although you are allowed to discuss
the problems with your peers, you are required to produce your own solutions
to each problem. Several problems will require you to use R. In these instances,
your answer should consist of the code you used to solve the problem as well as any
output generated by R, including pictures, and should be organized using
document-processing software (such as Word or LaTeX). I will provide some guidance on
R-related questions if asked; on the theoretical questions, however, I will only reveal
whether or not a proposed answer is correct. The project is due July 22nd by
8pm, either in my math department mailbox in the JWB faculty lounge or slid under my
office door at JWB 214. You may also email your project to me as an attachment.
Theoretical Problems
1. Suppose two fair six-sided dice are rolled.
(a) Let X denote the sum of the die rolls. Compute the probability mass
function of X.
(b) Compute E(X) and Var(X).
(c) Let A be the event that the sum of the die rolls is even, and let B be the
event that the first die is larger than 3. Are A and B independent?
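As a reminder for Problem 1, the standard definitions for a discrete random variable and for independent events can be written as below. The notation p(x) for the probability mass function of X is an assumption made here and may differ from the notation in the course notes.

```latex
\begin{align*}
E(X) &= \sum_{x} x\, p(x), \\
\mathrm{Var}(X) &= E(X^2) - \bigl(E(X)\bigr)^2
                 = \sum_{x} x^2\, p(x) - \bigl(E(X)\bigr)^2, \\
A \text{ and } B \text{ are independent} &\iff P(A \cap B) = P(A)\,P(B).
\end{align*}
```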
2. A game is played in which two fair six-sided dice are repeatedly rolled until the
sum is 7 or 8 (an example from the dice game Craps). Let Y denote the number
of rolls required for the game to finish.
(a) Compute E(Y) and Var(Y). (Hint: Y is a member of a parametric family of
random variables, so you can look these up in our course notes or on
Wikipedia.)
(b) Repeat these calculations if, instead of ending when a 7 or 8 is
rolled, the game ends on a 7 or 9, or on a 7 or 10.
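For reference when following the hint in part (a): one family that may apply is the geometric distribution, which describes the number of independent trials up to and including the first success when each trial succeeds with probability p. The formulas below assume that convention; some references instead count the failures before the first success, which lowers the mean by 1, so check which version the course notes use. The symbol p is introduced here for illustration only.

```latex
\begin{align*}
P(Y = k) &= (1-p)^{k-1}\, p, \qquad k = 1, 2, 3, \dots \\
E(Y) &= \frac{1}{p}, \\
\mathrm{Var}(Y) &= \frac{1-p}{p^{2}}.
\end{align*}
```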
3. A lottery game is played in which 6 names are drawn out of a bin containing
50 names. Suppose every subset of 6 names is equally likely to be drawn. Five
members of the local Koala Club are participating in the lottery.
(a) What is the probability that at least two people from the Koala Club have
their names drawn?
(b) Suppose that the first name drawn wins 100 dollars, the second name
drawn wins 50 dollars, and the last 4 names drawn each win 25 dollars.
What is the probability that the members of the Koala Club collectively
win at least 100 dollars?
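One possible way to organize part (a), offered as a sketch rather than the required method: when n names are drawn without replacement from N names, of which M belong to a group of interest, the number K of group members drawn follows a hypergeometric distribution. The symbols N, M, n, and K are introduced here for illustration and do not come from the problem statement.

```latex
\[
P(K = k) = \frac{\binom{M}{k}\binom{N-M}{n-k}}{\binom{N}{n}},
\qquad \max(0,\; n - N + M) \le k \le \min(n, M),
\]
\[
P(K \ge 2) = 1 - P(K = 0) - P(K = 1).
\]
```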
4. Marcel writes programs for a software developer. His programs are sometimes
flawed, and are therefore checked by a second programmer after they are completed. If Marcel’s program contains a flaw, the probability that it is discovered
by the second programmer is 88%. If the program is not flawed, the probability
that the second programmer mistakenly identifies a flaw in the program is 11%.
It is known that 8% of Marcel’s programs contain flaws. Suppose Marcel sends
a program in for inspection.
(a) What is the probability that the second programmer finds a flaw with
Marcel’s program?
(b) What is the probability that Marcel’s program actually contains a flaw
given that the second programmer found a flaw in the program?
(c) What is the probability that Marcel’s program actually contains a flaw
given that the second programmer could not find a flaw with the program?
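The three parts of Problem 4 rest on the law of total probability and Bayes' rule, restated here for convenience. The event names are assumptions chosen for this sketch: F is the event that the program is flawed, and D is the event that the second programmer reports a flaw.

```latex
\begin{align*}
P(D) &= P(D \mid F)\,P(F) + P(D \mid F^{c})\,P(F^{c}), \\
P(F \mid D) &= \frac{P(D \mid F)\,P(F)}{P(D)}, \\
P(F \mid D^{c}) &= \frac{P(D^{c} \mid F)\,P(F)}{P(D^{c})}.
\end{align*}
```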
5. Suppose X is a continuous random variable with probability density function
\[
f(x) =
\begin{cases}
c x^{3} & \text{if } 0 \le x \le 1, \\
0 & \text{otherwise,}
\end{cases}
\]
for some positive constant c.
(a) Compute c so that the function above is a valid probability density function.
(b) Compute the CDF F(x) for X. Use this to compute the probability that
X > 0.75.
(c) Compute E(X).
(d) Compute Var(X).
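For Problem 5, the general definitions for a continuous random variable with density f are collected below as a reminder. They are stated for an arbitrary density and a generic threshold a (a symbol introduced here), so the integrals still need to be evaluated for the specific f(x) given in the problem.

```latex
\begin{align*}
\int_{-\infty}^{\infty} f(x)\,dx &= 1, \\
F(x) &= \int_{-\infty}^{x} f(t)\,dt, \qquad P(X > a) = 1 - F(a), \\
E(X) &= \int_{-\infty}^{\infty} x\, f(x)\,dx, \\
\mathrm{Var}(X) &= E(X^{2}) - \bigl(E(X)\bigr)^{2},
\qquad E(X^{2}) = \int_{-\infty}^{\infty} x^{2} f(x)\,dx.
\end{align*}
```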
R Problems
1. Read in the data set “billionaires.txt” from the course webpage. Fortune magazine
published this summary data about all billionaires in the world in 1992,
which comprises data on over 200 wealthy individuals. Their wealth, age, and
geographic location (Asia, Europe, Middle East, United States, or Other) are
reported. Age is measured in years and wealth is measured in billions of dollars.
(a) Make a pie chart of the location of billionaires in the data set. The pie
chart should be clearly labeled.
(b) Make a scatter plot of the age of the billionaires vs. their wealth. Clearly
label the plot. Also fit a least squares regression line to the data.
(c) Make a histogram and a boxplot of the wealth of the billionaires. Identify
the location of any outlier billionaires.
2. Write a function in R called “DieRoll(n,d)” which takes two inputs: a positive
integer n and a positive integer d; and outputs a simulated sequence of length
n of fair die rolls from a die with d sides. Report the output of running
“DieRoll(50,6)”, which simulates 50 rolls of a fair six-sided die.
3. Read in the data set “clouds.txt” from the course webpage. It is believed that
seeding clouds with silver nitrate increases the amount of precipitation of clouds.
A study was performed in which, of 52 clouds, 26 were randomly selected to be
seeded with silver nitrate. The precipitation of the 52 clouds was then measured
and reported.
(a) What hypotheses do you suggest in order to test the claim that seeding
clouds with silver nitrate increases the amount of precipitation?
(b) Test these hypotheses using R and report the p-value of the test and your
interpretation of the results. Also create a side-by-side boxplot of the two
groups.
(c) Are the assumptions of the test you used satisfied by this data set? Defend
your argument with appropriate pictures or by referencing the appropriate
theorems.
4. Write a function “divisibleby(numbers,divisor)” in R which takes two inputs:
“numbers”, which is a vector of positive integers that are checked for divisibility,
and “divisor”, which is a positive integer; and outputs those members of the vector
“numbers” which are divisible by “divisor”. Report the results of running
“divisibleby((1:100),7)”.